From andreas at sdsc.edu Fri Sep 2 12:56:05 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 2 Sep 2011 09:56:05 -0700
Subject: [Biojava-l] BioJava 3.0.2 released
Message-ID:

BioJava 3.0.2 has been released and is available from
http://www.biojava.org/wiki/BioJava:Download .

BioJava 3.0.2 adds new modules and enhances the capabilities of BioJava:

- biojava3-aa-prop: This new module allows the calculation of
physicochemical and other properties of protein sequences.

- biojava3-protein-disorder: A new module for the prediction of
disordered regions in proteins. It is based on a Java implementation of
the RONN predictor.

Other noteworthy improvements:

- protein-structure: Improved handling of protein domains, now with
better support for SCOP. New functionality for automated prediction of
protein domains, based on Protein Domain Parser.

- Improvements and bug fixes in several modules.

Currently, up to 8 different people are making commits per month, which
gives an indication of how actively BioJava is being developed.

The two new modules are based on the work of Ah Fu (Chuan Hock Koh) and
Peter Troshin, which happened around this year's Google Summer of Code.

Thanks to everybody who made this new release possible!

About BioJava:

BioJava is a mature open-source project that provides a framework for
processing biological data. BioJava contains powerful analysis and
statistical routines, tools for parsing common file formats, and
packages for manipulating sequences and 3D structures. It enables rapid
bioinformatics application development in the Java programming language.

Happy BioJava-ing,

Andreas

From jayunit100 at gmail.com Thu Sep 15 23:32:13 2011
From: jayunit100 at gmail.com (Jay Vyas)
Date: Thu, 15 Sep 2011 23:32:13 -0400
Subject: [Biojava-l] atomcache with a file
Message-ID:

Hi guys: Anyone want to share a code snippet to use AtomCache to load a
PDB file from disk?

--
Jay Vyas
MMSB/UCHC

From andreas at sdsc.edu Thu Sep 15 23:52:01 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 15 Sep 2011 20:52:01 -0700
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Hi Jay,

you can use it like this:

    import org.biojava.bio.structure.Structure;
    import org.biojava.bio.structure.align.util.AtomCache;

    // By default PDB files will be stored in a temporary directory.
    // There are two ways of configuring a directory that can get
    // re-used multiple times:
    // A) set the environment variable PDB_DIR
    // B) call cache.setPath(path)
    AtomCache cache = new AtomCache();

    try {
        // alternative: try d4hhba_ 4hhb.A 4hhb.A:1-100
        Structure s = cache.getStructure("4hhb");
        System.out.println(s);
    } catch (Exception e) {
        e.printStackTrace();
    }

Hope that helps,

Andreas

On Thu, Sep 15, 2011 at 8:32 PM, Jay Vyas wrote:
> Hi guys : Anyone want to share a code snippet to use AtomCache to load a PDB
> File from disk ?
>
> --
> Jay Vyas
> MMSB/UCHC
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From jayunit100 at gmail.com Fri Sep 16 00:28:16 2011
From: jayunit100 at gmail.com (Jay Vyas)
Date: Fri, 16 Sep 2011 00:28:16 -0400
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Thanks... but I have a directory of PDB files:

/Users/jay/pdb/a1.pdb

Is it possible for AtomCache to initially load the file from this
directory?

I don't care where it caches the data. It's just that my PDB file is not
at RCSB, and it appears that AtomCache is set up to go to RCSB by
default to find a PDB file.

--
Jay Vyas
MMSB/UCHC

From jayunit100 at gmail.com Fri Sep 16 01:55:47 2011
From: jayunit100 at gmail.com (Jay Vyas)
Date: Fri, 16 Sep 2011 01:55:47 -0400
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Never mind, I guess I can just use the PDBFileReader and store that in
the AtomCache. I figured maybe there was a shortcut.

From andreas at sdsc.edu Fri Sep 16 10:43:18 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 16 Sep 2011 07:43:18 -0700
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

The AtomCache is built around naming conventions and supports more than
just loading PDB files. It can also load the representation of a SCOP
domain as a Structure object (and some other things). If you think it is
useful we can add something like a prefix which would tell the cache to
load local files with a nonstandard name. Something like private:a1
could be the name to describe your own file.

Andreas

From amr_alhossary at hotmail.com Fri Sep 16 11:10:04 2011
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Fri, 16 Sep 2011 17:10:04 +0200
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

AtomCache currently searches for files in a standard naming format. You
can imitate that format by renaming your local files, and then they will
be loaded as usual.

Amr

From sbliven at ucsd.edu Fri Sep 16 14:13:56 2011
From: sbliven at ucsd.edu (Spencer Bliven)
Date: Fri, 16 Sep 2011 11:13:56 -0700
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

I think that AtomCache is the wrong tool for this job. If you have
private PDB files you should just use a PDBFileReader to get each one
(see http://biojava.org/wiki/BioJava:CookBook:PDB:read3.0#Getting_more_control).
I don't think AtomCache should be modified to support this.

-Spencer

On Fri, Sep 16, 2011 at 07:43, Andreas Prlic wrote:
> If you
> think it is useful we can add something like a pre-fix which would
> tell the cache to load local files with a nonstandard name. Something
> like : private:a1 could be the name to describe your own file.
>

From andreas at sdsc.edu Fri Sep 16 18:57:36 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 16 Sep 2011 15:57:36 -0700
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Hi,

I had a quick offline discussion with Spencer about this. The AtomCache
provides a local mirror for public data, expects a specific organisation
of files, and only supports one top-level path under which all data is
stored. If we added support for private files to it, public and personal
files would end up mixed in the same location, which in all likelihood
would cause confusion. As such it makes more sense to access your
personal files with the PDBFileReader and manage private files yourself.
It is a one-liner in both classes anyway...

Andreas
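For the "manage private files yourself" route, here is a minimal sketch in plain Java, without any BioJava dependency. It indexes one directory of local .pdb files by base name, so that a short name such as a1 resolves to a path that could then be handed to PDBFileReader. The class PrivatePdbIndex and its methods are hypothetical names made up for this illustration and are not part of BioJava.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.stream.Stream;

/** Hypothetical helper (not part of BioJava): maps short names to private PDB files. */
public class PrivatePdbIndex {

    private static final String SUFFIX = ".pdb";

    private final Map<String, Path> byName = new TreeMap<>();

    /** Scans one directory (non-recursively) for regular files ending in ".pdb". */
    public PrivatePdbIndex(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile)
                 .filter(p -> p.getFileName().toString().endsWith(SUFFIX))
                 .forEach(p -> {
                     String file = p.getFileName().toString();
                     // "a1.pdb" is indexed under the short name "a1"
                     byName.put(file.substring(0, file.length() - SUFFIX.length()), p);
                 });
        }
    }

    /** Returns the path for a short name such as "a1", if such a file was found. */
    public Optional<Path> resolve(String name) {
        return Optional.ofNullable(byName.get(name));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        PrivatePdbIndex index = new PrivatePdbIndex(dir);
        // The resolved path would then be passed to a reader of your choice,
        // for example BioJava's PDBFileReader as suggested in this thread.
        System.out.println(index.resolve("a1").map(Path::toString).orElse("not found"));
    }
}
```

Keeping the index separate from the AtomCache directory avoids the mixing of public and personal files that the thread warns about.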
From jayunit100 at gmail.com Fri Sep 16 20:15:27 2011
From: jayunit100 at gmail.com (JAX)
Date: Fri, 16 Sep 2011 20:15:27 -0400
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Ok cool... Maybe it would be good for the AtomCache Javadoc to state
explicitly that it is only for PDB files, and not any other structures.

Jay Vyas
MMSB
UCHC

From dasarnow at gmail.com Sat Sep 17 18:05:19 2011
From: dasarnow at gmail.com (Daniel Asarnow)
Date: Sat, 17 Sep 2011 15:05:19 -0700
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Jay,

The AtomCache also gives a performance boost when re-reading PDB files
by using a cached IO mode. I believe you can manually activate this mode
for your PDBFileReader; see line 115 of the AtomCache class for an
example. I'm not completely sure about this, but I think there is an
advantage any time you re-read the same structure within the same JVM
instance.

-da
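The point about re-reading the same structure within one JVM can be illustrated without BioJava. The sketch below is not the cached IO mode of AtomCache mentioned above; it swaps in plain memoization, keeping each loaded result in a map so that a second request for the same key skips the expensive load. MemoizingLoader and the stand-in loader function are hypothetical names used only for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper: caches the result of an expensive per-key load within one JVM. */
public class MemoizingLoader<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public MemoizingLoader(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Runs the loader the first time a key is seen, then serves the stored value. */
    public V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        int[] parses = {0};
        // Stand-in for an expensive parse; a real loader might call a PDB file reader.
        MemoizingLoader<String, String> structures =
                new MemoizingLoader<>(path -> { parses[0]++; return "parsed:" + path; });
        structures.get("/Users/jay/pdb/a1.pdb");
        structures.get("/Users/jay/pdb/a1.pdb");
        System.out.println("parse calls: " + parses[0]); // prints "parse calls: 1"
    }
}
```

The trade-off is memory: every loaded value stays referenced until the loader object itself is discarded.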
From andreas at sdsc.edu Sun Sep 18 19:50:27 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 18 Sep 2011 16:50:27 -0700
Subject: [Biojava-l] [Biojava-dev] A question about multiple alignment
In-Reply-To: <20110917182831.74436c7k6ttd1u04@www.nexusmail.uwaterloo.ca>
References: <20110916135621.148774hjgujhkmos@www.nexusmail.uwaterloo.ca> <20110917182831.74436c7k6ttd1u04@www.nexusmail.uwaterloo.ca>
Message-ID:

Hi Shahab,

Sounds like you want to use an identity matrix for the alignment.

Andreas

On Sat, Sep 17, 2011 at 3:28 PM, Shahab Kamali wrote:
> Thanks Andreas,
> I want two components that have different names to have 0 alignment score.
> My application is not about bio-compounds, so I can use anything else rather
> than ProteinSequence and AminoAcidCompound. I just need to align sequences
> of arbitrary alphabets. Could you suggest me a solution please?
> Thanks a lot,
> Shahab
>
> Quoting Andreas Prlic :
>
>> Hi Shahab,
>>
>> did you take a look at the substitution matrix, if it is scoring your
>> sequences according to your expectation? Looks like in your
>> theoretical example the alignment of B and D is favorable, i.e. it has
>> a positive alignment score..
>>
>> Andreas
>>
>>
>> On Fri, Sep 16, 2011 at 10:56 AM, Shahab Kamali wrote:
>>>
>>> Hi,
>>> I am using BioJava in a pattern mining project. I want to align a set of
>>> relatively short sequences. For example to align {"ABCE", "ABCE", "ADE",
>>> "ADE"}.
>>>
>>> This is a part of my code:
>>>
>>> SubstitutionMatrix matrix = new SimpleSubstitutionMatrix();
>>> GuideTree gt = new GuideTree<ProteinSequence, AminoAcidCompound>(
>>>         lst, Alignments.getAllPairsScorers(lst,
>>>                 Alignments.PairwiseSequenceScorerType.GLOBAL,
>>>                 new SimpleGapPenalty((short)0,(short)0), matrix));
>>> Profile profile = Alignments.getProgressiveAlignment(gt,
>>>         Alignments.ProfileProfileAlignerType.GLOBAL,
>>>         new SimpleGapPenalty((short)0,(short)0), matrix);
>>>
>>> The result of the above code is:
>>> ABCE
>>> ABCE
>>> AD-E
>>> AD-E
>>>
>>> But what I need is
>>> A-BCE
>>> A-BCE
>>> AD--E
>>> AD--E
>>> or
>>> ABC-E
>>> ABC-E
>>> A--DE
>>> A--DE
>>>
>>> Do you have any suggestion?
>>> Thanks,
>>> Shahab
>>>
>>>
>>>
>>> _______________________________________________
>>> biojava-dev mailing list
>>> biojava-dev at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>
>>
>

From su24 at st-andrews.ac.uk Mon Sep 19 06:09:46 2011
From: su24 at st-andrews.ac.uk (Saif Ur-Rehman)
Date: Mon, 19 Sep 2011 11:09:46 +0100
Subject: [Biojava-l] UniprotParser
Message-ID:

Dear all,

I am having issues with the BioJava UniProt parser as detailed below:

Code:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.NoSuchElementException;
    import org.biojava.bio.BioException;
    import org.biojavax.Namespace;
    import org.biojavax.RichObjectFactory;
    import org.biojavax.bio.seq.RichSequence;
    import org.biojavax.bio.seq.RichSequenceIterator;

    BufferedReader br = new BufferedReader(new FileReader(files[index]));
    Namespace ns = RichObjectFactory.getDefaultNamespace();
    RichSequenceIterator iterator = RichSequence.IOTools.readUniProt(br, ns);
    while (iterator.hasNext()) {
        try {
            RichSequence rs = iterator.nextRichSequence();
        } catch (NoSuchElementException e) {
            // silently ignored
        } catch (BioException e) {
            // a record that fails to parse is reported and skipped
            e.printStackTrace();
        }
    }

The file I am using is downloaded from the link:

ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_fungi.dat.gz

The problem is that the parser works for a subset of the IDs within the
file and throws an exception on the others.

Sample exception stack trace:

*** Start of trace *************************

at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113)
at uniprot.mp.main(mp.java:161)
Caused by: org.biojava.bio.seq.io.ParseException:

A Exception Has Occurred During Parsing.
Please submit the details that follow to biojava-l at biojava.org or post a bug
report to http://bugzilla.open-bio.org/

Format_object=org.biojavax.bio.seq.io.UniProtFormat
Accession=P53031
Id=
Comments=
Parse_block=RN [1]RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].RC STRAIN=NCYC 2512;RX MEDLINE=97082501; PubMed=8923737; DOI=10.1002/(SICI)1097-0061(199610)12:13<1321::AID-YEA27>3.0.CO;2-6;RA Rodriguez P.L., Ali R., Serrano R.;RT "CtCdc55p and CtHa13p: two putative regulatory proteins from Candida tropicalis with long acidic domains.";RL Yeast 12:1321-1329(1996).
Stack trace follows ....

at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:615)
at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
... 1 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:486)
... 2 more
org.biojava.bio.BioException: Could not read sequence
at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113)
at uniprot.mp.main(mp.java:161)
Caused by: org.biojava.bio.seq.io.ParseException: Name has not been supplied

********End of trace**********************************

An example of an ID that worked is:

ZYM1_SCHPO

while an ID that didn't work is:

ZUO1_YEAST

Thanks a lot in advance.

Cheers,
Saif

--
Saif Ur-Rehman

Centre for Evolution, Genes and Genomics
Harold Mitchell Building
University of St Andrews
St Andrews
Fife
KY16 9TH
UK

Tel: +44 131 5572556
Fax: +44 1334 463366

From khalil.elmazouari at gmail.com Mon Sep 19 12:35:07 2011
From: khalil.elmazouari at gmail.com (Khalil El Mazouari)
Date: Mon, 19 Sep 2011 18:35:07 +0200
Subject: [Biojava-l] Biojava-l Digest, Vol 104, Issue 6
In-Reply-To:
References:
Message-ID:

Hi,

take a look at http://en.wikipedia.org/wiki/Levenshtein_distance

Regards,

khalil

From guoer713108 at gmail.com Tue Sep 20 00:18:55 2011
From: guoer713108 at gmail.com (quan zou)
Date: Tue, 20 Sep 2011 12:18:55 +0800
Subject: [Biojava-l] why can't biojava fold RNA?
Message-ID:

Dear all,

Is there any Java program or jar that can fold an RNA sequence into a
secondary structure, such as RNAfold?

Why has RNAfold / the Vienna package not been included in BioJava?

Quan

From andreas at sdsc.edu Tue Sep 20 11:11:58 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 20 Sep 2011 08:11:58 -0700
Subject: [Biojava-l] why can't biojava fold RNA?
In-Reply-To:
References:
Message-ID:

If all your code is in Java and you have binaries for some external
software, you can easily wrap it from Java and trigger the execution.

Andreas

On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote:
> Thanks, however, there is no java code. it cannot be imported into my java
> project.
>
> 2011/9/20 Andreas Prlic
>>
>> Hi Quan,
>>
>> the Vienna RNA package is available as open source.  Did you take a look
>> at it?
>>
>> Andreas
>>
>>
>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou wrote:
>> > Dear all,
>> >
>> >        Is there any java program or jar which can fold a RNA sequence to
>> > a
>> > secondary structure? Such as RNAfold?
>> >
>> >       Why RNAfold/ Vienna Package have not been contained in Biojava?
>> >
>> >                 Quan
>> > _______________________________________________
>> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >
>
>

From jayunit100 at gmail.com Tue Sep 20 12:09:29 2011
From: jayunit100 at gmail.com (JAX)
Date: Tue, 20 Sep 2011 12:09:29 -0400
Subject: [Biojava-l] Biojava-l Digest, Vol 104, Issue 7
In-Reply-To:
References:
Message-ID: <2E47CB3E-5859-47E7-846A-50618E95F925@gmail.com>

Pairwise similarity is better than Levenshtein for short sequences. Just
count the total number of matching letter pairs, divided by the length
of the longest string between the two words. There is a great article
about this online called "How to strike a match". We used it for the
sequence mining here, and were able to find important homologs and
reproduce known results: http://jb.asm.org/cgi/content/short/JB.00018-11v1

Jay Vyas
MMSB
UCHC
>> Please submit the details that follow to biojava-l at biojava.org or post a bug >> report to http://bugzilla.open-bio.org/ >> >> Format_object=org.biojavax.bio.seq.io.UniProtFormat >> Accession=P53031 >> Id= >> Comments= >> Parse_block=RN [1]RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].RC STRAIN=NCYC >> 2512;RX MEDLINE=97082501; PubMed=8923737; >> DOI=10.1002/(SICI)1097-0061(199610)12:13<1321::AID-YEA27>3.0.CO;2-6;RA >> Rodriguez P.L., Ali R., Serrano R.;RT "CtCdc55p and CtHa13p: two putative >> regulatory proteins from Candida >> tropicalis with long acidic domains.";RL Yeast 12:1321-1329(1996). >> Stack trace follows .... >> >> >> at >> org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:615) >> at >> org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110) >> ... 1 more >> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1 >> at >> org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:486) >> ... 2 more >> org.biojava.bio.BioException: Could not read sequence >> at >> org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113) >> at uniprot.mp.main(mp.java:161) >> Caused by: org.biojava.bio.seq.io.ParseException: Name has not been supplied >> >> ********End of trace********************************** >> >> An example of an Id that worked is: >> >> ZYM1_SCHPO >> >> while an ID that didn't work is: >> >> ZUO1_YEAST >> >> Thanks a lot in advance. 
>> >> Cheers, >> Saif >> >> >> -- >> Saif Ur-Rehman >> >> Centre for Evolution, Genes and Genomics >> Harold Mitchell Building >> University of St Andrews >> St Andrews >> Fife >> KY16 9TH >> UK >> >> Tel: +44 131 5572556 >> Fax: +44 1334 463366 >> >> >> ------------------------------ >> >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> >> >> End of Biojava-l Digest, Vol 104, Issue 6 >> ***************************************** > > > > > ------------------------------ > > Message: 2 > Date: Tue, 20 Sep 2011 12:18:55 +0800 > From: quan zou > Subject: [Biojava-l] why can't biojava fold RNA? > To: biojava-l at lists.open-bio.org > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Dear all, > > Is there any java program or jar which can fold a RNA sequence to a > secondary structure? Such as RNAfold? > > Why RNAfold/ Vienna Package have not been contained in Biojava? > > Quan > > > ------------------------------ > > Message: 3 > Date: Tue, 20 Sep 2011 08:11:58 -0700 > From: Andreas Prlic > Subject: Re: [Biojava-l] why can't biojava fold RNA? > To: quan zou > Cc: biojava-l at biojava.org > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > If all your code is in Java and you have binaries for some external > software you can easily wrap it from Java and trigger the execution. > > Andreas > > On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote: >> Thanks, however, there is no java code. it cannot be imported into my java >> project. >> >> 2011/9/20 Andreas Prlic >>> >>> Hi Quan, >>> >>> the Vienna RNA package is available as open source. ?Did you take a look >>> at it? >>> >>> Andreas >>> >>> >>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou wrote: >>>> Dear all, >>>> >>>> ? ? ? ?Is there any java program or jar which can fold a RNA sequence to >>>> a >>>> secondary structure? Such as RNAfold? >>>> >>>> ? ? ? 
Why RNAfold/ Vienna Package have not been contained in Biojava? >>>> >>>> ? ? ? ? ? ? ? ? Quan >>>> _______________________________________________ >>>> Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>> >> >> > > > > ------------------------------ > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > End of Biojava-l Digest, Vol 104, Issue 7 > ***************************************** From daniel.quest at gmail.com Tue Sep 20 21:14:12 2011 From: daniel.quest at gmail.com (Daniel Quest) Date: Tue, 20 Sep 2011 21:14:12 -0400 Subject: [Biojava-l] why can't biojava fold RNA? In-Reply-To: References: Message-ID: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com> I don't quite think this answers the question. If you want to execute c/c++/fortran/legacy code from java, you can do a system exec from within java. Or you can play around with jni but my experiences with that have not been good Does biojava have the ability to execute a stand alone program? If not I have some code lying around you guys can have Daniel Sent from my iPod On Sep 20, 2011, at 11:11 AM, Andreas Prlic wrote: > If all your code is in Java and you have binaries for some external > software you can easily wrap it from Java and trigger the execution. > > Andreas > > On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote: >> Thanks, however, there is no java code. it cannot be imported into my java >> project. >> >> 2011/9/20 Andreas Prlic >>> >>> Hi Quan, >>> >>> the Vienna RNA package is available as open source. Did you take a look >>> at it? >>> >>> Andreas >>> >>> >>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou wrote: >>>> Dear all, >>>> >>>> Is there any java program or jar which can fold a RNA sequence to >>>> a >>>> secondary structure? Such as RNAfold? 
>>>> >>>> Why RNAfold/ Vienna Package have not been contained in Biojava? >>>> >>>> Quan >>>> _______________________________________________ >>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>> >> >> > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From phidias51 at gmail.com Tue Sep 20 21:33:56 2011 From: phidias51 at gmail.com (Mark Fortner) Date: Tue, 20 Sep 2011 18:33:56 -0700 Subject: [Biojava-l] why can't biojava fold RNA? In-Reply-To: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com> References: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com> Message-ID: You might try JNA instead of JNI. It's easier to use. However, I'd be reticent about calling it frequently in a loop, as there is some overhead involved. Mark On Tue, Sep 20, 2011 at 6:14 PM, Daniel Quest wrote: > I don't quite think this answers the question. > > If you want to execute c/c++/fortran/legacy code from java, you can do a > system exec from within java. Or you can play around with jni but my > experiences with that have not been good > > Does biojava have the ability to execute a stand alone program? If not I > have some code lying around you guys can have > > Daniel > > Sent from my iPod > > On Sep 20, 2011, at 11:11 AM, Andreas Prlic wrote: > > > If all your code is in Java and you have binaries for some external > > software you can easily wrap it from Java and trigger the execution. > > > > Andreas > > > > On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote: > >> Thanks, however, there is no java code. it cannot be imported into my > java > >> project. > >> > >> 2011/9/20 Andreas Prlic > >>> > >>> Hi Quan, > >>> > >>> the Vienna RNA package is available as open source. Did you take a > look > >>> at it? 
> >>> > >>> Andreas > >>> > >>> > >>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou > wrote: > >>>> Dear all, > >>>> > >>>> Is there any java program or jar which can fold a RNA sequence > to > >>>> a > >>>> secondary structure? Such as RNAfold? > >>>> > >>>> Why RNAfold/ Vienna Package have not been contained in Biojava? > >>>> > >>>> Quan > >>>> _______________________________________________ > >>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l > >>>> > >> > >> > > > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From andreas at sdsc.edu Tue Sep 20 21:55:41 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Tue, 20 Sep 2011 18:55:41 -0700 Subject: [Biojava-l] why can't biojava fold RNA? In-Reply-To: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com> References: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com> Message-ID: A simple way to run an external program is via ProcessBuilder (since Java 1.5) http://download.oracle.com/javase/1.5.0/docs/api/java/lang/ProcessBuilder.html Andreas On Tue, Sep 20, 2011 at 6:14 PM, Daniel Quest wrote: > I don't quite think this answers the question. > > If you want to execute c/c++/fortran/legacy code from java, you can do a system exec from within java. ?Or you can play around with jni but my experiences with that have not been good > > Does biojava have the ability to execute a stand alone program? 
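
On the quoted question about executing a stand-alone program: a minimal sketch of the ProcessBuilder approach named at the top of this message could look like the following. It is an illustration under assumptions, not BioJava code. The class name `ExternalTool` and its `run` helper are hypothetical, and `echo` only stands in for a real binary such as RNAfold, whose name, arguments and input handling depend on the local installation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; not part of BioJava.
final class ExternalTool {

    /**
     * Starts the given command, collects everything it prints and
     * returns it as one string. Throws if the exit code is non-zero.
     */
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so that a single reader drains both streams.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        // Read the output completely before waiting, otherwise a full pipe
        // buffer can block the child process.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command " + command + " exited with code " + exitCode);
        }
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // "echo" is a placeholder for the external program to be wrapped.
        System.out.print(run(Arrays.asList("echo", "GGGAAAUCC")));
    }
}
```

Passing the command as a list of separate arguments avoids shell quoting problems. A program that reads its sequence from standard input would additionally need `process.getOutputStream()` to be written and closed before the output is read.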
> If not I have some code lying around you guys can have
>
> Daniel
>
> Sent from my iPod
>
> On Sep 20, 2011, at 11:11 AM, Andreas Prlic wrote:
>
>> If all your code is in Java and you have binaries for some external
>> software you can easily wrap it from Java and trigger the execution.
>>
>> Andreas
>>
>> On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote:
>>> Thanks, however, there is no java code. it cannot be imported into my java
>>> project.
>>>
>>> 2011/9/20 Andreas Prlic
>>>>
>>>> Hi Quan,
>>>>
>>>> the Vienna RNA package is available as open source. Did you take a look
>>>> at it?
>>>>
>>>> Andreas
>>>>
>>>>
>>>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou wrote:
>>>>> Dear all,
>>>>>
>>>>> Is there any java program or jar which can fold a RNA sequence to
>>>>> a
>>>>> secondary structure? Such as RNAfold?
>>>>>
>>>>> Why RNAfold/ Vienna Package have not been contained in Biojava?
>>>>>
>>>>> Quan
>>>>> _______________________________________________
>>>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>>>
>>>
>>>
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From hk3 at sanger.ac.uk Wed Sep 21 04:36:35 2011
From: hk3 at sanger.ac.uk (Hashem Koohy)
Date: Wed, 21 Sep 2011 09:36:35 +0100
Subject: [Biojava-l] NullPointerException in Hidden Markov Model
Message-ID:

Hi,

I have set up an HMM and I am trying to print out the Viterbi path, but instead I get the following error message. I feel it must be something silly, but I cannot spot it. I would really appreciate any clue.

Exception in thread "main" java.lang.NullPointerException
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:648)
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:512)
    at hmmwithdirichletprior.VITERBI2.main(VITERBI2.java:188)

This is how I call my HMM model from main function:

    MarkovModel mm = block.makeMarkovModel(observedSeqAlphabet, tranProb, strProb, statesAndDirPar, "dirichletMM");
    DP dp = new SingleDP(mm);
    SymbolList [] symList = {symbolList};
    StatePath viterbiPath = dp.viterbi(symList, ScoreType.PROBABILITY);

And here is my makeMarkovModel method:

    private static MarkovModel makeMarkovModel(
            SimpleAlphabet alphabet,
            double [][] transitionMatrix,
            double [] startProbabilities,
            LinkedHashMap<String, double[]> statesAndCorresponingDirichletParameters,
            String modelName
            ) throws Exception{
        SimpleMarkovModel mm = new SimpleMarkovModel(1, alphabet, modelName );
        int [] advance = { 1 };
        int numberOfStates = statesAndCorresponingDirichletParameters.size();
        ArrayList<String> stateNames = new ArrayList<String>();
        ArrayList<double[]> arraysOfDirichletParameters = new ArrayList<double[]>();
        for(Map.Entry<String, double[]> me:statesAndCorresponingDirichletParameters.entrySet() ){
            double oneDirichletPar [] = me.getValue();
            arraysOfDirichletParameters.add(oneDirichletPar);
            String oneState = me.getKey();
            stateNames.add(oneState);
        }
        //Distribution initiation
        Distribution [] dists = new Distribution[numberOfStates];
        EmissionState [] emissionStates = new SimpleEmissionState[numberOfStates];
        for(int i = 0; i< numberOfStates;i++){
            dists[i] = DistributionFactory.DEFAULT.createDistribution(alphabet);
            String oneState = stateNames.get(i);
            emissionStates[i] = new SimpleEmissionState(oneState, Annotation.EMPTY_ANNOTATION,advance,dists[i] );
        }
        //add states to the model
        for(State s:emissionStates ){
            try{
                mm.addState(s);
            }
            catch(Exception e){
                throw new Exception("Can't add states to model!");
            }
        }
        //create transitions
        State magic = mm.magicalState();
        for(State i:emissionStates ){
            mm.createTransition(magic, i);
            for(State j: emissionStates){
                mm.createTransition(i, j);
            }
        }
        //set up emission scores
        for(Iterator i = alphabet.iterator(); i.hasNext();){
            AtomicSymbol oneSym = (AtomicSymbol) i.next();
            double [] symbolsInThisSymbolAsArrayOfDoubles = makeArrayOfDoublesFromASymbol(oneSym);
            for(int d =0 ; d< dists.length; d++){
                double dirichletPar [] = arraysOfDirichletParameters.get(d);
                double oneDensity = DirichletDist.density(dirichletPar, symbolsInThisSymbolAsArrayOfDoubles);
                dists[d].setWeight(oneSym,oneDensity );
            }
        }
        //set transition scores
        Distribution transDist;
        //magical to others
        transDist = mm.getWeights(mm.magicalState());
        for(int i=0; i

Hi,

I just committed a couple of new features related to protein structure alignments and working with protein domains:

- better support for SCOP domains:
- rather than using a local SCOP installation the default is now to fetch SCOP domain data via remote web service calls (much more memory friendly)
- the user interface for structure alignments now has a new auto-suggest panel that makes it easier to enter SCOP domain IDs
- structures that don't have SCOP domains assigned, can get automatically split into domains with ProteinDomainParser (also can be fetched from remote)
- database searches now display % sequence ID in the alignment as a new column.

Andreas

From member at linkedin.com Thu Sep 29 05:46:20 2011
From: member at linkedin.com (Huijie Qiao via LinkedIn)
Date: Thu, 29 Sep 2011 09:46:20 +0000 (UTC)
Subject: [Biojava-l] Invitation to connect on LinkedIn
Message-ID: <1746095084.5009685.1317289580465.JavaMail.app@ela4-bed79.prod>

LinkedIn
------------

Huijie Qiao requested to add you as a connection on LinkedIn:

------------------------------------------

Christopher,

I'd like to add you to my professional network on LinkedIn.
From jayunit100 at gmail.com Fri Sep 16 03:32:13 2011 From: jayunit100 at gmail.com (Jay Vyas) Date: Thu, 15 Sep 2011 23:32:13 -0400 Subject: [Biojava-l] atomcache with a file Message-ID: Hi guys : Anyone want to share a code snippet to use AtomCache to load a PDB File from disk ?

--
Jay Vyas
MMSB/UCHC

From andreas at sdsc.edu Fri Sep 16 03:52:01 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Thu, 15 Sep 2011 20:52:01 -0700
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Hi Jay,

you can use it like this:

    // by default PDB files will be stored in a temporary directory
    // there are two ways of configuring a directory, that can get re-used multiple times:
    // A) set the environment variable PDB_DIR
    // B) call cache.setPath(path)
    AtomCache cache = new AtomCache();

    try {
        // alternative: try d4hhba_ 4hhb.A 4hhb.A:1-100
        Structure s = cache.getStructure("4hhb");
        System.out.println(s);
    } catch (Exception e) {
        e.printStackTrace();
    }

Hope that helps,

Andreas

On Thu, Sep 15, 2011 at 8:32 PM, Jay Vyas wrote:
> Hi guys : Anyone want to share a code snippet to use AtomCache to load a PDB
> File from disk ?
>
> --
> Jay Vyas
> MMSB/UCHC
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From jayunit100 at gmail.com Fri Sep 16 04:28:16 2011
From: jayunit100 at gmail.com (Jay Vyas)
Date: Fri, 16 Sep 2011 00:28:16 -0400
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Thanks... but I have a directory of pdb files

/Users/jay/pdb/a1.pdb

Is it possible for atom cache to initially load the file from this directory ?

I don't care where it caches the data ... Its just that my pdb file is not at RCSB, and it appears that atomcache is set up to go to RCSB by default to find a pdb file.
On Thu, Sep 15, 2011 at 11:52 PM, Andreas Prlic wrote: > Hi Jay, > > you can use it like this: > > // by default PDB files will be stored in a temporary > directory > // there are two ways of configuring a directory, that can > get > re-used multiple times: > // A) set the environment variable PDB_DIR > // B) call cache.setPath(path) > AtomCache cache = new AtomCache(); > > try { > // alternative: try d4hhba_ 4hhb.A 4hhb.A:1-100 > Structure s = cache.getStructure("4hhb"); > System.out.println(s); > } catch (Exception e) { > > e.printStackTrace(); > > } > > Hope that helps, > > Andreas > > > > On Thu, Sep 15, 2011 at 8:32 PM, Jay Vyas wrote: > > Hi guys : Anyone want to share a code snippet to use AtomCache to load a > PDB > > File from disk ? > > > > -- > > Jay Vyas > > MMSB/UCHC > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > -- Jay Vyas MMSB/UCHC From jayunit100 at gmail.com Fri Sep 16 05:55:47 2011 From: jayunit100 at gmail.com (Jay Vyas) Date: Fri, 16 Sep 2011 01:55:47 -0400 Subject: [Biojava-l] atomcache with a file In-Reply-To: References: Message-ID: nevermind, I guess I can just use the PDBFileReader and store that in the atomcache. Figured maybe there was a shortcut. From andreas at sdsc.edu Fri Sep 16 14:43:18 2011 From: andreas at sdsc.edu (Andreas Prlic) Date: Fri, 16 Sep 2011 07:43:18 -0700 Subject: [Biojava-l] atomcache with a file In-Reply-To: References: Message-ID: The AtomCache is built around naming conventions and supports more than just loading PDB files. It can also load the representation of a SCOP domain as a Structure object (and some other things). If you think it is useful we can add something like a pre-fix which would tell the cache to load local files with a nonstandard name. Something like : private:a1 could be the name to describe your own file. 

Andreas

On Thu, Sep 15, 2011 at 9:28 PM, Jay Vyas wrote:
> Thanks... but I have a directory of pdb files
>
> /Users/jay/pdb/a1.pdb
>
> Is it possible for atom cache to initially load the file from this directory
> ?
>
> I don't care where it caches the data ... Its just that my pdb file is not
> at RCSB, and it appears that atomcache
> is set up to go to RCSB by default to find a pdb file.
>
> On Thu, Sep 15, 2011 at 11:52 PM, Andreas Prlic wrote:
>>
>> Hi Jay,
>>
>> you can use it like this:
>>
>>     // by default PDB files will be stored in a temporary directory
>>     // there are two ways of configuring a directory, that can get re-used multiple times:
>>     // A) set the environment variable PDB_DIR
>>     // B) call cache.setPath(path)
>>     AtomCache cache = new AtomCache();
>>
>>     try {
>>         // alternative: try d4hhba_ 4hhb.A 4hhb.A:1-100
>>         Structure s = cache.getStructure("4hhb");
>>         System.out.println(s);
>>     } catch (Exception e) {
>>
>>         e.printStackTrace();
>>
>>     }
>>
>> Hope that helps,
>>
>> Andreas
>>
>>
>>
>> On Thu, Sep 15, 2011 at 8:32 PM, Jay Vyas wrote:
>> > Hi guys : Anyone want to share a code snippet to use AtomCache to load a
>> > PDB
>> > File from disk ?
>> > >> > -- >> > Jay Vyas >> > MMSB/UCHC >> > _______________________________________________ >> > Biojava-l mailing list ?- ?Biojava-l at lists.open-bio.org >> > http://lists.open-bio.org/mailman/listinfo/biojava-l >> > > > > > -- > Jay Vyas > MMSB/UCHC > From amr_alhossary at hotmail.com Fri Sep 16 15:10:04 2011 From: amr_alhossary at hotmail.com (Amr AL-Hossary) Date: Fri, 16 Sep 2011 17:10:04 +0200 Subject: [Biojava-l] atomcache with a file In-Reply-To: References: Message-ID: AtomCache searches for files in a standard naming format currently, you can imitate that standard format, by renaming your local files, and then they will be loaded just like normally. Amr -------------------------------------------------- From: "Andreas Prlic" Sent: Friday, September 16, 2011 4:43 PM To: "Jay Vyas" Cc: Subject: Re: [Biojava-l] atomcache with a file > The AtomCache is built around naming conventions and supports more > than just loading PDB files. It can also load the representation of a > SCOP domain as a Structure object (and some other things). If you > think it is useful we can add something like a pre-fix which would > tell the cache to load local files with a nonstandard name. Something > like : private:a1 could be the name to describe your own file. > > Andreas > > On Thu, Sep 15, 2011 at 9:28 PM, Jay Vyas wrote: >> Thanks... but I have a directory of pdb files >> >> /Users/jay/pdb/a1.pdb >> >> Is it possible for atom cache to initially load the file from this >> directory >> ? >> >> I don't care where it caches the data ... Its just that my pdb file is >> not >> at RCSB, and it appears that atomcache >> is set up to go to RCSB by default to find a pdb file. 
>> >> On Thu, Sep 15, 2011 at 11:52 PM, Andreas Prlic wrote: >>> >>> Hi Jay, >>> >>> you can use it like this: >>> >>> // by default PDB files will be stored in a temporary >>> directory >>> // there are two ways of configuring a directory, that can >>> get >>> re-used multiple times: >>> // A) set the environment variable PDB_DIR >>> // B) call cache.setPath(path) >>> AtomCache cache = new AtomCache(); >>> >>> try { >>> // alternative: try d4hhba_ 4hhb.A 4hhb.A:1-100 >>> Structure s = cache.getStructure("4hhb"); >>> System.out.println(s); >>> } catch (Exception e) { >>> >>> e.printStackTrace(); >>> >>> } >>> >>> Hope that helps, >>> >>> Andreas >>> >>> >>> >>> On Thu, Sep 15, 2011 at 8:32 PM, Jay Vyas wrote: >>> > Hi guys : Anyone want to share a code snippet to use AtomCache to load >>> > a >>> > PDB >>> > File from disk ? >>> > >>> > -- >>> > Jay Vyas >>> > MMSB/UCHC >>> > _______________________________________________ >>> > Biojava-l mailing list - Biojava-l at lists.open-bio.org >>> > http://lists.open-bio.org/mailman/listinfo/biojava-l >>> > >> >> >> >> -- >> Jay Vyas >> MMSB/UCHC >> > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > From sbliven at ucsd.edu Fri Sep 16 18:13:56 2011 From: sbliven at ucsd.edu (Spencer Bliven) Date: Fri, 16 Sep 2011 11:13:56 -0700 Subject: [Biojava-l] atomcache with a file In-Reply-To: References: Message-ID: I think that AtomCache is the wrong tool for this job. If you have private PDB files you should just use a PDBFileReader to get each one (see http://biojava.org/wiki/BioJava:CookBook:PDB:read3.0#Getting_more_control). I don't think AtomCache should be modified to support this. -Spencer On Fri, Sep 16, 2011 at 07:43, Andreas Prlic wrote: > If you > think it is useful we can add something like a pre-fix which would > tell the cache to load local files with a nonstandard name. 
Something
> like : private:a1 could be the name to describe your own file.
>

From andreas at sdsc.edu Fri Sep 16 22:57:36 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Fri, 16 Sep 2011 15:57:36 -0700
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Hi,

I had a quick offline discussion with Spencer about this: the AtomCache provides a local mirror for public data, expects a specific organisation of files, and only supports one top-level path under which all data is stored. If we added support for private files to it, public and personal files would end up mixed in the same location, which in all likelihood would cause confusion. As such it makes more sense to access your personal files with the PDBFileReader and manage private files yourself. It is a one-liner in both classes anyway...

Andreas

On Fri, Sep 16, 2011 at 11:13 AM, Spencer Bliven wrote:
> I think that AtomCache is the wrong tool for this job. If you have private
> PDB files you should just use a PDBFileReader to get each one (see
> http://biojava.org/wiki/BioJava:CookBook:PDB:read3.0#Getting_more_control).
> I don't think AtomCache should be modified to support this.
>
> -Spencer
>
> On Fri, Sep 16, 2011 at 07:43, Andreas Prlic wrote:
>>
>> If you
>> think it is useful we can add something like a pre-fix which would
>> tell the cache to load local files with a nonstandard name. Something
>> like : private:a1 could be the name to describe your own file.
>
>

From jayunit100 at gmail.com Sat Sep 17 00:15:27 2011
From: jayunit100 at gmail.com (JAX)
Date: Fri, 16 Sep 2011 20:15:27 -0400
Subject: [Biojava-l] atomcache with a file
In-Reply-To:
References:
Message-ID:

Ok cool... Maybe it would be good for the AtomCache Javadoc to state explicitly that it is only for PDB files, and not any other structures.
Jay Vyas MMSB UCHC On Sep 16, 2011, at 6:57 PM, Andreas Prlic wrote: > Hi, > > had a quick offline discussion with Spencer about this: The AtomCache > provides a local mirror for public data, expects a specific > organisation of files and only supports one toplevel-path under which > all data is stored. If we would add support for private files to it, > public and personal files would end mixed up in the same location, > which in all likelihood would cause confusion. As such it makes more > sense to access your personal files with the PDBFileReader and manage > private files yourself. It is a one-liner in both classes anyways... > > Andreas > > > > > On Fri, Sep 16, 2011 at 11:13 AM, Spencer Bliven wrote: >> I think that AtomCache is the wrong tool for this job. If you have private >> PDB files you should just use a PDBFileReader to get each one (see >> http://biojava.org/wiki/BioJava:CookBook:PDB:read3.0#Getting_more_control). >> I don't think AtomCache should be modified to support this. >> >> -Spencer >> >> On Fri, Sep 16, 2011 at 07:43, Andreas Prlic wrote: >>> >>> If you >>> think it is useful we can add something like a pre-fix which would >>> tell the cache to load local files with a nonstandard name. Something >>> like : private:a1 could be the name to describe your own file. >> >> From dasarnow at gmail.com Sat Sep 17 22:05:19 2011 From: dasarnow at gmail.com (Daniel Asarnow) Date: Sat, 17 Sep 2011 15:05:19 -0700 Subject: [Biojava-l] atomcache with a file In-Reply-To: References: Message-ID: Jay, The AtomCache also gives a performance boost when re-reading PDB files by using a cached IO mode. I believe you can manually activate this mode for your PDBFileReader; see line 115 of AtomCache class for the example. I'm not completely sure on this, but I think any time you are re-reading the same structure within the same JVM instance there is an advantage. -da On Fri, Sep 16, 2011 at 17:15, JAX wrote: > Ok cool... 
Maybe it we be good to make the atom cache java doc should > explicitly state that it is only for PDB files, and not any other > structures. > > Jay Vyas > MMSB > UCHC > > On Sep 16, 2011, at 6:57 PM, Andreas Prlic wrote: > > > Hi, > > > > had a quick offline discussion with Spencer about this: The AtomCache > > provides a local mirror for public data, expects a specific > > organisation of files and only supports one toplevel-path under which > > all data is stored. If we would add support for private files to it, > > public and personal files would end mixed up in the same location, > > which in all likelihood would cause confusion. As such it makes more > > sense to access your personal files with the PDBFileReader and manage > > private files yourself. It is a one-liner in both classes anyways... > > > > Andreas > > > > > > > > > > On Fri, Sep 16, 2011 at 11:13 AM, Spencer Bliven > wrote: > >> I think that AtomCache is the wrong tool for this job. If you have > private > >> PDB files you should just use a PDBFileReader to get each one (see > >> > http://biojava.org/wiki/BioJava:CookBook:PDB:read3.0#Getting_more_control > ). > >> I don't think AtomCache should be modified to support this. > >> > >> -Spencer > >> > >> On Fri, Sep 16, 2011 at 07:43, Andreas Prlic wrote: > >>> > >>> If you > >>> think it is useful we can add something like a pre-fix which would > >>> tell the cache to load local files with a nonstandard name. Something > >>> like : private:a1 could be the name to describe your own file. 
> >> > >> > >
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From andreas at sdsc.edu Sun Sep 18 23:50:27 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Sun, 18 Sep 2011 16:50:27 -0700
Subject: [Biojava-l] [Biojava-dev] A question about multiple alignment
In-Reply-To: <20110917182831.74436c7k6ttd1u04@www.nexusmail.uwaterloo.ca>
References: <20110916135621.148774hjgujhkmos@www.nexusmail.uwaterloo.ca> <20110917182831.74436c7k6ttd1u04@www.nexusmail.uwaterloo.ca>
Message-ID:

Hi Shahab,

Sounds like you want to use an identity matrix for the alignment..

Andreas

On Sat, Sep 17, 2011 at 3:28 PM, Shahab Kamali wrote:
> Thanks Andreas,
> I want two components that have different names to have 0 alignment score.
> My application is not about bio-compounds,so I can use anything else rather
> than ProteinSequence and AminoAcidCompound. I just need to align sequences
> of arbitrary alphabets. Could you suggest me a solution please?
> Thanks a lot,
> Shahab
>
> Quoting Andreas Prlic :
>
>> Hi Shahab,
>>
>> did you take a look at the substitution matrix, if it is scoring your
>> sequences according to your expectation? Looks like in your
>> theoretical example the alignment of B and D is favorable, i.e. it has
>> a positive alignment score..
>>
>> Andreas
>>
>>
>> On Fri, Sep 16, 2011 at 10:56 AM, Shahab Kamali
>> wrote:
>>>
>>> Hi,
>>> I am using BioJava in a pattern mining project. I want to align a set of
>>> relatively short sequences. For example to align {"ABCE", "ABCE", "ADE",
>>> "ADE"}.
>>>
>>> This is a part of my code:
>>>
>>> SubstitutionMatrix matrix = new
>>>                    SimpleSubstitutionMatrix();
>>> GuideTree gt = new
>>> GuideTree<ProteinSequence,
>>> AminoAcidCompound>(lst,Alignments.getAllPairsScorers(lst,
>>>                   Alignments.PairwiseSequenceScorerType.GLOBAL,  new
>>>                   SimpleGapPenalty((short)0,(short)0), matrix));
>>>            Profile profile =
>>>
>>> Alignments.getProgressiveAlignment(gt,Alignments.ProfileProfileAlignerType.GLOBAL,
>>> new SimpleGapPenalty((short)0,(short)0),matrix);
>>>
>>> The result of the above code is:
>>> ABCE
>>> ABCE
>>> AD-E
>>> AD-E
>>>
>>> But what I need is
>>> A-BCE
>>> A-BCE
>>> AD--E
>>> AD--E
>>> or
>>> ABC-E
>>> ABC-E
>>> A--DE
>>> A--DE
>>>
>>> Do you have any suggestion?
>>> Thanks,
>>> Shahab
>>>
>>>
>>>
>>> _______________________________________________
>>> biojava-dev mailing list
>>> biojava-dev at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>
>>
>
>
>
>
>

From su24 at st-andrews.ac.uk Mon Sep 19 10:09:46 2011
From: su24 at st-andrews.ac.uk (Saif Ur-Rehman)
Date: Mon, 19 Sep 2011 11:09:46 +0100
Subject: [Biojava-l] UniprotParser
Message-ID:

Dear all,

I am having issues with the BioJava UniProt parser as detailed below:

Code:

BufferedReader br = new BufferedReader(new FileReader(files[index]));
Namespace ns = RichObjectFactory.getDefaultNamespace();
RichSequenceIterator iterator = RichSequence.IOTools.readUniProt(br, ns);
while (iterator.hasNext())
{
    try
    {
        RichSequence rs = iterator.nextRichSequence();
    }
    catch (NoSuchElementException e)
    {
    }
    catch (BioException e)
    {
        e.printStackTrace();
    }
}

The file I am using is downloaded from the link:

ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_fungi.dat.gz

The problem is that the parser works for a subset of the IDs within the file
and on others throws an exception.

Sample Exception stack trace:

*** Start of trace *************************

    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113)
    at uniprot.mp.main(mp.java:161)
Caused by: org.biojava.bio.seq.io.ParseException:

A Exception Has Occurred During Parsing.
Please submit the details that follow to biojava-l at biojava.org or post a bug
report to http://bugzilla.open-bio.org/

Format_object=org.biojavax.bio.seq.io.UniProtFormat
Accession=P53031
Id=
Comments=
Parse_block=RN [1]RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].RC STRAIN=NCYC 2512;RX MEDLINE=97082501; PubMed=8923737; DOI=10.1002/(SICI)1097-0061(199610)12:13<1321::AID-YEA27>3.0.CO;2-6;RA Rodriguez P.L., Ali R., Serrano R.;RT "CtCdc55p and CtHa13p: two putative regulatory proteins from Candida tropicalis with long acidic domains.";RL Yeast 12:1321-1329(1996).
Stack trace follows ....

    at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:615)
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:110)
    ... 1 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at org.biojavax.bio.seq.io.UniProtFormat.readRichSequence(UniProtFormat.java:486)
    ... 2 more
org.biojava.bio.BioException: Could not read sequence
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:113)
    at uniprot.mp.main(mp.java:161)
Caused by: org.biojava.bio.seq.io.ParseException: Name has not been supplied

********End of trace**********************************

An example of an Id that worked is:

ZYM1_SCHPO

while an ID that didn't work is:

ZUO1_YEAST

Thanks a lot in advance.
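
For a report like this it can help to cut the failing entry (ZUO1_YEAST here) out of the large .dat file so that it can be attached to the bug report on its own. The following is only a minimal sketch in plain Java. It assumes the usual UniProt flat-file layout, in which every entry starts with an ID line and ends with a line beginning with //. The class and method names are hypothetical and are not part of BioJava.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical helper (not a BioJava class): extracts one entry from a UniProt flat file. */
class UniProtEntryExtractor {

    /**
     * Returns the raw text of the entry whose ID line carries the given entry name
     * (for example "ZUO1_YEAST"), or null if the input contains no such entry.
     */
    static String extractEntry(Reader in, String entryName) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder current = new StringBuilder(); // lines of the entry being read
        boolean wanted = false;                      // true while inside the requested entry
        String line;
        while ((line = reader.readLine()) != null) {
            current.append(line).append('\n');
            if (line.startsWith("ID ")) {
                // Assumed ID line layout: "ID   ENTRY_NAME   Reviewed;   433 AA."
                String[] fields = line.trim().split("\\s+");
                wanted = fields.length > 1 && fields[1].equals(entryName);
            } else if (line.startsWith("//")) {
                // "//" closes the current entry
                if (wanted) {
                    return current.toString();
                }
                current.setLength(0);
                wanted = false;
            }
        }
        // Also accept a final entry that lacks the closing "//" line
        return wanted ? current.toString() : null;
    }
}
```

The downloaded file is gzip-compressed, so it would have to be unpacked first, or read through a java.util.zip.GZIPInputStream wrapped in an InputStreamReader. The extracted text can then be saved to a small file and passed to the parser by itself to check whether the exception is tied to that one entry.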
Cheers,
Saif

--
Saif Ur-Rehman

Centre for Evolution, Genes and Genomics
Harold Mitchell Building
University of St Andrews
St Andrews
Fife
KY16 9TH
UK

Tel: +44 131 5572556
Fax: +44 1334 463366

From khalil.elmazouari at gmail.com Mon Sep 19 16:35:07 2011
From: khalil.elmazouari at gmail.com (Khalil El Mazouari)
Date: Mon, 19 Sep 2011 18:35:07 +0200
Subject: [Biojava-l] Biojava-l Digest, Vol 104, Issue 6
In-Reply-To:
References:
Message-ID:

Hi

take a look at http://en.wikipedia.org/wiki/Levenshtein_distance

Regards,

khalil

From guoer713108 at gmail.com Tue Sep 20 04:18:55 2011
From: guoer713108 at gmail.com (quan zou)
Date: Tue, 20 Sep 2011 12:18:55 +0800
Subject: [Biojava-l] why can't biojava fold RNA?
Message-ID:

Dear all,

Is there any java program or jar which can fold a RNA sequence to a
secondary structure? Such as RNAfold?

Why RNAfold/ Vienna Package have not been contained in Biojava?

Quan

From andreas at sdsc.edu Tue Sep 20 15:11:58 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 20 Sep 2011 08:11:58 -0700
Subject: [Biojava-l] why can't biojava fold RNA?
In-Reply-To:
References:
Message-ID:

If all your code is in Java and you have binaries for some external
software you can easily wrap it from Java and trigger the execution.

Andreas

On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote:
> Thanks, however, there is no java code. it cannot be imported into my java
> project.
>
> 2011/9/20 Andreas Prlic
>>
>> Hi Quan,
>>
>> the Vienna RNA package is available as open source. Did you take a look
>> at it?
>>
>> Andreas
>>
>>
>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou wrote:
>> > Dear all,
>> >
>> > Is there any java program or jar which can fold a RNA sequence to
>> > a
>> > secondary structure? Such as RNAfold?
>> >
>> > Why RNAfold/ Vienna Package have not been contained in Biojava?
>> >
>> >
Quan
>> > _______________________________________________
>> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >
>
>

From jayunit100 at gmail.com Tue Sep 20 16:09:29 2011
From: jayunit100 at gmail.com (JAX)
Date: Tue, 20 Sep 2011 12:09:29 -0400
Subject: [Biojava-l] Biojava-l Digest, Vol 104, Issue 7
In-Reply-To:
References:
Message-ID: <2E47CB3E-5859-47E7-846A-50618E95F925@gmail.com>

pairwise similarity is better than Levenshtein for short sequences..... Just count the total number of matching letter pairs, divided by the length of the longest string between the two words.

There is a great article about this online called "How to strike a match".

We used it for the sequence mining here, and were able to find important homologs and reproduce known results : http://jb.asm.org/cgi/content/short/JB.00018-11v1

Jay Vyas
MMSB
UCHC

From daniel.quest at gmail.com Wed Sep 21 01:14:12 2011
From: daniel.quest at gmail.com (Daniel Quest)
Date: Tue, 20 Sep 2011 21:14:12 -0400
Subject: [Biojava-l] why can't biojava fold RNA?
In-Reply-To:
References:
Message-ID: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com>

I don't quite think this answers the question.

If you want to execute c/c++/fortran/legacy code from java, you can do a system exec from within java. Or you can play around with jni but my experiences with that have not been good

Does biojava have the ability to execute a stand alone program? If not I have some code lying around you guys can have

Daniel

Sent from my iPod

On Sep 20, 2011, at 11:11 AM, Andreas Prlic wrote:

> If all your code is in Java and you have binaries for some external
> software you can easily wrap it from Java and trigger the execution.
>
> Andreas
>
> On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote:
>> Thanks, however, there is no java code. it cannot be imported into my java
>> project.
>>
>> 2011/9/20 Andreas Prlic
>>>
>>> Hi Quan,
>>>
>>> the Vienna RNA package is available as open source. Did you take a look
>>> at it?
>>>
>>> Andreas
>>>
>>>
>>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou wrote:
>>>> Dear all,
>>>>
>>>> Is there any java program or jar which can fold a RNA sequence to
>>>> a
>>>> secondary structure? Such as RNAfold?
>>>> >>>> Why RNAfold/ Vienna Package have not been contained in Biojava? >>>> >>>> Quan >>>> _______________________________________________ >>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l >>>> >> >> > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From phidias51 at gmail.com Wed Sep 21 01:33:56 2011 From: phidias51 at gmail.com (Mark Fortner) Date: Tue, 20 Sep 2011 18:33:56 -0700 Subject: [Biojava-l] why can't biojava fold RNA? In-Reply-To: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com> References: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com> Message-ID: You might try JNA instead of JNI. It's easier to use. However, I'd be reticent about calling it frequently in a loop, as there is some overhead involved. Mark On Tue, Sep 20, 2011 at 6:14 PM, Daniel Quest wrote: > I don't quite think this answers the question. > > If you want to execute c/c++/fortran/legacy code from java, you can do a > system exec from within java. Or you can play around with jni but my > experiences with that have not been good > > Does biojava have the ability to execute a stand alone program? If not I > have some code lying around you guys can have > > Daniel > > Sent from my iPod > > On Sep 20, 2011, at 11:11 AM, Andreas Prlic wrote: > > > If all your code is in Java and you have binaries for some external > > software you can easily wrap it from Java and trigger the execution. > > > > Andreas > > > > On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote: > >> Thanks, however, there is no java code. it cannot be imported into my > java > >> project. > >> > >> 2011/9/20 Andreas Prlic > >>> > >>> Hi Quan, > >>> > >>> the Vienna RNA package is available as open source. Did you take a > look > >>> at it? 
> >>>
> >>> Andreas
> >>>
> >>>
> >>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou
> wrote:
> >>>> Dear all,
> >>>>
> >>>> Is there any java program or jar which can fold a RNA sequence
> to
> >>>> a
> >>>> secondary structure? Such as RNAfold?
> >>>>
> >>>> Why RNAfold/ Vienna Package have not been contained in Biojava?
> >>>>
> >>>> Quan
> >>>> _______________________________________________
> >>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> >>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
> >>>>
> >>
> >>
> >
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From andreas at sdsc.edu Wed Sep 21 01:55:41 2011
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 20 Sep 2011 18:55:41 -0700
Subject: [Biojava-l] why can't biojava fold RNA?
In-Reply-To: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com>
References: <941905BD-0FA0-4F20-BC85-100E2F595D9B@gmail.com>
Message-ID:

A simple way to run an external program is via ProcessBuilder (since Java 1.5)

http://download.oracle.com/javase/1.5.0/docs/api/java/lang/ProcessBuilder.html

Andreas

On Tue, Sep 20, 2011 at 6:14 PM, Daniel Quest wrote:
> I don't quite think this answers the question.
>
> If you want to execute c/c++/fortran/legacy code from java, you can do a system exec from within java. Or you can play around with jni but my experiences with that have not been good
>
> Does biojava have the ability to execute a stand alone program? If not I have some code lying around you guys can have
>
> Daniel
>
> Sent from my iPod
>
> On Sep 20, 2011, at 11:11 AM, Andreas Prlic wrote:
>
>> If all your code is in Java and you have binaries for some external
>> software you can easily wrap it from Java and trigger the execution.
>>
>> Andreas
>>
>> On Tue, Sep 20, 2011 at 2:09 AM, quan zou wrote:
>>> Thanks, however, there is no java code. it cannot be imported into my java
>>> project.
>>>
>>> 2011/9/20 Andreas Prlic
>>>>
>>>> Hi Quan,
>>>>
>>>> the Vienna RNA package is available as open source. Did you take a look
>>>> at it?
>>>>
>>>> Andreas
>>>>
>>>>
>>>> On Mon, Sep 19, 2011 at 9:18 PM, quan zou wrote:
>>>>> Dear all,
>>>>>
>>>>> Is there any java program or jar which can fold a RNA sequence to
>>>>> a
>>>>> secondary structure? Such as RNAfold?
>>>>>
>>>>> Why RNAfold/ Vienna Package have not been contained in Biojava?
>>>>>
>>>>> Quan
>>>>> _______________________________________________
>>>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>>>
>>>
>>>
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From hk3 at sanger.ac.uk Wed Sep 21 08:36:35 2011
From: hk3 at sanger.ac.uk (Hashem Koohy)
Date: Wed, 21 Sep 2011 09:36:35 +0100
Subject: [Biojava-l] NullPointerException in Hidden Markov Model
Message-ID:

Hi,

I have set up an HMM and I am trying to print out the Viterbi path, but instead I get the following error message. I feel it must be something silly but I cannot spot it. I would really appreciate any clue.

Exception in thread "main" java.lang.NullPointerException
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:648)
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:512)
    at hmmwithdirichletprior.VITERBI2.main(VITERBI2.java:188)

This is how I call my HMM from the main function:

MarkovModel mm = block.makeMarkovModel(observedSeqAlphabet, tranProb, strProb, statesAndDirPar, "dirichletMM");
DP dp = new SingleDP(mm);
SymbolList [] symList = {symbolList};
StatePath viterbiPath = dp.viterbi(symList, ScoreType.PROBABILITY);

And here is my makeMarkovModel method:

private static MarkovModel makeMarkovModel(
        SimpleAlphabet alphabet,
        double [][] transitionMatrix,
        double [] startProbabilities,
        LinkedHashMap<String, double[]> statesAndCorresponingDirichletParameters,
        String modelName
        ) throws Exception{
    SimpleMarkovModel mm = new SimpleMarkovModel(1, alphabet, modelName );
    int [] advance = { 1 };
    int numberOfStates = statesAndCorresponingDirichletParameters.size();
    ArrayList<String> stateNames = new ArrayList<String>();
    ArrayList<double[]> arraysOfDirichletParameters = new ArrayList<double[]>();
    for(Map.Entry<String, double[]> me:statesAndCorresponingDirichletParameters.entrySet() ){
        double oneDirichletPar [] = me.getValue();
        arraysOfDirichletParameters.add(oneDirichletPar);
        String oneState = me.getKey();
        stateNames.add(oneState);
    }
    //Distribution initiation
    Distribution [] dists = new Distribution[numberOfStates];
    EmissionState [] emissionStates = new SimpleEmissionState[numberOfStates];
    for(int i = 0; i< numberOfStates;i++){
        dists[i] = DistributionFactory.DEFAULT.createDistribution(alphabet);
        String oneState = stateNames.get(i);
        emissionStates[i] = new SimpleEmissionState(oneState, Annotation.EMPTY_ANNOTATION,advance,dists[i] );
    }
    //add states to the model
    for(State s:emissionStates ){
        try{
            mm.addState(s);
        }
        catch(Exception e){
            throw new Exception("Can't add states to model!");
        }
    }
    //create transitions
    State magic = mm.magicalState();
    for(State i:emissionStates ){
        mm.createTransition(magic, i);
        for(State j: emissionStates){
            mm.createTransition(i, j);
        }
    }
    //set up emission scores
    for(Iterator i = alphabet.iterator(); i.hasNext();){
        AtomicSymbol oneSym = (AtomicSymbol) i.next();
        double [] symbolsInThisSymbolAsArrayOfDoubles = makeArrayOfDoublesFromASymbol(oneSym);
        for(int d =0 ; d< dists.length; d++){
            double dirichletPar [] = arraysOfDirichletParameters.get(d);
            double oneDensity = DirichletDist.density(dirichletPar, symbolsInThisSymbolAsArrayOfDoubles);
            dists[d].setWeight(oneSym,oneDensity );
        }
    }
    //set transition scores
    Distribution transDist;
    //magical to others
    transDist = mm.getWeights(mm.magicalState());
    for(int i=0; i

Hi,

I just committed a couple of new features related to protein structure
alignments and working with protein domains:

- better support for SCOP domains:
  - rather than using a local SCOP installation, the default is now to
    fetch SCOP domain data via remote web service calls (much more
    memory friendly)
  - the user interface for structure alignments now has a new
    auto-suggest panel that makes it easier to enter SCOP domain IDs
- structures that don't have SCOP domains assigned can be split into
  domains automatically with ProteinDomainParser (these assignments can
  also be fetched from a remote server)
- database searches now display % sequence ID in the alignment as a new
  column.

Andreas

From member at linkedin.com Thu Sep 29 09:46:20 2011
From: member at linkedin.com (Huijie Qiao via LinkedIn)
Date: Thu, 29 Sep 2011 09:46:20 +0000 (UTC)
Subject: [Biojava-l] Invitation to connect on LinkedIn
Message-ID: <1746095084.5009685.1317289580465.JavaMail.app@ela4-bed79.prod>

LinkedIn
------------

Huijie Qiao requested to add you as a connection on LinkedIn:

------------------------------------------

Christopher,

I'd like to add you to my professional network on LinkedIn.
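
Regarding the "why can't biojava fold RNA?" thread above: the ProcessBuilder approach Andreas points to could be sketched roughly as follows. This is only a minimal illustration, not BioJava code. The class name ExternalTool and the method run are hypothetical names chosen for this example. The helper starts an external command, writes a string to its standard input, and returns whatever the program prints.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper (not part of BioJava): runs an external program
// through java.lang.ProcessBuilder and captures its output.
class ExternalTool {

    // Starts `command`, sends `stdin` to it, and returns its combined
    // stdout/stderr as a string. Throws IOException on a non-zero exit code.
    static String run(List<String> command, String stdin)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process process = pb.start();

        // Write the input and close the stream so the program sees end-of-file.
        // Note: for very large inputs, writing and reading should happen on
        // separate threads to avoid filling the pipe buffers.
        try (OutputStream os = process.getOutputStream()) {
            os.write(stdin.getBytes(StandardCharsets.UTF_8));
        }

        // Read everything the program printed.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream is = process.getInputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = is.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException(command + " exited with code " + exitCode);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Assuming a Vienna RNA binary named RNAfold is installed on the PATH and reads sequences from standard input, a call might look like ExternalTool.run(java.util.Arrays.asList("RNAfold"), "GGGAAAUCC\n"). The returned text would then still need to be parsed by the caller.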