From raphael.andre.bauer at gmail.com Tue Feb 3 04:28:52 2009
From: raphael.andre.bauer at gmail.com (Raphael André Bauer)
Date: Tue, 3 Feb 2009 10:28:52 +0100
Subject: [Biojava-l] Different chain lengths using struc.getChainByPDB and iterating over a structure and it's chain
Message-ID: <9b46aa30902030128nfea24f5x94e7882a8d80e542@mail.gmail.com>

Hey everybody,

suppose the following PDB file:

....
ATOM   2847  OD1 ASP A 370      14.470   9.116  16.553  1.00 57.47           O
ATOM   2848  OD2 ASP A 370      12.601   9.313  15.410  1.00 57.77           O
ATOM   2849  N   ASN A 371      17.114   9.968  15.979  1.00 46.04           N
ATOM   2850  CA  ASN A 371      18.177   9.691  16.946  1.00 46.67           C
ATOM   2851  C   ASN A 371      19.555  10.072  16.403  1.00 46.52           C
ATOM   2852  O   ASN A 371      19.848  11.250  16.177  1.00 43.37           O
ATOM   2853  CB  ASN A 371      18.174   8.209  17.345  1.00 49.12           C
ATOM   2854  CG  ASN A 371      16.969   7.829  18.189  1.00 52.62           C
ATOM   2855  OD1 ASN A 371      16.762   6.652  18.502  1.00 55.93           O
ATOM   2856  ND2 ASN A 371      16.168   8.824  18.568  1.00 52.06           N
TER    2857      ASN A 371
HETATM 2858 ZN    ZN A 400      43.465  12.734  15.463  1.00 24.94          ZN
HETATM 2859 ZN    ZN A 401      46.414  14.165  16.934  1.00 30.14          ZN
HETATM 2860  N9  ADE A1114      50.418   9.820  20.124  1.00 80.38           N
HETATM 2861  C8  ADE A1114      50.489  11.169  20.334  1.00 80.76           C
HETATM 2862  N7  ADE A1114      49.492  11.773  19.748  1.00 80.58           N
HETATM 2863  C5  ADE A1114      48.715  10.855  19.124  1.00 79.77           C
....

1) If I use

//struc being the parsed pdb file
returnChain = struc.getChainByPDB("A", 1);

I get chain A EXCEPT everything after the TER (ZN and ADE is excluded
in this example).

2) If I iterate over the complete file using something like

//struc being the parsed pdb file
int nrModels = struc.nrModels();

for (int modelNr = 0; modelNr < nrModels; modelNr++) {
    List chains = struc.getModel(modelNr);
    int nrChains = chains.size();
    for (int chainNr = 0; chainNr < nrChains; chainNr++) {

        //this chain contains also the ZN and the ADE

So maybe I am getting something wrong here. Is this effect wanted by
the PDB parser?

Thanks!
Raphael From andreas.prlic at gmail.com Tue Feb 3 10:30:48 2009 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Tue, 3 Feb 2009 07:30:48 -0800 Subject: [Biojava-l] Different chain lengths using struc.getChainByPDB and iterating over a structure and it's chain In-Reply-To: <9b46aa30902030128nfea24f5x94e7882a8d80e542@mail.gmail.com> References: <9b46aa30902030128nfea24f5x94e7882a8d80e542@mail.gmail.com> Message-ID: <1CAC30DC-06D7-4B0D-8799-0692C9591B11@gmail.com> Hi Rafael, > > suppose the following PDB file: > ... > > 1) If I use > > //struc being the parsed pdb file > returnChain = struc.getChainByPDB("A", 1); > > I get chain A EXCEPT everything after the TER (ZN and ADE is excluded > in this example). This seems wrong. You should get the full chain here. TERs are getting ignored. Let me check what might cause this. You are on the latest version from svn? > 2) If I iterate over the complete file using something like > > //struc being the parsed pdb file > int nrModels = struc.nrModels(); > > for (int modelNr = 0; modelNr < nrModels; modelNr++) { > List chains = struc.getModel(modelNr); > int nrChains = chains.size(); > for (int chainNr = 0; chainNr < nrChains; chainNr++) { > > //this chain contains also the ZN and the ADE That's what I would expect... Andreas > > > > > So maybe I am getting something wrong here. Is this effect wanted by > the PDB parser? > > > Thanks! 
> > > Raphael > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From raphael.andre.bauer at gmail.com Tue Feb 3 14:01:09 2009 From: raphael.andre.bauer at gmail.com (=?UTF-8?Q?Raphael_Andr=C3=A9_Bauer?=) Date: Tue, 3 Feb 2009 20:01:09 +0100 Subject: [Biojava-l] Different chain lengths using struc.getChainByPDB and iterating over a structure and it's chain In-Reply-To: <1CAC30DC-06D7-4B0D-8799-0692C9591B11@gmail.com> References: <9b46aa30902030128nfea24f5x94e7882a8d80e542@mail.gmail.com> <1CAC30DC-06D7-4B0D-8799-0692C9591B11@gmail.com> Message-ID: <9b46aa30902031101r77aa6236qc6e92c4140f39ca8@mail.gmail.com> On Tue, Feb 3, 2009 at 4:30 PM, Andreas Prlic wrote: ... >> 1) If I use >> >> //struc being the parsed pdb file >> returnChain = struc.getChainByPDB("A", 1); >> >> I get chain A EXCEPT everything after the TER (ZN and ADE is excluded >> in this example). > > This seems wrong. You should get the full chain here. TERs are getting > ignored. Let me check what might cause this. You are on the latest version > from svn? Thanks for the quick response. And sorry for having bothered you - after writing a quick unit test it became pretty clear that it is my own fault. I thought I double checked that yesterday (and I should write the unit test before writing to the list - I know). The "supposed-to-be-bug-but-was-my-own-failure" came from setParseCAOnly(). Sorry and thanks again! Raphael From aumanga at biggjapan.com Thu Feb 5 00:57:37 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Thu, 05 Feb 2009 14:57:37 +0900 Subject: [Biojava-l] [Off the Topic] Homology prediction of large number of sequences! 
Message-ID: <498A7FD1.2060203@biggjapan.com>

Greetings all,

I have about 60,000 sequences and want to predict the 3D structure (homology modelling) for all of them. I decided to use Modeller first, and it seems that it would take months to predict all those structures.
Is there any high-performance prediction server which I could use for this? I do not want to submit data through a web-based interface; I want to automate the process, because 60K is a lot of data!

Thanks in advance,
umanga

From markjschreiber at gmail.com Thu Feb 5 08:51:39 2009
From: markjschreiber at gmail.com (Mark Schreiber)
Date: Thu, 5 Feb 2009 21:51:39 +0800
Subject: [Biojava-l] [Off the Topic] Homology prediction of large number of sequences!
In-Reply-To: <498A7FD1.2060203@biggjapan.com>
References: <498A7FD1.2060203@biggjapan.com>
Message-ID: <93b45ca50902050551s11574e7br9526f08ddae156f4@mail.gmail.com>

I don't know about high performance but the problem is very parallel so
if you can get your hands on a cluster you can run one sequence on
each processor. Maybe there is a cloud compute facility you could use
if a cluster is not available.

You might also want to try a pilot experiment first. There is bound to
be some redundancy in some of those proteins which will give nearly
identical models.

- Mark

On Thu, Feb 5, 2009 at 1:57 PM, Ashika Umanga Umagiliya wrote:
>
> Greetings all,
>
> I have about 60,000 sequences and want to predict the 3D structure (homology modelling) for all of them. I decided to use Modeller first, and it seems that it would take months to predict all those structures.
> Is there any high-performance prediction server which I could use for this? I do not want to submit data through a web-based interface; I want to automate the process, because 60K is a lot of data!
> > Thanks in advance, > umanga > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From willishf at ufl.edu Fri Feb 6 14:39:17 2009 From: willishf at ufl.edu (Scooter Willis) Date: Fri, 6 Feb 2009 14:39:17 -0500 Subject: [Biojava-l] Phylogenetic Trees Message-ID: <7ceb4beb0902061139l6d49add6qb592cd389b0e176b@mail.gmail.com> I was looking for Java code to construct a phylogenetic tree from aligned sequence data. I came across org.biojavax.bio.phylo as an experimental package in the latest release of BioJava. The docs were thin and couldn't find any code examples so was wondering if anyone has used the package with success. I also couldn't decide if the package is intended to load pre-constructed trees as Java objects or could be used to construct a tree from aligned sequence data. Does anyone know of an accepted Java package for tree construction besides what is in JalView? Thanks Scooter From simon.rayner.cn at gmail.com Sun Feb 8 04:45:46 2009 From: simon.rayner.cn at gmail.com (simon rayner) Date: Sun, 8 Feb 2009 04:45:46 -0500 Subject: [Biojava-l] alignments Message-ID: <616a29410902080145m2ea0fe0fye091ac646071ac13@mail.gmail.com> Hi, i have a question about Alignment and FlexibleAlignment objects. Basically, i have a sequence alignment in FASTA format so followed a code sample i found from a previous posted query: BufferedReader br = new BufferedReader(new FileReader("file.txt")); FastaAlignmentFormat faf = new FastaAlignmentFormat(); Alignment aligned = faf.read( br ); so now i have an alignment stored in an Alignment object. 1. I do something with these sequences using the access methods for the Alignment object. something like java.util.Iterator iter = l.listIterator(); while(iter.hasNext()) { String currLabel = (String) iter.next(); String currAA = aligned.symbolListForLabel(currLabel) .seqString().substring(base, base+1); } 2. 
Now i want to do the same analysis but on a random sample of
sequences from this alignment.
So to get a subset of these sequences i created a FlexibleAlignment
and grabbed the sequences
i need from the original alignment...

    Alignment al;
    Iterator l = al.symbolListIterator();
    java.util.List names = al.getLabels();
    Iterator n = names.listIterator();
    java.util.ArrayList seqList = new java.util.ArrayList(1);
    String refs = "";
    while(refs.length() < al.length())
    {
      refs = refs.concat("x");
    }

    try{
      FlexibleAlignment FA = new FlexibleAlignment(seqList);
      SymbolList aa = ProteinTools.createProtein(refs);

      seqList.add(
        new SimpleAlignmentElement(
          "reference",
          aa,
          new RangeLocation(1, al.length()))
        );
      while (l.hasNext())
      {
        try{
          SymbolList sal = (SymbolList)l.next();
          String name = (String)n.next();

          FA.addSequence(new SimpleAlignmentElement(name,
            sal,
            new RangeLocation(1, al.length())));

        }
        catch (org.biojava.bio.BioException ex)
        {
          System.out.print("couldn't add sequence to alignment\n");
          System.err.print(ex);
        }
      }
      this.aligned = FA;
    }
    catch (org.biojava.bio.BioException ex)
    {
      System.out.print("couldn't add sequence to alignment\n");
      System.err.print(ex);
    }

So now i have my alignment of random sequences in a FlexibleAlignment
object, but my original analysis
was written for an Alignment object (because that's the format that
FastaAlignmentFormat spat back at me).

So, can i cast a FlexibleAlignment object into an Alignment object? or
am i just going about it the wrong way?

p.s. Why are there so many diverse Alignment related objects?
(AbstractULAlignment,
AbstractULAlignment.SubULAlignment, FlexibleAlignment, RelabeledAlignment,
EmptyPairwiseAlignment, SimpleAlignment, HappyAlignment,
SadAlignment..) Is there any documentation (other than
the standard java docs) to clarify which object when?

thanks!
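
A minimal, self-contained sketch of the two ideas in the question above: analysis code written against an interface accepts any implementation of it, and a random subset of rows can be drawn by shuffling the labels. The names SimpleAln, MapAln, column and sample are hypothetical stand-ins invented for this illustration. They are not BioJava classes, and the real Alignment and FlexibleAlignment APIs differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class AlignmentSketch {

    /** Hypothetical stand-in for an alignment interface (NOT the BioJava API). */
    interface SimpleAln {
        List<String> labels();
        String rowFor(String label);
    }

    /** One possible implementation, backed by an insertion-ordered map. */
    static final class MapAln implements SimpleAln {
        private final Map<String, String> rows = new LinkedHashMap<>();

        MapAln add(String label, String row) {
            rows.put(label, row);
            return this;
        }

        public List<String> labels() {
            return new ArrayList<>(rows.keySet());
        }

        public String rowFor(String label) {
            return rows.get(label);
        }
    }

    /** Analysis written against the interface: residues at a 0-based column, in label order. */
    static String column(SimpleAln aln, int col) {
        StringBuilder sb = new StringBuilder();
        for (String label : aln.labels()) {
            sb.append(aln.rowFor(label).charAt(col));
        }
        return sb.toString();
    }

    /** Draws up to n rows at random and returns them as a new alignment. */
    static SimpleAln sample(SimpleAln aln, int n, Random rng) {
        List<String> labels = new ArrayList<>(aln.labels());
        Collections.shuffle(labels, rng);
        MapAln out = new MapAln();
        for (String label : labels.subList(0, Math.min(n, labels.size()))) {
            out.add(label, aln.rowFor(label));
        }
        return out;
    }

    public static void main(String[] args) {
        SimpleAln full = new MapAln()
                .add("s1", "ACGT")
                .add("s2", "AGGT")
                .add("s3", "ATGT");
        System.out.println(column(full, 1)); // prints CGT

        // The same analysis method runs unchanged on the random subset.
        SimpleAln subset = sample(full, 2, new Random(42));
        System.out.println(subset.labels() + " -> " + column(subset, 1));
    }
}
```

Because column() only asks for the interface type, no cast is needed when it is handed the subset. Passing a seeded Random keeps a given sample reproducible between runs.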
From holland at eaglegenomics.com Mon Feb 9 09:43:03 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 09 Feb 2009 14:43:03 +0000 Subject: [Biojava-l] Phylogenetic Trees In-Reply-To: <7ceb4beb0902061139l6d49add6qb592cd389b0e176b@mail.gmail.com> References: <7ceb4beb0902061139l6d49add6qb592cd389b0e176b@mail.gmail.com> Message-ID: <499040F7.3050009@eaglegenomics.com> The package you found in biojavax was written with the intention of parsing existing tree structures from NEXUS files, and to perform basic analyses of them (neighbour joining, etc.). The code can represent trees regardless of their source, and can output them as NEXUS files, but there are no algorithm implementations in that package for constructing new trees and so you'd probably have to write those from scratch. cheers, Richard Scooter Willis wrote: > I was looking for Java code to construct a phylogenetic tree from > aligned sequence data. I came across org.biojavax.bio.phylo as an > experimental package in the latest release of BioJava. The docs were > thin and couldn't find any code examples so was wondering if anyone > has used the package with success. > > I also couldn't decide if the package is intended to load > pre-constructed trees as Java objects or could be used to construct a > tree from aligned sequence data. > > Does anyone know of an accepted Java package for tree construction > besides what is in JalView? 
> > Thanks > > Scooter > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From holland at eaglegenomics.com Mon Feb 9 09:47:45 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 09 Feb 2009 14:47:45 +0000 Subject: [Biojava-l] alignments In-Reply-To: <616a29410902080145m2ea0fe0fye091ac646071ac13@mail.gmail.com> References: <616a29410902080145m2ea0fe0fye091ac646071ac13@mail.gmail.com> Message-ID: <49904211.7010404@eaglegenomics.com> Yes, you can cast it back. In fact you don't need to actually cast it at all, as Java will do that for you automatically. FlexibleAlignment is an implementation of the Alignment interface, which defines general behaviour for all alignments. If your algorithm is written to accept a parameter of type Alignment, then it will accept any object which implements the Alignment interface, including FlexibleAlignment, Happy and SadAlignment, etc. etc. Unfortunately I'm not sure why there are so many different kinds of alignment objects. I think it was something to do with wanting to make some that are modifiable, and others that are read-only, then making different kinds that store the underlying alignment data in different/more efficient/faster/cleverer representations. It does rather over-complicate things, however if your algorithm only requires Alignment as the type of its parameter, then you can safely use all and any of these as they all implement the Alignment interface. cheers, Richard simon rayner wrote: > Hi, > > i have a question about Alignment and FlexibleAlignment objects. 
> Basically, i have a sequence alignment in FASTA format so followed a code > sample i found from a previous posted query: > > BufferedReader br = new BufferedReader(new FileReader("file.txt")); > FastaAlignmentFormat faf = new FastaAlignmentFormat(); > Alignment aligned = faf.read( br ); > > so now i have an alignment stored in an Alignment object. > > 1. I do something with these sequences using the access methods for > the Alignment object. > something like > > java.util.Iterator iter = l.listIterator(); > while(iter.hasNext()) > { > String currLabel = (String) iter.next(); > String currAA > = aligned.symbolListForLabel(currLabel) > .seqString().substring(base, base+1); > } > > 2. Now i want to do the same analysis but on a random sample of > sequences from this alignment. > So to get a subset of these sequences i created a FlexibleAlignment > and grabbed the sequences > i need from the original alignment... > > Alignment al; > Iterator l = al.symbolListIterator(); > java.util.List names = al.getLabels(); > Iterator n = names.listIterator(); > java.util.ArrayList seqList = new java.util.ArrayList(1); > String refs = ""; > while(refs.length() < al.length()) > { > refs = refs.concat("x"); > } > > try{ > FlexibleAlignment FA = new FlexibleAlignment(seqList); > SymbolList aa = ProteinTools.createProtein(refs); > > seqList.add( > new SimpleAlignmentElement( > "reference", > aa, > new RangeLocation(1, al.length())) > ); > while (l.hasNext()) > { > try{ > SymbolList sal = (SymbolList)l.next(); > String name = (String)n.next(); > > FA.addSequence(new SimpleAlignmentElement(name, > sal, > new RangeLocation(1, al.length()))); > > } > catch (org.biojava.bio.BioException ex) > { > System.out.print("couldn't add sequence to alignment\n"); > System.err.print(ex); > } > } > this.aligned = FA; > } > catch (org.biojava.bio.BioException ex) > { > System.out.print("couldn't add sequence to alignment\n"); > System.err.print(ex); > } > > So now i have my alignment of random 
sequences in a FlexibleAlignment > object, but my original analysis > was written for an Alignment object (because that's the format that > FastaAlignmentFormat spat back at me. > > So, can i cast a FlexibleAlignment object into an Alignment object? or > am i just going about it the wrong way? > > p.s. Why are there so many diverse Alignment related objects? > (AbstractULAlignment, > AbstractULAlignment.SubULAlignment, FlexibleAlignment, RelabeledAlignment, > EmptyPairwiseAlignment, SimpleAlignment, HappyAlignment, > SadAlignment..) Is there any documentation (other than > the standard java docs) to clarify which object when? > > thanks! > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From aumanga at biggjapan.com Mon Feb 9 20:27:34 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Tue, 10 Feb 2009 10:27:34 +0900 Subject: [Biojava-l] [Off the topic] Selecting template from clustering tree? Message-ID: <4990D806.2030204@biggjapan.com> Greetings, I use 'Modeller' for homology modeling and, during the process I get a clustering tree shown as following.And I have to select the best template using this tree.I was wondering whether there are any tools/features in BioJava that I can use for this? I don't know the conditions to select the best template (I am from Computer science background),but seems I can you fuzzylogic ,if I know the criteria well. Any tips? Best Regards, umanga ------------------------------------------------------------------------------------------------- Sequence identity comparison (ID_TABLE): Diagonal ... number of residues; Upper triangle ... number of identical residues; Lower triangle ... % sequence identity, id/min(length). 

         1b8pA @1 1bdmA @1 1civA @2 5mdhA @2 7mdhA @2 1smkA @2
1b8pA @1      327      194      147      151      153       49
1bdmA @1       61      318      152      167      155       56
1civA @2       45       48      374      139      304       53
5mdhA @2       46       53       42      333      139       57
7mdhA @2       47       49       87       42      351       48
1smkA @2       16       18       17       18       15      313


Weighted pair-group average clustering based on a distance matrix:

                                      .----------------------- 1b8pA @1.9    39.0000
                                      |
                             .-------------------------------- 1bdmA @1.8    50.5000
                             |
                         .------------------------------------ 5mdhA @2.4    55.3750
                         |
                         |                                .--- 1civA @2.8    13.0000
                         |                                |
   .---------------------------------------------------------- 7mdhA @2.4    83.2500
   |
 .------------------------------------------------------------ 1smkA @2.5

 +----+----+----+----+----+----+----+----+----+----+----+----+
86.0600   73.4150   60.7700   48.1250   35.4800   22.8350   10.1900
     79.7375   67.0925   54.4475   41.8025   29.1575   16.5125

From bopfannkuche at gmx.de Thu Feb 12 07:06:36 2009
From: bopfannkuche at gmx.de (Björn Ole Pfannkuche)
Date: Thu, 12 Feb 2009 13:06:36 +0100
Subject: [Biojava-l] DNA/DNA Folding
Message-ID: <499410CC.8090704@gmx.de>

Hi!

Has any of you written a programme or classes aimed at folding
DNA? Like the projects UNAFold or PairFold, but in Java...

Greetz
BO

From willishf at ufl.edu Thu Feb 12 07:13:57 2009
From: willishf at ufl.edu (Scooter Willis)
Date: Thu, 12 Feb 2009 07:13:57 -0500
Subject: [Biojava-l] Multiple Sequence Viewer
Message-ID: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>

I am getting my feet wet with the various parts and pieces of BioJava and
needed a multiple sequence viewer. I searched around in the code and did
some Google searches, and it looks like one does not exist, but someone
faked one with a MultipleLineReader to render each sequence as a label.

As part of the learning exercise I cloned SequencePanel to
MultipleSequencePanel and SequenceRenderContext to
MultipleSequenceRenderContext (to define returning a collection of
sequences) and SymbolSequenceRenderer to SymbolMultipleSequenceRenderer.
In SymbolMultipleSequenceRenderer I handle the rendering of multiple sequences. I need to add in labels as a non scroll region on the left and then add support for horizontal and vertical scroll. Before I finish the last 10% of what I think need to be done and always takes the longest does a multiple sequence viewer exist in BioJava and I can't find it? Thanks Scooter Willis From russ at kepler-eng.com Thu Feb 12 09:31:39 2009 From: russ at kepler-eng.com (Russ Kepler) Date: Thu, 12 Feb 2009 07:31:39 -0700 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> Message-ID: <200902120731.39355.russ@kepler-eng.com> On Thursday 12 February 2009 05:13:57 Scooter Willis wrote: > Before I finish the last 10% of what I think need to be done and always > takes the longest does a multiple sequence viewer exist in BioJava and I > can't find it? In mine I simply have a TranslatedSequencePanel with a MultiLineRenderer, the MLR having a renderer for each sequence or sequence element). This gave me the control over the individual sequence display I needed for cursors and such. From willishf at ufl.edu Thu Feb 12 10:20:55 2009 From: willishf at ufl.edu (Scooter Willis) Date: Thu, 12 Feb 2009 10:20:55 -0500 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <200902120731.39355.russ@kepler-eng.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> <200902120731.39355.russ@kepler-eng.com> Message-ID: <7ceb4beb0902120720y7d9dad85sb232575e9f3b4099@mail.gmail.com> My plan is to model the multiple sequence viewer used in JALView. Sequence name on the left to fire events when clicked. I also typically work with large sequences so need both horizontal and vertical scroll which does not appear to be in the current viewers. 
I do see support for viewing a range of a sequence but nothing to scroll or I could be missing something on how the current viewer would scroll not using a ScrollPane. This should be something easy to do in BioJava based on the importance of viewing multiple sequence alignments. Thanks Scooter Willis On Thu, Feb 12, 2009 at 9:31 AM, Russ Kepler wrote: > On Thursday 12 February 2009 05:13:57 Scooter Willis wrote: > > > Before I finish the last 10% of what I think need to be done and always > > takes the longest does a multiple sequence viewer exist in BioJava and I > > can't find it? > > In mine I simply have a TranslatedSequencePanel with a MultiLineRenderer, > the > MLR having a renderer for each sequence or sequence element). This gave me > the control over the individual sequence display I needed for cursors and > such. > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From russ at kepler-eng.com Thu Feb 12 10:55:19 2009 From: russ at kepler-eng.com (Russ Kepler) Date: Thu, 12 Feb 2009 08:55:19 -0700 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <7ceb4beb0902120720y7d9dad85sb232575e9f3b4099@mail.gmail.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> <200902120731.39355.russ@kepler-eng.com> <7ceb4beb0902120720y7d9dad85sb232575e9f3b4099@mail.gmail.com> Message-ID: <200902120855.20480.russ@kepler-eng.com> On Thursday 12 February 2009 08:20:55 Scooter Willis wrote: > My plan is to model the multiple sequence viewer used in JALView. Sequence > name on the left to fire events when clicked. I also typically work with > large sequences so need both horizontal and vertical scroll which does not > appear to be in the current viewers. I do see support for viewing a range > of a sequence but nothing to scroll or I could be missing something on how > the current viewer would scroll not using a ScrollPane. 

I guess I don't see the problem with using a ScrollPane, it works well in the
app I've written.

> This should be something easy to do in BioJava based on the importance of
> viewing multiple sequence alignments.

It's not all *that* hard, in my case it's a ScrollPane with a
TranslatedSequencePanel with a MultiLineRenderer with a bunch of
GappedRenderers containing individual sequence or other renderers. The TSP
holds the alignment and feeds the individual renderers with the appropriate
sequence.

Somewhere around here I have something I scratched up in BioJava 1.4 that
worked as a Clustal alignment viewer, I'll see if I can find it and pop it
your way.

From hlapp at gmx.net Fri Feb 13 11:53:53 2009
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 13 Feb 2009 11:53:53 -0500
Subject: [Biojava-l] Google Summer of Code: Call for Bio* Volunteers
Message-ID: <1F570555-12DF-42DF-8D0E-95AAE298D76A@gmx.net>

Google is committed to running the Summer of Code program [1] again this
year. It will be for the 5th time.

In broad strokes, the program funds what you might call remote summer
internships for students to contribute to an open-source software
project. Participating projects (or umbrella organizations) provide
project ideas and supply mentors that guide the work on those.
Students apply to a project within the program with specific project
ideas, based on those suggested or based on their own idea, get ranked
by the mentors of the project, and those accepted into the program get
paired up with mentors. Projects are chiefly about programming, the
coding period is 3 months (Jun-Aug), and there is no travel required
by either student or mentor. The program is global; other than the US
trade restrictions that Google is under, there are no restrictions as
to where student or mentor reside. The main motivations behind the
program are to recruit new contributors to open-source projects, and
to produce more open-source code. See the program FAQs [2] for more
information.
I've had the honor of being part of the program for the last two years, administering NESCent's participation as an organization [3] and in 2007 mentoring a student. I have to say I find it the most awesome open-source program since sliced bread (or the invention of BLAST if that means more to you). Despite that and sadly enough, there has been a dearth of participating bioinformatics projects (though some notable ones, such as CytoScape have participated). There have been two Bio* Summer of Code projects under the NESCent umbrella, one in 2007 [4] and one in 2008 [5]. I would be willing to volunteer to take the lead on and administer a full-blown participation of O|B|F as a Bio* umbrella organization, provided 1) at least one Bio* person volunteers to serve as backup administrator, and 2) enough Bio* contributors volunteer to serve as prospective mentors. Mentoring involves participating in creating the page of project ideas (I'd provide template and guidance), corresponding with applicants who have questions, participating in student application ranking, and for primary mentors (those directly assigned to a student) based on empirical evidence at least 5hrs/week of time spent with the student to help him/her get over obstacles or avoid wrong paths. I think almost all mentors would concur that the experience was very gratifying, but as a mentor you will be spending a non-negligible amount of time with the student. I think it is the student-mentor pairing and interaction, not the stipend, that in the end makes the participation for students uniquely productive in terms of learning, and different from simply contributing to the project of choice (which they could always do). For a personal impression for how the program is from a mentor perspective, I'll let Chris Fields speak who was the mentor for the 2008 phyloXML in BioPerl project. 
From a student's perspective, I'll leave it to the 2007 Biojava student
Bohyun Lee (blee34-at-mail.gatech.edu) and the 2008 BioPerl student
Mira Han (mirhan-at-indiana.edu) to comment (if they are still on the list).

So if you think this is a good idea for Bio* to be part of, if you
would like to help in some way, if you can see yourself as a mentor,
or if you are a lurking would-be student, please let yourself be
heard. Email either to the list or to me.

Cheers,

-hilmar

[1] http://code.google.com/soc/2008
[2] http://code.google.com/opensource/gsoc/2009/faqs.html
[3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007
    http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008
[4] http://biojava.org/wiki/BioJava:PhyloSOC07
[5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jeedward at yahoo.com Fri Feb 13 19:23:48 2009
From: jeedward at yahoo.com (John Edward)
Date: Fri, 13 Feb 2009 16:23:48 -0800 (PST)
Subject: [Biojava-l] Draft paper submission deadline extended: BCBGC-09
Message-ID: <421505.231.qm@web45914.mail.sp1.yahoo.com>

Draft paper submission deadline extended: BCBGC-09

The deadline for draft paper submission at the 2009 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-09) (website: http://www.PromoteResearch.org ) is extended due to numerous requests from the authors. The conference will be held during July 13-16 2009 in Orlando, FL, USA. We invite draft paper submissions. The conference will take place at the same time and venue where several other international conferences are taking place. The other conferences include:

* International Conference on Artificial Intelligence and Pattern Recognition (AIPR-09)
* International Conference on Automation, Robotics and Control Systems (ARCS-09)
* International Conference on Enterprise Information Systems and Web Technologies (EISWT-09)
* International Conference on High Performance Computing, Networking and Communication Systems (HPCNCS-09)
* International Conference on Information Security and Privacy (ISP-09)
* International Conference on Recent Advances in Information Technology and Applications (RAITA-09)
* International Conference on Software Engineering Theory and Practice (SETP-09)
* International Conference on Theory and Applications of Computational Science (TACS-09)
* International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-09)

The website http://www.PromoteResearch.org contains more details.

Sincerely
John Edward
Publicity committee

From anantpossible at gmail.com Sat Feb 14 00:33:25 2009
From: anantpossible at gmail.com (Anant Jain)
Date: Sat, 14 Feb 2009 11:03:25 +0530
Subject: [Biojava-l] Multiple Sequence Viewer
In-Reply-To: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>
References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>
Message-ID:

Good Morning,
I am a student of B.Tech Bioinformatics (4th year) & learning Biojava from
the Internet. Can you send me some important pdf documents on BioJava?
Thank You
ANANT JAIN
DYPBBI, PUNE, INDIA

From holland at eaglegenomics.com Sat Feb 14 01:39:40 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Sat, 14 Feb 2009 06:39:40 +0000
Subject: [Biojava-l] Multiple Sequence Viewer
In-Reply-To:
References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>
Message-ID: <4996672C.2050907@eaglegenomics.com>

Everything you need to know to get started is on our website:

http://www.biojava.org/

thanks,
Richard

Anant Jain wrote:
> Good Morning,
> I am a student of B.Tech Bioinformatics (4th year) & learning Biojava from
> Internet.
Can you sent me some important pdf documents on BioJava. > Thank You > ANANT JAIN > DYPBBI, PUNE, INDIA > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From markjschreiber at gmail.com Sat Feb 14 02:17:48 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Sat, 14 Feb 2009 15:17:48 +0800 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <4996672C.2050907@eaglegenomics.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> <4996672C.2050907@eaglegenomics.com> Message-ID: <93b45ca50902132317s34fbbab6pacd9f68080e10940@mail.gmail.com> Also look at the recent publication (http://dx.doi.org/10.1093/bioinformatics/btn397) - Mark On Sat, Feb 14, 2009 at 2:39 PM, Richard Holland wrote: > > Everything you need to know to get started is on our website: > > http://www.biojava.org/ > > thanks, > Richard > > Anant Jain wrote: > > Good Morning, > > I am student of B.Tech Bioinformatics (4th year) & learning Biojava from > > Internet. Can you sent me some important pdf documents on BioJava. 
> > Thank You
> > ANANT JAIN
> > DYPBBI, PUNE, INDIA
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> >
>
> --
> Richard Holland, BSc MBCS
> Finance Director, Eagle Genomics Ltd
> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
> http://www.eaglegenomics.com/
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

From anantpossible at gmail.com Tue Feb 17 00:55:00 2009
From: anantpossible at gmail.com (Anant Jain)
Date: Tue, 17 Feb 2009 11:25:00 +0530
Subject: [Biojava-l] A little problem
Message-ID:

Good Morning,
i want to retrieve dna sequence from a GenBank file. So, I am using SeqIOTools.readGenbank(br) method. There is one flaw, example of tutorials says that it will return a sequence but its returning SequenceIterator.
That's not a problem, we can get the sequence using nextSequence() method.

Now, i want to print the sequence which i have got from genbank file,so i used a for loop (below)

for (int pos = 1;pos {

System.out.println(seq.symbolAt(pos));

}

Problem 1: It prints the nucleotides in random order, means i want it from begining to last as i did in loop
plz help me out, because i want to write the sequence in file

Thank You
Anant Jain
PUNE, INDIA

References:
Message-ID: <499A7C65.5040401@eaglegenomics.com>

To convert the sequence into a String, use the seqString() method of the Sequence object.

Also you should take a look at the replacement for SeqIOTools - RichSequence.IOTools - as this is more up-to-date and handles file parsing more sensibly.

cheers,
Richard

Anant Jain wrote:
>
> Good Morning,
> i want to retrieve dna sequnce from a GenBank file. So, I am using
> SeqIOTools.redGenbank(br) method. There is one flaw, example of
> tutorials says that i will return a sequence but its returning
> SequenceIterator.
> That's not a problem, we can get the sequence using nextSequence() method.
>
> Now, i want to print the sequence which i have got from genbank file,so
> i used a for loop (below)
>
> for (int pos = 1;pos {
>
> System.out.println(seq.symbolAt(pos));
>
> }
>
>
>
> Problem 1: It prints the nucleotides in random order, means i want it
> from begining to last as i did in loop
> plz help me out, because i want to write the sequence in file
>
>
>
> Thank You
> Anant Jain
> PUNE, INDIA

--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/

From crackeur at comcast.net Wed Feb 18 22:25:13 2009
From: crackeur at comcast.net (crackeur at comcast.net)
Date: Thu, 19 Feb 2009 03:25:13 +0000 (UTC)
Subject: [Biojava-l] [ANN]VTD-XML 2.5
In-Reply-To: <499A7C65.5040401@eaglegenomics.com>
Message-ID: <2083482873.1632131235013913327.JavaMail.root@sz0167a.emeryville.ca.mail.comcast.net>

VTD-XML 2.5 is now released. Please go to https://sourceforge.net/project/showfiles.php?group_id=110612&package_id=120172&release_id=661376 to download the latest version.

Changes from Version 2.4 (2/2009)
* Added separate VTD indexing generating and loading (see http://vtd-xml.sf.net/persistence.html for further info)
* Integrated extended VTD supporting 256 GB doc (In Java only).
* Added duplicateNav() for replicating multiple VTDNav instances sharing XML, VTD and LC buffer (available in Java and C#).
* Various bug fixes and enhancements

From aumanga at biggjapan.com Wed Feb 18 21:50:01 2009
From: aumanga at biggjapan.com (Ashika Umanga Umagiliya)
Date: Thu, 19 Feb 2009 11:50:01 +0900
Subject: [Biojava-l] Saving back chromatogram?
Message-ID: <499CC8D9.4090906@biggjapan.com>

Greetings all,

I was able to display chromatogram using ABIFChromatogram class .
Now what I want to implement is a chromatogram Editor,where user can edit base calls and save aback to AB1 files.
Is there a way to do this in BioJava?
Best Regards, umanga From ayates at ebi.ac.uk Thu Feb 19 04:23:00 2009 From: ayates at ebi.ac.uk (Andy Yates) Date: Thu, 19 Feb 2009 09:23:00 +0000 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499CC8D9.4090906@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> Message-ID: <499D24F4.8040607@ebi.ac.uk> Hi Umanga, Unfortunately there is no write feature available in the BioJava API. My advice would be to store these new basecalls in a separate file & look into using the Staden IO package which does support write functions (not sure if it will write into AB1 but it will do SCF & ZTR). Sadly this is written in C so you will have to write either some glue code with JNI to get it working or write a small C program to munge an AB1 trace & your new file together. Sorry I can't be of any more help, Andy Ashika Umanga Umagiliya wrote: > Greetings all, > > I was able to display chromatogram using ABIFChromatogram class . > Now what I want to implement is a chromatogram Editor,where user can > edit base calls and save aback to AB1 files. > Is there a way to do this in BioJava? > > Best Regards, > umanga > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From holland at eaglegenomics.com Thu Feb 19 03:23:31 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Thu, 19 Feb 2009 08:23:31 +0000 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499CC8D9.4090906@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> Message-ID: <499D1703.4020803@eaglegenomics.com> No, BioJava does not include the ability to write ABI files. Technically you could write your own code to do it though because the file format is fully understood by the parser and is formally described by a paper linked to from the Javadoc for ABIFChromatogram. 
By combining the information from the paper with the code from the parser, it should be possible to create a writer. Richard Ashika Umanga Umagiliya wrote: > Greetings all, > > I was able to display chromatogram using ABIFChromatogram class . > Now what I want to implement is a chromatogram Editor,where user can > edit base calls and save aback to AB1 files. > Is there a way to do this in BioJava? > > Best Regards, > umanga > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From anantpossible at gmail.com Fri Feb 20 00:40:17 2009 From: anantpossible at gmail.com (Anant Jain) Date: Fri, 20 Feb 2009 11:10:17 +0530 Subject: [Biojava-l] Regarding BioJava Message-ID: Greetings all, Do we have any BioJava runtime eviorment like jre coz if i give a s/w to anybody then he also include all the jar files in his class path. If i am compile and run my biojava program from my editplus then its working but if i run it from command prompt then its giving error like unable to load PDBFilereader.class. plz tel me how to run our program through cmd From aumanga at biggjapan.com Fri Feb 20 00:48:45 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Fri, 20 Feb 2009 14:48:45 +0900 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499D24F4.8040607@ebi.ac.uk> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> Message-ID: <499E443D.30206@biggjapan.com> Greetings all, Thank you for all your answers. What I want to do is ,after modifying callbases, I need to use the sequences with 'phrad' foro DNA assembly.For 'phrad', I need to give Fasta files and Quality files. 
I can modify callbases using my chromatogram editor , and save the new sequence in Fasta file.But my problem is , If I change the original callbases from AB1 file ,does it effect the Qualitiy files also? Or can I use the original Quality files generated by 'phred' with the new callbases. Here is a image demonstrating my scenario: http://img3.imageshack.us/img3/3564/f92df0815fab523f3c72aa3qx7.png Step (2) Ab1 files are passed into 'phred' and Fasta,Quality files are generated. Step (3) If user want to edit callbases , he can use the chromatogram editor.Then the Fasta files generated by 'phred' are replaced with new onces generated by editor. My problem is can I use the same Quality files ,with the modified callbases ? Sorry if this is off the topic. Thanks in advance, Umanga Andy Yates wrote: > Hi Umanga, > > Unfortunately there is no write feature available in the BioJava API. My > advice would be to store these new basecalls in a separate file & look > into using the Staden IO package which does support write functions (not > sure if it will write into AB1 but it will do SCF & ZTR). Sadly this is > written in C so you will have to write either some glue code with JNI to > get it working or write a small C program to munge an AB1 trace & your > new file together. > > Sorry I can't be of any more help, > > Andy > > Ashika Umanga Umagiliya wrote: > >> Greetings all, >> >> I was able to display chromatogram using ABIFChromatogram class . >> Now what I want to implement is a chromatogram Editor,where user can >> edit base calls and save aback to AB1 files. >> Is there a way to do this in BioJava? 
>> >> Best Regards, >> umanga >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > > From russ at kepler-eng.com Fri Feb 20 01:16:46 2009 From: russ at kepler-eng.com (Russ Kepler) Date: Thu, 19 Feb 2009 23:16:46 -0700 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499E443D.30206@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> <499E443D.30206@biggjapan.com> Message-ID: <200902192316.46735.russ@kepler-eng.com> On Thursday 19 February 2009 22:48:45 Ashika Umanga Umagiliya wrote: > My problem is can I use the same Quality files ,with the modified > callbases ? Edit the quality data in parallel with the chromatogram. I would assume that the editor is relatively sure of their edits, so I would give them a high confidence level when I generate the trace fasta and quality file. From aumanga at biggjapan.com Fri Feb 20 01:54:15 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Fri, 20 Feb 2009 15:54:15 +0900 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <200902192316.46735.russ@kepler-eng.com> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> <499E443D.30206@biggjapan.com> <200902192316.46735.russ@kepler-eng.com> Message-ID: <499E5397.4060502@biggjapan.com> Cheers. I still haven't implemented the editor.I assume for basecall editing, I can use 'Edit' class. Like: ABIFChromatogram a; .. .. a.getBaseCalls().edit(new Edit(.......)) And for Quality values, I am hoping to read and store Qualitly files generated by 'phred'.And when user edit the basecall, I am planing to edit stored quality values accordinly. Finally the fasta file is generated using ABIFChromatogram.getBaseCalls()... and Quality file will be generated using the structure I used above. Any issues with this approach? 
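As a rough illustration of the plan above (base calls and quality values edited in parallel, then written out as a FASTA file and a quality file), a minimal sketch in plain Java might look like the following. `EditableRead` and its methods are hypothetical names and not BioJava API, the score of 99 for an edited base is only an assumed stand-in for "high confidence", and each record is written on a single line without the line wrapping that phred output uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava: holds base calls and
// phred-style quality values side by side so every edit touches both.
class EditableRead {
    private final String name;
    private final StringBuilder bases;
    private final List<Integer> quals = new ArrayList<>();

    EditableRead(String name, String bases, int[] quals) {
        if (bases.length() != quals.length) {
            throw new IllegalArgumentException("one quality value per base is required");
        }
        this.name = name;
        this.bases = new StringBuilder(bases);
        for (int q : quals) {
            this.quals.add(q);
        }
    }

    // Substitute the base at a 0-based index; the caller chooses the new quality.
    void replace(int index, char base, int quality) {
        bases.setCharAt(index, base);
        quals.set(index, quality);
    }

    // Insert a base (and its quality) before the given 0-based index.
    void insert(int index, char base, int quality) {
        bases.insert(index, base);
        quals.add(index, quality);
    }

    // Delete the base at a 0-based index together with its quality value.
    void delete(int index) {
        bases.deleteCharAt(index);
        quals.remove(index); // int argument, so this removes by position
    }

    // FASTA record: header line followed by the sequence on one line.
    String toFasta() {
        return ">" + name + "\n" + bases + "\n";
    }

    // Quality record: same header, then space-separated integer scores.
    String toQual() {
        StringBuilder sb = new StringBuilder(">" + name + "\n");
        for (int i = 0; i < quals.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(quals.get(i));
        }
        return sb.append("\n").toString();
    }

    public static void main(String[] args) {
        EditableRead read = new EditableRead("read1", "ACGT", new int[] {10, 20, 30, 40});
        read.replace(1, 'T', 99); // 99 is an arbitrary "manually edited" score
        read.delete(3);
        System.out.print(read.toFasta());
        System.out.print(read.toQual());
    }
}
```

Because every edit goes through one object, the sequence and the quality list cannot drift out of step, which is the main risk when the FASTA file is regenerated but the original quality file is reused.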
Many thanks, Umanga Russ Kepler wrote: > On Thursday 19 February 2009 22:48:45 Ashika Umanga Umagiliya wrote: > > >> My problem is can I use the same Quality files ,with the modified >> callbases ? >> > > Edit the quality data in parallel with the chromatogram. I would assume that > the editor is relatively sure of their edits, so I would give them a high > confidence level when I generate the trace fasta and quality file. > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From holland at eaglegenomics.com Fri Feb 20 03:43:23 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Fri, 20 Feb 2009 08:43:23 +0000 Subject: [Biojava-l] Regarding BioJava In-Reply-To: References: Message-ID: <499E6D2B.90904@eaglegenomics.com> Like any other Java software, you need to have all the BioJava jars on your classpath to run it from the command line. If you want to avoid having to do that, you can copy the jars into $JAVA_HOME/ext. Please read Sun's documentation on the Java programming language to learn more about the classpath. Richard Anant Jain wrote: > Greetings all, > > Do we have any BioJava runtime eviorment like jre coz > if i give a s/w to anybody then he also include all the jar files in his > class path. > > If i am compile and run my biojava program from my editplus then its working > but if i run it from command prompt then its giving error like unable to > load PDBFilereader.class. 
plz tel me how to run our program through cmd > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From ayates at ebi.ac.uk Fri Feb 20 06:02:52 2009 From: ayates at ebi.ac.uk (Andy Yates) Date: Fri, 20 Feb 2009 11:02:52 +0000 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499E5397.4060502@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> <499E443D.30206@biggjapan.com> <200902192316.46735.russ@kepler-eng.com> <499E5397.4060502@biggjapan.com> Message-ID: <499E8DDC.7090100@ebi.ac.uk> As far as I'm aware no there shouldn't be a problem with this solution. However have you investigated any other mechanisms for doing this kind of contig editing/assembly work? I know that people at the Sanger Centre use a program called gap4 which has powered just about every assembly they have done. It's available from: http://www.sanger.ac.uk/Software/production/staden/ (bit of info here) http://staden.sourceforge.net/staden_home.html Have a look at it as I feel that this is exactly what you are attempting to do. Andy Ashika Umanga Umagiliya wrote: > Cheers. > > > I still haven't implemented the editor.I assume for basecall editing, I > can use 'Edit' class. > > Like: > > ABIFChromatogram a; > .. > .. > a.getBaseCalls().edit(new Edit(.......)) > > > And for Quality values, I am hoping to read and store Qualitly files > generated by 'phred'.And when user edit the basecall, I am planing to > edit stored quality values accordinly. > Finally the fasta file is generated using > ABIFChromatogram.getBaseCalls()... > and Quality file will be generated using the structure I used above. > > Any issues with this approach? 
>
> Many thanks,
> Umanga
>
>
> Russ Kepler wrote:
>> On Thursday 19 February 2009 22:48:45 Ashika Umanga Umagiliya wrote:
>>
>>
>>> My problem is can I use the same Quality files ,with the modified
>>> callbases ?
>>>
>>
>> Edit the quality data in parallel with the chromatogram. I would
>> assume that the editor is relatively sure of their edits, so I would
>> give them a high confidence level when I generate the trace fasta and
>> quality file.
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>>
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l

From jeedward at yahoo.com Fri Feb 20 10:27:09 2009
From: jeedward at yahoo.com (John Edward)
Date: Fri, 20 Feb 2009 07:27:09 -0800 (PST)
Subject: [Biojava-l] Draft paper submission deadline extended: BCBGC-09
Message-ID: <483752.12181.qm@web45903.mail.sp1.yahoo.com>

Draft paper submission deadline extended: BCBGC-09

The deadline for draft paper submission at the 2009 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-09) (website: http://www.PromoteResearch.org ) is extended due to numerous requests from the authors. The conference will be held during July 13-16 2009 in Orlando, FL, USA. We invite draft paper submissions. The conference will take place at the same time and venue where several other international conferences are taking place. The other conferences include:
* International Conference on Artificial Intelligence and Pattern Recognition (AIPR-09)
* International Conference on Automation, Robotics and Control Systems (ARCS-09)
* International Conference on Enterprise Information Systems and Web Technologies (EISWT-09)
* International Conference on High Performance Computing, Networking and Communication Systems (HPCNCS-09)
* International Conference on Information Security and Privacy (ISP-09)
* International Conference on Recent Advances in Information Technology and Applications (RAITA-09)
* International Conference on Software Engineering Theory and Practice (SETP-09)
* International Conference on Theory and Applications of Computational Science (TACS-09)
* International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-09)

The website http://www.PromoteResearch.org contains more details.

Sincerely
John Edward
Publicity committee

From anantpossible at gmail.com Mon Feb 23 04:45:45 2009
From: anantpossible at gmail.com (Anant Jain)
Date: Mon, 23 Feb 2009 15:15:45 +0530
Subject: [Biojava-l] problem in alignment
Message-ID:

Greetings all,
i am aligning two protein sequences using BLOSUM62 substitution matrix but an runtime error is coming which is " this tokenization does not contain character ' * ' " some IllegalSymbolException

please suggest me solution or any substitution matrix

or do i have to change my sequence format.

plz rply

From andreas.draeger at uni-tuebingen.de Mon Feb 23 05:48:20 2009
From: andreas.draeger at uni-tuebingen.de (=?ISO-8859-1?Q?Andreas_Dr=E4ger?=)
Date: Mon, 23 Feb 2009 11:48:20 +0100
Subject: [Biojava-l] problem in alignment
In-Reply-To:
References:
Message-ID: <49A27EF4.1050109@uni-tuebingen.de>

Dear Anant Jain,

> i am aligning two protein sequences using BLOSUM62
> substitution matrix but an runtime error is coming which is " this
> tokenization does not contain character ' * ' " some IllegalSymbolException
>
> please suggest me solution or any substitution matrix
>
> or do i have to change my sequence format.

This problem occurs very frequently because in BioJava the * symbol (the termination symbol) belongs to the alphabet PROTEIN_TERM and not to the alphabet PROTEIN.
Please use the correct alphabet and you'll be fine.

Cheers
Andreas

From andreas.draeger at uni-tuebingen.de Mon Feb 23 15:48:54 2009
From: andreas.draeger at uni-tuebingen.de (Andreas =?iso-8859-1?b?RHLkZ2Vy?=)
Date: Mon, 23 Feb 2009 21:48:54 +0100
Subject: [Biojava-l] problem in alignment
In-Reply-To:
References: <49A27EF4.1050109@uni-tuebingen.de>
Message-ID: <20090223214854.959059s53gtvecvq@webmail.uni-tuebingen.de>

Dear Anant Jain,

Yeah, the problem is that the substitution matrices, e.g., BLOSUM 50, contain the *-symbol. So you'll definitely need the PROTEIN_TERM alphabet when doing protein alignments. Just try and let me know if it works.

Cheers
Andreas Dräger

Dipl.-Bioinform. Andreas Dräger
Eberhard Karls University Tübingen
Center for Bioinformatics (ZBIT)
Sand 1
72076 Tübingen
Germany
Phone: +49-7071-29-70436
Fax: +49-7071-29-5091

From andreas.draeger at uni-tuebingen.de Tue Feb 24 02:34:28 2009
From: andreas.draeger at uni-tuebingen.de (=?ISO-8859-1?Q?Andreas_Dr=E4ger?=)
Date: Tue, 24 Feb 2009 08:34:28 +0100
Subject: [Biojava-l] Regarding problen in Alignment
In-Reply-To:
References:
Message-ID: <49A3A304.4010004@uni-tuebingen.de>

Dear Anant Jain,

Sorry for this misunderstanding. There is no substitution matrix "PROTEIN_TERM". The constructor of the BioJava SubstitutionMatrix class requires the following arguments: FiniteAlphabet alpha and File matrixFile. As the alphabet you should use the alphabet "PROTEIN_TERM" for matrices of protein sequences like BLOSUM etc. In the BioJava repository there is a newer version of this class that provides a function to parse a matrix without explicitly providing an alphabet (it guesses the alphabet of the matrix).

I hope this helps.
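The point about the * symbol can be seen in the matrix files themselves: the column header of an NCBI-style matrix lists * next to the amino acids. Below is a minimal sketch, in plain Java and independent of BioJava, of reading that header to see which symbols an alphabet has to cover. `MatrixAlphabetSketch` and `headerSymbols` are hypothetical names, the three-symbol matrix is only a toy excerpt, and this is not a description of how BioJava's own alphabet guessing is implemented.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch, not BioJava code: collects the symbols named in the
// column header of an NCBI-style substitution matrix.
class MatrixAlphabetSketch {

    // Returns the symbols from the first line that is neither blank nor a '#' comment.
    static Set<Character> headerSymbols(String matrixText) {
        for (String line : matrixText.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comment lines
            }
            Set<Character> symbols = new LinkedHashSet<>();
            for (String token : trimmed.split("\\s+")) {
                symbols.add(token.charAt(0));
            }
            return symbols; // the first data line is the column header
        }
        return new LinkedHashSet<>();
    }

    public static void main(String[] args) {
        // Toy excerpt in the NCBI matrix layout, for illustration only.
        String matrix = "# toy excerpt in the NCBI matrix layout\n"
                + "   A  R  *\n"
                + "A  4 -1 -4\n"
                + "R -1  5 -4\n"
                + "* -4 -4  1\n";
        Set<Character> symbols = headerSymbols(matrix);
        System.out.println(symbols);
        System.out.println("needs termination symbol: " + symbols.contains('*'));
    }
}
```

If the header contains *, an alphabet without the termination symbol cannot tokenize the matrix, which matches the IllegalSymbolException reported earlier in this thread.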
Cheers
Andreas Dräger

Anant Jain wrote:
> Thank You Sir,
> I have searched through ftp/...blast/matrices,
> but did not got any file substitution matrix like PROTEIN_TERM, if u
> have can u send me it OR tell me link from where i can download the it
>
> Thank You
> Anant jain
> DYPBI
> Pune, India.
>

From sauloal at gmail.com Tue Feb 24 10:53:10 2009
From: sauloal at gmail.com (Saulo Alves)
Date: Tue, 24 Feb 2009 16:53:10 +0100
Subject: [Biojava-l] Sequence Annotation on Gui
Message-ID:

Hello,

I'm new here and in biojava and i have been struggling with this problem.
I have a chromosome with its gene annotations and i want to plot it in a GUI. I used the tutorial to make the basic setup and i can have a good picture of the chromosome and its genes position. The problem is: how to plot the name of each gene along the arrow which represents each gene?

thanks in advance,
S.

From andreas.draeger at uni-tuebingen.de Wed Feb 25 01:54:29 2009
From: andreas.draeger at uni-tuebingen.de (=?ISO-8859-1?Q?Andreas_Dr=E4ger?=)
Date: Wed, 25 Feb 2009 07:54:29 +0100
Subject: [Biojava-l] Regarding problen in Alignment
In-Reply-To:
References: <49A3A304.4010004@uni-tuebingen.de>
Message-ID: <49A4EB25.5070207@uni-tuebingen.de>

Anant Jain wrote:
> Thank Sir,
>
> So should i use this
>
> SubstituionMatrix matrix = new SubstitutionMatrix("PROTEIN_TERM",new
> File (BLOSUM62.50"));
>
>
> Anant Jain
> DYPBBI,
> Pune,India

Dear Anant Jain,

"PROTEIN_TERM" is a String and not a FiniteAlphabet. Please have a look at the following example, where the alphabet "DNA" is used. Just replace "DNA" by "PROTEIN_TERM" and it will work for you:

http://biojava.org/wiki/BioJava:CookBook:DP:PairWise2

Cheers
Andreas

From aumanga at biggjapan.com Wed Feb 25 02:44:24 2009
From: aumanga at biggjapan.com (Ashika Umanga Umagiliya)
Date: Wed, 25 Feb 2009 16:44:24 +0900
Subject: [Biojava-l] Biojava Parsers : Apply quality values for contig ?
Message-ID: <49A4F6D8.9010307@biggjapan.com> Greetings all, I am using 'phred/phrap' to assemble DNA sequences ,and 'phrap' generates contig file and a contig-quality files for an assembly. Now I want to parse these two files and generate final contig , by removing Bases with '0' quality values. For example : CGACTATG + 0 42 54 59 48 0 0 0 > _GACT____ Why I want to do this is; because only this "masking" will give the similar contig that of which generated by ChromasPro. I can use Fasta-parser to parse contig file.But I wonder whether theres anyway to handle parsing of Quality file in BioJava. Below I have give the structures of two file types: thanks in advance, Umanga contig file: ------------ >seqs_fasta.Contig1 TTGGAGAGTTTGATCCTGGCTCAGATTGAACGCTGGCGGCAGGCCTAACA CATGCAAGTCGAACGGTAACAGGAAGCAGCTTGCTGCTTTGCTGACGAGT GGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAAC TACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGG GACCTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTA GGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGA TGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCA GCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGC GTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAG GGAGTAAAGTTAATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCAC CGGCTAATTCCGTGCCAGCAGCCGCGGTAATATTNTTATTCTTTATGTAT ACATATTCTTTTTACTTTATTCTATTAAATTTATTCTTTCATAATTAAAC CTTCCCTTACACCCATTCCACCTCCCATCCCTCTTCCCCTCCCACTCTCC ATCTCATATGGCGTTCGCGCCTCTCTCTTCATCTCCTCCTATATTTATTC TAACTTCTTTCATCTCAATCATTTCTTCTGTCTCATCCTTCCATTCTTTC CATGATCTCCCCCATTGTCATGTCTTCAAAAAACCACACAAAACACTAGA ATCTTTTCTTATTACACACAAGTATATACAATTTTTAACAATCCATTAAA ACACACACAACACCTAGCAATCAACAACGCTACCATCCCCAATATTCTCT GTTCTCCTCTCTTTCTCCGCGTGCATCTGCGCACTACTCTCTAATTTCAT CTCTATTATCTTTTTTTCTTAACTCATCCGCATACATCCAAGACTCTAGA CCCATTTCTCGCCTCTTTCATTTACTGCCGATACAGAGCTTATAAATTCT ATATCATTTATCCACACTCATTATTAAATAGGCTGACACCTCTAACCGTC CACTACACCACCTTTCCCATGCCATCTCCCTAACACTGCACTCATCCGTA ACTTCCTACTCTACCCTCTCTTTCTTTCCTTACTTTCTTTTCTTTCTCTT ACATTTTTATTTAAAATTCCTCTTTTAGCCTCTATTTTCTGTTATCTACT 
TTTCTCCTAAATTCCCCCTATTCTTCACGTCCCATACCTATCCCTACCAC CACCACTACCACCCCTCTCTTCATTCTACTCGCTCTAAACCCTCCACCCT CCCCTCCTTGCTCTTATGTATCTCCTCATCTTTTAAT quality file ------------ >seqs_fasta.Contig1 0 23 23 33 33 33 33 33 31 41 47 47 47 47 47 47 47 50 47 47 57 59 59 59 42 42 35 42 42 54 59 48 48 48 48 48 48 54 57 57 57 57 57 54 54 57 54 54 54 74 74 74 74 59 57 57 57 57 72 72 84 76 73 72 72 72 79 81 74 74 62 50 50 50 59 39 43 32 35 32 43 58 44 48 70 70 58 73 55 69 67 87 87 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 89 90 90 90 90 90 90 90 85 87 87 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 77 77 77 81 81 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 87 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 74 74 85 90 90 90 90 90 90 90 90 90 90 90 90 83 83 90 90 90 90 90 90 90 75 83 83 89 89 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 72 72 72 57 57 43 37 37 43 72 72 72 72 72 72 90 90 90 90 90 90 90 86 90 90 90 90 90 90 90 79 85 83 90 90 90 89 87 87 90 90 90 90 67 67 79 78 90 86 88 82 73 68 65 61 59 63 62 68 71 72 59 56 41 35 30 30 28 32 41 47 40 56 49 42 49 51 50 37 37 39 39 37 52 54 51 46 20 20 27 24 32 24 20 20 21 24 16 19 19 33 29 22 23 12 11 11 12 20 23 40 32 31 28 22 13 13 18 26 28 28 34 28 25 24 28 23 26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -- ??? ???? ????? ???????????????????BiGG) ?140-0001 ?????????3-6-9 ??????8F TEL:03-6679-8763 FAX:03-6679-8764 From watson at ebi.ac.uk Wed Feb 25 06:07:49 2009 From: watson at ebi.ac.uk (James Watson) Date: Wed, 25 Feb 2009 11:07:49 +0000 Subject: [Biojava-l] Java programmatic access course at the EMBL-EBI Message-ID: <49A52685.7090201@ebi.ac.uk> We have a hands-on training course that might be of interest being run here at the European Bioinformatics Institute. 
Further details can be found at http://www.ebi.ac.uk/training/handson/

Title: Programmatic Access: to biological databases (Java)
Date: 27-29 April 2009
Venue: EMBL-EBI, Hinxton, Nr Cambridge, CB10 1SD, UK
Organisers: Samuel Patient & James Watson
Registration Deadline: 30th March 2009 - 12 noon (GMT)
Cost: £50.00 (no travel or accommodation included)

James Watson

--
James D Watson
Scientific Training Officer
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Tel: +44(0)1223 492541
http://www.ebi.ac.uk/training/

Upcoming hands on training courses (http://www.ebi.ac.uk/training/handson/):
16-18 March 2009: Sequence to Genes - Genome Informatics
27-29 April 2009: Programmatic access to biological databases
11-15 May 2009: A Walkthrough EBI Bioinformatics Resources

From koen.bruynseels at cropdesign.com Wed Feb 25 06:18:32 2009
From: koen.bruynseels at cropdesign.com (koen.bruynseels at cropdesign.com)
Date: Wed, 25 Feb 2009 12:18:32 +0100
Subject: [Biojava-l] Koen Bruynseels is out of the office.
Message-ID:

I will be out of the office starting 02/24/2009 and will not return until 03/02/2009.

I will respond to your message when I return.

From andreas at sdsc.edu Wed Feb 25 12:54:06 2009
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 25 Feb 2009 09:54:06 -0800
Subject: [Biojava-l] biojava 1.7 release schedule
Message-ID: <59a41c430902250954k4696f868hb5b8b50fa247a03a@mail.gmail.com>

Hi

I would like to propose the following release plan for biojava 1.7:

* the next couple of weeks: commit missing patches, write junit tests and improve documentation (everybody with write access)
* Wed. April 8th: code freeze (declared on biojava-dev by me) final checks. at this point all unit tests should pass without problems
* Sat.
April 11th : I will branch the svn, copy the release files on the biojava site, and write the announce email Andreas From andreas.prlic at gmail.com Thu Feb 26 11:16:56 2009 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Thu, 26 Feb 2009 08:16:56 -0800 Subject: [Biojava-l] BioJava and backbone amino acids In-Reply-To: <49A63722.40002@wp.pl> References: <49A63722.40002@wp.pl> Message-ID: <59a41c430902260816t1280caa1tb770a174d32fcc03@mail.gmail.com> Hi Michal, You could use the Calc class to calculate all the distances of Atoms that are in proximity of a the ligand. Andreas On Wed, Feb 25, 2009 at 10:30 PM, Michal Lorenc wrote: > Dear Andreas, > do you know how it is possible to find backbone amino acids around the > ligand with BioJava or do you know another software? > > Thank you in advance. > > Best regards, > > Michal > From bopfannkuche at gmx.de Fri Feb 27 05:36:38 2009 From: bopfannkuche at gmx.de (=?ISO-8859-15?Q?Bj=F6rn_Ole_Pfannkuche?=) Date: Fri, 27 Feb 2009 11:36:38 +0100 Subject: [Biojava-l] Zuker Algorithm Message-ID: <49A7C236.6090807@gmx.de> Hello! I' m reading this mailinglist for quite a while and I often learned very nice tricks on different stuff *G* Now I have a little problem/question of my own: Has anyone implemented the Zuker-Algorithm for RNA Folding so far? I need a Java Version for a software project and due to the fact that all of you are developing software I hope that I do not have to implement it again in the case one of you have already done. If someone can help me I would be very delighted, thanks in advance! 
Björn

From andreas.prlic at gmail.com Fri Feb 27 22:15:44 2009
From: andreas.prlic at gmail.com (Andreas Prlic)
Date: Fri, 27 Feb 2009 19:15:44 -0800
Subject: [Biojava-l] BioJava and backbone amino acids
In-Reply-To: <49A79EAD.7030000@wp.pl>
References: <49A63722.40002@wp.pl> <59a41c430902260816t1280caa1tb770a174d32fcc03@mail.gmail.com> <49A79EAD.7030000@wp.pl>
Message-ID: <59a41c430902271915l73a05533lc77d5dc69e439480@mail.gmail.com>

Hi Michal,

you can get the backbone atoms e.g. by using the StructureTools class:

StructureTools.getBackboneAtomArray(Structure s)

http://www.biojava.org/docs/api/org/biojava/bio/structure/StructureTools.html

Andreas

On Fri, Feb 27, 2009 at 12:05 AM, Michal Lorenc wrote:
> Hi Andreas,
> Thank you for your email. But how can I find backbone amino acids?
>
> Thank you in advance.
>
> Best regards,
>
> Michal
>
> Andreas Prlic wrote:
>>
>> Hi Michal,
>>
>> You could use the Calc class to calculate all the distances of Atoms
>> that are in proximity of the ligand.
>>
>> Andreas
>>
>>
>> On Wed, Feb 25, 2009 at 10:30 PM, Michal Lorenc wrote:
>>>
>>> Dear Andreas,
>>> do you know how it is possible to find backbone amino acids around the
>>> ligand with BioJava or do you know another software?
>>>
>>> Thank you in advance.
>>>
>>> Best regards,
>>>
>>> Michal
>>>
>>
>>
>

From raphael.andre.bauer at gmail.com Tue Feb 3 19:01:09 2009
From: raphael.andre.bauer at gmail.com (Raphael André Bauer)
Date: Tue, 3 Feb 2009 20:01:09 +0100
Subject: [Biojava-l] Different chain lengths using struc.getChainByPDB and iterating over a structure and it's chain
In-Reply-To: <1CAC30DC-06D7-4B0D-8799-0692C9591B11@gmail.com>
References: <9b46aa30902030128nfea24f5x94e7882a8d80e542@mail.gmail.com> <1CAC30DC-06D7-4B0D-8799-0692C9591B11@gmail.com>
Message-ID: <9b46aa30902031101r77aa6236qc6e92c4140f39ca8@mail.gmail.com>

On Tue, Feb 3, 2009 at 4:30 PM, Andreas Prlic wrote:
...
>> 1) If I use
>>
>> //struc being the parsed pdb file
>> returnChain = struc.getChainByPDB("A", 1);
>>
>> I get chain A EXCEPT everything after the TER (ZN and ADE is excluded
>> in this example).
>
> This seems wrong. You should get the full chain here. TERs are getting
> ignored. Let me check what might cause this. You are on the latest version
> from svn?

Thanks for the quick response. And sorry for having bothered you - after writing a quick unit test it became pretty clear that it is my own fault. I thought I double checked that yesterday (and I should write the unit test before writing to the list - I know).

The "supposed-to-be-bug-but-was-my-own-failure" came from setParseCAOnly().

Sorry and thanks again!

Raphael

From aumanga at biggjapan.com Thu Feb 5 05:57:37 2009
From: aumanga at biggjapan.com (Ashika Umanga Umagiliya)
Date: Thu, 05 Feb 2009 14:57:37 +0900
Subject: [Biojava-l] [Off the Topic] Homology prediction of large number of sequences!
Message-ID: <498A7FD1.2060203@biggjapan.com> Greetings all, I have about 60,000 sequences and want to predict the 3D structure (Homology modelling) for all the structures.I decided to use Modeller first , and seems that would take months to predict all those structures. Is there any HighPerformance prediction server which I have use for this? Not submitting data through webbased interface , I want to automate the process , cuz 60K is alot of data! Thanks in advance, umanga From markjschreiber at gmail.com Thu Feb 5 13:51:39 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Thu, 5 Feb 2009 21:51:39 +0800 Subject: [Biojava-l] [Off the Topic] Homology prediction of large number of sequences! In-Reply-To: <498A7FD1.2060203@biggjapan.com> References: <498A7FD1.2060203@biggjapan.com> Message-ID: <93b45ca50902050551s11574e7br9526f08ddae156f4@mail.gmail.com> I don't know about high performance but the problem is very parallel so if you can get your hands on a cluster you can run one sequence on each processor. Maybe there is a cloud compute facility you could use if a cluster is not available. You might also want to try a pilot experiment first. There is bound to be some redundancy in some of those proteins which will give nearly identical models. - Mark On Thu, Feb 5, 2009 at 1:57 PM, Ashika Umanga Umagiliya wrote: > > Greetings all, > > I have about 60,000 sequences and want to predict the 3D structure (Homology modelling) for all the structures.I decided to use Modeller first , and seems that would take months to predict all those structures. > Is there any HighPerformance prediction server which I have use for this? Not submitting data through webbased interface , I want to automate the process , cuz 60K is alot of data! 
> > Thanks in advance, > umanga > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From willishf at ufl.edu Fri Feb 6 19:39:17 2009 From: willishf at ufl.edu (Scooter Willis) Date: Fri, 6 Feb 2009 14:39:17 -0500 Subject: [Biojava-l] Phylogenetic Trees Message-ID: <7ceb4beb0902061139l6d49add6qb592cd389b0e176b@mail.gmail.com> I was looking for Java code to construct a phylogenetic tree from aligned sequence data. I came across org.biojavax.bio.phylo as an experimental package in the latest release of BioJava. The docs were thin and couldn't find any code examples so was wondering if anyone has used the package with success. I also couldn't decide if the package is intended to load pre-constructed trees as Java objects or could be used to construct a tree from aligned sequence data. Does anyone know of an accepted Java package for tree construction besides what is in JalView? Thanks Scooter From simon.rayner.cn at gmail.com Sun Feb 8 09:45:46 2009 From: simon.rayner.cn at gmail.com (simon rayner) Date: Sun, 8 Feb 2009 04:45:46 -0500 Subject: [Biojava-l] alignments Message-ID: <616a29410902080145m2ea0fe0fye091ac646071ac13@mail.gmail.com> Hi, i have a question about Alignment and FlexibleAlignment objects. Basically, i have a sequence alignment in FASTA format so followed a code sample i found from a previous posted query: BufferedReader br = new BufferedReader(new FileReader("file.txt")); FastaAlignmentFormat faf = new FastaAlignmentFormat(); Alignment aligned = faf.read( br ); so now i have an alignment stored in an Alignment object. 1. I do something with these sequences using the access methods for the Alignment object. something like java.util.Iterator iter = l.listIterator(); while(iter.hasNext()) { String currLabel = (String) iter.next(); String currAA = aligned.symbolListForLabel(currLabel) .seqString().substring(base, base+1); } 2. 
Now i want to do the same analysis but on a random sample of sequences from this alignment. So to get a subset of these sequences i created a FlexibleAlignment and grabbed the sequences i need from the original alignment... Alignment al; Iterator l = al.symbolListIterator(); java.util.List names = al.getLabels(); Iterator n = names.listIterator(); java.util.ArrayList seqList = new java.util.ArrayList(1); String refs = ""; while(refs.length() < al.length()) { refs = refs.concat("x"); } try{ FlexibleAlignment FA = new FlexibleAlignment(seqList); SymbolList aa = ProteinTools.createProtein(refs); seqList.add( new SimpleAlignmentElement( "reference", aa, new RangeLocation(1, al.length())) ); while (l.hasNext()) { try{ SymbolList sal = (SymbolList)l.next(); String name = (String)n.next(); FA.addSequence(new SimpleAlignmentElement(name, sal, new RangeLocation(1, al.length()))); } catch (org.biojava.bio.BioException ex) { System.out.print("couldn't add sequence to alignment\n"); System.err.print(ex); } } this.aligned = FA; } catch (org.biojava.bio.BioException ex) { System.out.print("couldn't add sequence to alignment\n"); System.err.print(ex); } So now i have my alignment of random sequences in a FlexibleAlignment object, but my original analysis was written for an Alignment object (because that's the format that FastaAlignmentFormat spat back at me. So, can i cast a FlexibleAlignment object into an Alignment object? or am i just going about it the wrong way? p.s. Why are there so many diverse Alignment related objects? (AbstractULAlignment, AbstractULAlignment.SubULAlignment, FlexibleAlignment, RelabeledAlignment, EmptyPairwiseAlignment, SimpleAlignment, HappyAlignment, SadAlignment..) Is there any documentation (other than the standard java docs) to clarify which object when? thanks! 
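
For the random-sample step in point 2, here is a minimal sketch in plain Java. The loop as posted walks every sequence of the original alignment, so the subset has to be chosen beforehand. The sketch assumes the labels are available as a list of strings (as getLabels() is used above); LabelSampler and sample are hypothetical names for illustration and are not part of BioJava.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of BioJava: chooses which labels to keep.
public class LabelSampler {

    /**
     * Returns n labels drawn at random, without replacement, from the given list.
     * The input list is left untouched; a fixed seed makes the choice repeatable.
     */
    public static List<String> sample(List<String> labels, int n, long seed) {
        if (n < 0 || n > labels.size()) {
            throw new IllegalArgumentException("n must be between 0 and " + labels.size());
        }
        // Work on a copy so the caller's ordering survives the shuffle.
        List<String> shuffled = new ArrayList<String>(labels);
        Collections.shuffle(shuffled, new Random(seed));
        // The first n entries of a shuffled list are a uniform random subset.
        return new ArrayList<String>(shuffled.subList(0, n));
    }

    public static void main(String[] args) {
        // Stand-in labels; in the real program these would come from the alignment.
        List<String> labels = Arrays.asList("seq1", "seq2", "seq3", "seq4", "seq5");
        List<String> chosen = sample(labels, 3, 42L);
        System.out.println("Chosen labels: " + chosen);
    }
}
```

Each chosen label could then be looked up with symbolListForLabel(label), as in point 1, and only those sequences added to the new alignment. Since FlexibleAlignment implements the Alignment interface, the result can be handed to code written against Alignment without a cast.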
From holland at eaglegenomics.com Mon Feb 9 14:43:03 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 09 Feb 2009 14:43:03 +0000 Subject: [Biojava-l] Phylogenetic Trees In-Reply-To: <7ceb4beb0902061139l6d49add6qb592cd389b0e176b@mail.gmail.com> References: <7ceb4beb0902061139l6d49add6qb592cd389b0e176b@mail.gmail.com> Message-ID: <499040F7.3050009@eaglegenomics.com> The package you found in biojavax was written with the intention of parsing existing tree structures from NEXUS files, and to perform basic analyses of them (neighbour joining, etc.). The code can represent trees regardless of their source, and can output them as NEXUS files, but there are no algorithm implementations in that package for constructing new trees and so you'd probably have to write those from scratch. cheers, Richard Scooter Willis wrote: > I was looking for Java code to construct a phylogenetic tree from > aligned sequence data. I came across org.biojavax.bio.phylo as an > experimental package in the latest release of BioJava. The docs were > thin and couldn't find any code examples so was wondering if anyone > has used the package with success. > > I also couldn't decide if the package is intended to load > pre-constructed trees as Java objects or could be used to construct a > tree from aligned sequence data. > > Does anyone know of an accepted Java package for tree construction > besides what is in JalView? 
> > Thanks > > Scooter > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From holland at eaglegenomics.com Mon Feb 9 14:47:45 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Mon, 09 Feb 2009 14:47:45 +0000 Subject: [Biojava-l] alignments In-Reply-To: <616a29410902080145m2ea0fe0fye091ac646071ac13@mail.gmail.com> References: <616a29410902080145m2ea0fe0fye091ac646071ac13@mail.gmail.com> Message-ID: <49904211.7010404@eaglegenomics.com> Yes, you can cast it back. In fact you don't need to actually cast it at all, as Java will do that for you automatically. FlexibleAlignment is an implementation of the Alignment interface, which defines general behaviour for all alignments. If your algorithm is written to accept a parameter of type Alignment, then it will accept any object which implements the Alignment interface, including FlexibleAlignment, Happy and SadAlignment, etc. etc. Unfortunately I'm not sure why there are so many different kinds of alignment objects. I think it was something to do with wanting to make some that are modifiable, and others that are read-only, then making different kinds that store the underlying alignment data in different/more efficient/faster/cleverer representations. It does rather over-complicate things, however if your algorithm only requires Alignment as the type of its parameter, then you can safely use all and any of these as they all implement the Alignment interface. cheers, Richard simon rayner wrote: > Hi, > > i have a question about Alignment and FlexibleAlignment objects. 
> Basically, i have a sequence alignment in FASTA format so followed a code > sample i found from a previous posted query: > > BufferedReader br = new BufferedReader(new FileReader("file.txt")); > FastaAlignmentFormat faf = new FastaAlignmentFormat(); > Alignment aligned = faf.read( br ); > > so now i have an alignment stored in an Alignment object. > > 1. I do something with these sequences using the access methods for > the Alignment object. > something like > > java.util.Iterator iter = l.listIterator(); > while(iter.hasNext()) > { > String currLabel = (String) iter.next(); > String currAA > = aligned.symbolListForLabel(currLabel) > .seqString().substring(base, base+1); > } > > 2. Now i want to do the same analysis but on a random sample of > sequences from this alignment. > So to get a subset of these sequences i created a FlexibleAlignment > and grabbed the sequences > i need from the original alignment... > > Alignment al; > Iterator l = al.symbolListIterator(); > java.util.List names = al.getLabels(); > Iterator n = names.listIterator(); > java.util.ArrayList seqList = new java.util.ArrayList(1); > String refs = ""; > while(refs.length() < al.length()) > { > refs = refs.concat("x"); > } > > try{ > FlexibleAlignment FA = new FlexibleAlignment(seqList); > SymbolList aa = ProteinTools.createProtein(refs); > > seqList.add( > new SimpleAlignmentElement( > "reference", > aa, > new RangeLocation(1, al.length())) > ); > while (l.hasNext()) > { > try{ > SymbolList sal = (SymbolList)l.next(); > String name = (String)n.next(); > > FA.addSequence(new SimpleAlignmentElement(name, > sal, > new RangeLocation(1, al.length()))); > > } > catch (org.biojava.bio.BioException ex) > { > System.out.print("couldn't add sequence to alignment\n"); > System.err.print(ex); > } > } > this.aligned = FA; > } > catch (org.biojava.bio.BioException ex) > { > System.out.print("couldn't add sequence to alignment\n"); > System.err.print(ex); > } > > So now i have my alignment of random 
sequences in a FlexibleAlignment > object, but my original analysis > was written for an Alignment object (because that's the format that > FastaAlignmentFormat spat back at me. > > So, can i cast a FlexibleAlignment object into an Alignment object? or > am i just going about it the wrong way? > > p.s. Why are there so many diverse Alignment related objects? > (AbstractULAlignment, > AbstractULAlignment.SubULAlignment, FlexibleAlignment, RelabeledAlignment, > EmptyPairwiseAlignment, SimpleAlignment, HappyAlignment, > SadAlignment..) Is there any documentation (other than > the standard java docs) to clarify which object when? > > thanks! > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd M: +44 7500 438846 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From aumanga at biggjapan.com Tue Feb 10 01:27:34 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Tue, 10 Feb 2009 10:27:34 +0900 Subject: [Biojava-l] [Off the topic] Selecting template from clustering tree? Message-ID: <4990D806.2030204@biggjapan.com> Greetings, I use 'Modeller' for homology modeling and, during the process I get a clustering tree shown as following.And I have to select the best template using this tree.I was wondering whether there are any tools/features in BioJava that I can use for this? I don't know the conditions to select the best template (I am from Computer science background),but seems I can you fuzzylogic ,if I know the criteria well. Any tips? Best Regards, umanga ------------------------------------------------------------------------------------------------- Sequence identity comparison (ID_TABLE): Diagonal ... number of residues; Upper triangle ... number of identical residues; Lower triangle ... % sequence identity, id/min(length). 

         1b8pA @1 1bdmA @1 1civA @2 5mdhA @2 7mdhA @2 1smkA @2
1b8pA @1      327      194      147      151      153       49
1bdmA @1       61      318      152      167      155       56
1civA @2       45       48      374      139      304       53
5mdhA @2       46       53       42      333      139       57
7mdhA @2       47       49       87       42      351       48
1smkA @2       16       18       17       18       15      313

Weighted pair-group average clustering based on a distance matrix:

                                        .----------------------- 1b8pA @1.9    39.0000
                                        |
                               .-------------------------------- 1bdmA @1.8    50.5000
                               |
                           .------------------------------------ 5mdhA @2.4    55.3750
                           |
                           |                                .--- 1civA @2.8    13.0000
                           |                                |
     .---------------------------------------------------------- 7mdhA @2.4    83.2500
     |
   .------------------------------------------------------------ 1smkA @2.5

   +----+----+----+----+----+----+----+----+----+----+----+----+
86.0600   73.4150   60.7700   48.1250   35.4800   22.8350   10.1900
     79.7375   67.0925   54.4475   41.8025   29.1575   16.5125

From bopfannkuche at gmx.de Thu Feb 12 12:06:36 2009
From: bopfannkuche at gmx.de (Björn Ole Pfannkuche)
Date: Thu, 12 Feb 2009 13:06:36 +0100
Subject: [Biojava-l] DNA/DNA Folding
Message-ID: <499410CC.8090704@gmx.de>

Hi!

Has anyone of you written a program or classes that aim at folding DNA? Like the projects UNAFold or PairFold, but in Java...

Greetz
BO

From willishf at ufl.edu Thu Feb 12 12:13:57 2009
From: willishf at ufl.edu (Scooter Willis)
Date: Thu, 12 Feb 2009 07:13:57 -0500
Subject: [Biojava-l] Multiple Sequence Viewer
Message-ID: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>

I am getting my feet wet with the various parts and pieces of BioJava and needed a multiple sequence viewer. I searched around in the code and did some Google searches, and it looks like one does not exist, but someone faked one with a MultipleLineReader to render each sequence as a label.

As part of the learning exercise I cloned SequencePanel to MultipleSequencePanel and SequenceRenderContext to MultipleSequenceRenderContext (to define returning a collection of sequences) and SymbolSequenceRenderer to SymbolMultipleSequenceRenderer.
In SymbolMultipleSequenceRenderer I handle the rendering of multiple sequences. I need to add in labels as a non scroll region on the left and then add support for horizontal and vertical scroll. Before I finish the last 10% of what I think need to be done and always takes the longest does a multiple sequence viewer exist in BioJava and I can't find it? Thanks Scooter Willis From russ at kepler-eng.com Thu Feb 12 14:31:39 2009 From: russ at kepler-eng.com (Russ Kepler) Date: Thu, 12 Feb 2009 07:31:39 -0700 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> Message-ID: <200902120731.39355.russ@kepler-eng.com> On Thursday 12 February 2009 05:13:57 Scooter Willis wrote: > Before I finish the last 10% of what I think need to be done and always > takes the longest does a multiple sequence viewer exist in BioJava and I > can't find it? In mine I simply have a TranslatedSequencePanel with a MultiLineRenderer, the MLR having a renderer for each sequence or sequence element). This gave me the control over the individual sequence display I needed for cursors and such. From willishf at ufl.edu Thu Feb 12 15:20:55 2009 From: willishf at ufl.edu (Scooter Willis) Date: Thu, 12 Feb 2009 10:20:55 -0500 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <200902120731.39355.russ@kepler-eng.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> <200902120731.39355.russ@kepler-eng.com> Message-ID: <7ceb4beb0902120720y7d9dad85sb232575e9f3b4099@mail.gmail.com> My plan is to model the multiple sequence viewer used in JALView. Sequence name on the left to fire events when clicked. I also typically work with large sequences so need both horizontal and vertical scroll which does not appear to be in the current viewers. 
I do see support for viewing a range of a sequence but nothing to scroll or I could be missing something on how the current viewer would scroll not using a ScrollPane. This should be something easy to do in BioJava based on the importance of viewing multiple sequence alignments. Thanks Scooter Willis On Thu, Feb 12, 2009 at 9:31 AM, Russ Kepler wrote: > On Thursday 12 February 2009 05:13:57 Scooter Willis wrote: > > > Before I finish the last 10% of what I think need to be done and always > > takes the longest does a multiple sequence viewer exist in BioJava and I > > can't find it? > > In mine I simply have a TranslatedSequencePanel with a MultiLineRenderer, > the > MLR having a renderer for each sequence or sequence element). This gave me > the control over the individual sequence display I needed for cursors and > such. > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From russ at kepler-eng.com Thu Feb 12 15:55:19 2009 From: russ at kepler-eng.com (Russ Kepler) Date: Thu, 12 Feb 2009 08:55:19 -0700 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <7ceb4beb0902120720y7d9dad85sb232575e9f3b4099@mail.gmail.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> <200902120731.39355.russ@kepler-eng.com> <7ceb4beb0902120720y7d9dad85sb232575e9f3b4099@mail.gmail.com> Message-ID: <200902120855.20480.russ@kepler-eng.com> On Thursday 12 February 2009 08:20:55 Scooter Willis wrote: > My plan is to model the multiple sequence viewer used in JALView. Sequence > name on the left to fire events when clicked. I also typically work with > large sequences so need both horizontal and vertical scroll which does not > appear to be in the current viewers. I do see support for viewing a range > of a sequence but nothing to scroll or I could be missing something on how > the current viewer would scroll not using a ScrollPane. 
I guess I don't see the problem with using a ScrollPane, it works well in the app I've written. > This should be something easy to do in BioJava based on the importance of > viewing multiple sequence alignments. It not all *that* hard, in my case it's a ScrollPane with a TranslatedSequencePanel with a MultiLineRenderer with a bunch of GappedRenderers containing individual sequence or other renderers. The TSP holds the alignment and feeds the individual renderers with the appropriate sequence. Somewhere around here I have something I scratched up in BioJava 1.4 that worked as a Clustal alignment viewer, I'll see if I can find it and pop it your way. From hlapp at gmx.net Fri Feb 13 16:53:53 2009 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Feb 2009 11:53:53 -0500 Subject: [Biojava-l] Google Summer of Code: Call for Bio* Volunteers Message-ID: <1F570555-12DF-42DF-8D0E-95AAE298D76A@gmx.net> Google is committed to run the Summer of Code program [1] again this year. It will be for the 5th time. In broad strokes, the program funds what you might call remote summer internships for students to contribute to an open-source software project. Participating projects (or umbrella organizations) provide project ideas and supply mentors that guide the work on those. Students apply to a project within the program with specific project ideas, based on those suggested or based on their own idea, get ranked by the mentors of the project, and those accepted into the program get paired up with mentors. Projects are chiefly about programming, the coding period is 3 months (Jun-Aug), and there is no travel required by either student or mentor. The program is global; other than the US trade restrictions that Google is under, there are no restrictions as to where student or mentor reside. The main motivations behind the program are to recruit new contributors to open-source projects, and to produce more open-source code. See the program FAQs [2] for more information. 
I've had the honor of being part of the program for the last two years, administering NESCent's participation as an organization [3] and in 2007 mentoring a student. I have to say I find it the most awesome open-source program since sliced bread (or the invention of BLAST if that means more to you). Despite that and sadly enough, there has been a dearth of participating bioinformatics projects (though some notable ones, such as CytoScape have participated). There have been two Bio* Summer of Code projects under the NESCent umbrella, one in 2007 [4] and one in 2008 [5]. I would be willing to volunteer to take the lead on and administer a full-blown participation of O|B|F as a Bio* umbrella organization, provided 1) at least one Bio* person volunteers to serve as backup administrator, and 2) enough Bio* contributors volunteer to serve as prospective mentors. Mentoring involves participating in creating the page of project ideas (I'd provide template and guidance), corresponding with applicants who have questions, participating in student application ranking, and for primary mentors (those directly assigned to a student) based on empirical evidence at least 5hrs/week of time spent with the student to help him/her get over obstacles or avoid wrong paths. I think almost all mentors would concur that the experience was very gratifying, but as a mentor you will be spending a non-negligible amount of time with the student. I think it is the student-mentor pairing and interaction, not the stipend, that in the end makes the participation for students uniquely productive in terms of learning, and different from simply contributing to the project of choice (which they could always do). For a personal impression for how the program is from a mentor perspective, I'll let Chris Fields speak who was the mentor for the 2008 phyloXML in BioPerl project. 
From a student's perspective, I'll leave it to the 2007 Biojava student Bohyun Lee (blee34-at-mail.gatech.edu) and the 2008 BioPerl student Mira Han (mirhan-at-indiana.edu) to comment (if they are still on the list).

So if you think this is a good idea for Bio* to be part of, if you would like to help in some way, if you can see yourself as a mentor, or if you are a lurking would-be student, please let yourself be heard. Email either to the list or to me.

Cheers,

-hilmar

[1] http://code.google.com/soc/2008
[2] http://code.google.com/opensource/gsoc/2009/faqs.html
[3] http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2007
    http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2008
[4] http://biojava.org/wiki/BioJava:PhyloSOC07
[5] http://bioperl.org/wiki/PhyloXML_support_in_BioPerl

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From jeedward at yahoo.com Sat Feb 14 00:23:48 2009
From: jeedward at yahoo.com (John Edward)
Date: Fri, 13 Feb 2009 16:23:48 -0800 (PST)
Subject: [Biojava-l] Draft paper submission deadline extended: BCBGC-09
Message-ID: <421505.231.qm@web45914.mail.sp1.yahoo.com>

Draft paper submission deadline extended: BCBGC-09

The deadline for draft paper submission at the 2009 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-09) (website: http://www.PromoteResearch.org ) is extended due to numerous requests from the authors. The conference will be held during July 13-16 2009 in Orlando, FL, USA. We invite draft paper submissions. The conference will take place at the same time and venue where several other international conferences are taking place. The other conferences include:

- International Conference on Artificial Intelligence and Pattern Recognition (AIPR-09)
- International Conference on Automation, Robotics and Control Systems (ARCS-09)
- International Conference on Enterprise Information Systems and Web Technologies (EISWT-09)
- International Conference on High Performance Computing, Networking and Communication Systems (HPCNCS-09)
- International Conference on Information Security and Privacy (ISP-09)
- International Conference on Recent Advances in Information Technology and Applications (RAITA-09)
- International Conference on Software Engineering Theory and Practice (SETP-09)
- International Conference on Theory and Applications of Computational Science (TACS-09)
- International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-09)

The website http://www.PromoteResearch.org contains more details.

Sincerely
John Edward
Publicity committee

From anantpossible at gmail.com Sat Feb 14 05:33:25 2009
From: anantpossible at gmail.com (Anant Jain)
Date: Sat, 14 Feb 2009 11:03:25 +0530
Subject: [Biojava-l] Multiple Sequence Viewer
In-Reply-To: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>
References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>
Message-ID:

Good Morning,
I am student of B.Tech Bioinformatics (4th year) & learning Biojava from Internet. Can you sent me some important pdf documents on BioJava.
Thank You
ANANT JAIN
DYPBBI, PUNE, INDIA

From holland at eaglegenomics.com Sat Feb 14 06:39:40 2009
From: holland at eaglegenomics.com (Richard Holland)
Date: Sat, 14 Feb 2009 06:39:40 +0000
Subject: [Biojava-l] Multiple Sequence Viewer
In-Reply-To:
References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com>
Message-ID: <4996672C.2050907@eaglegenomics.com>

Everything you need to know to get started is on our website:

http://www.biojava.org/

thanks,
Richard

Anant Jain wrote:
> Good Morning,
> I am student of B.Tech Bioinformatics (4th year) & learning Biojava from
> Internet.
Can you sent me some important pdf documents on BioJava. > Thank You > ANANT JAIN > DYPBBI, PUNE, INDIA > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From markjschreiber at gmail.com Sat Feb 14 07:17:48 2009 From: markjschreiber at gmail.com (Mark Schreiber) Date: Sat, 14 Feb 2009 15:17:48 +0800 Subject: [Biojava-l] Multiple Sequence Viewer In-Reply-To: <4996672C.2050907@eaglegenomics.com> References: <7ceb4beb0902120413k2b380e84r10f4eae613971837@mail.gmail.com> <4996672C.2050907@eaglegenomics.com> Message-ID: <93b45ca50902132317s34fbbab6pacd9f68080e10940@mail.gmail.com> Also look at the recent publication (http://dx.doi.org/10.1093/bioinformatics/btn397) - Mark On Sat, Feb 14, 2009 at 2:39 PM, Richard Holland wrote: > > Everything you need to know to get started is on our website: > > http://www.biojava.org/ > > thanks, > Richard > > Anant Jain wrote: > > Good Morning, > > I am student of B.Tech Bioinformatics (4th year) & learning Biojava from > > Internet. Can you sent me some important pdf documents on BioJava. 
> > Thank You > > ANANT JAIN > > DYPBBI, PUNE, INDIA > > _______________________________________________ > > Biojava-l mailing list - Biojava-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biojava-l > > > > -- > Richard Holland, BSc MBCS > Finance Director, Eagle Genomics Ltd > T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com > http://www.eaglegenomics.com/ > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From anantpossible at gmail.com Tue Feb 17 05:55:00 2009 From: anantpossible at gmail.com (Anant Jain) Date: Tue, 17 Feb 2009 11:25:00 +0530 Subject: [Biojava-l] A little problem Message-ID: Good Morning, I want to retrieve a DNA sequence from a GenBank file, so I am using the SeqIOTools.readGenbank(br) method. There is one flaw: the tutorial example says that it will return a sequence, but it returns a SequenceIterator. That's not a problem, we can get the sequence using the nextSequence() method. Now, I want to print the sequence which I have got from the GenBank file, so I used a for loop (below) for (int pos = 1;pos References: Message-ID: <499A7C65.5040401@eaglegenomics.com> To convert the sequence into a String, use the seqString() method of the Sequence object. Also you should take a look at the replacement for SeqIOTools - RichSequence.IOTools - as this is more up-to-date and handles file parsing more sensibly. cheers, Richard Anant Jain wrote: > > Good Morning, > I want to retrieve a DNA sequence from a GenBank file, so I am using the > SeqIOTools.readGenbank(br) method. There is one flaw: the tutorial > example says that it will return a sequence, but it returns a > SequenceIterator. > That's not a problem, we can get the sequence using the nextSequence() method.
> > Now, i want to print the sequence which i have got from genbank file,so > i used a for loop (below) > > for (int pos = 1;pos { > > System.out.println(seq.symbolAt(pos)); > > } > > > > Problem 1: It prints the nucleotides in random order, means i want it > from begining to last as i did in loop > plz help me out, because i want to write the sequence in file > > > > Thank You > Anant Jain > PUNE, INDIA -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From crackeur at comcast.net Thu Feb 19 03:25:13 2009 From: crackeur at comcast.net (crackeur at comcast.net) Date: Thu, 19 Feb 2009 03:25:13 +0000 (UTC) Subject: [Biojava-l] [ANN]VTD-XML 2.5 In-Reply-To: <499A7C65.5040401@eaglegenomics.com> Message-ID: <2083482873.1632131235013913327.JavaMail.root@sz0167a.emeryville.ca.mail.comcast.net> VTD-XML 2.5 is now released. Please go to https://sourceforge.net/project/showfiles.php?group_id=110612&package_id=120172&release_id=661376 ?to download the latest version. Changes from Version 2.4 (2/2009) * Added separate VTD indexing generating and loading (see http://vtd-xml.sf.net/persistence.html for further info) * Integrated extended VTD supporting 256 GB doc (In Java only). * Added duplicateNav() for replicate multiple VTDNav instances sharing XML, VTD and LC buffer (availabe in Java and C#). * Various bug fixes and enhancements From aumanga at biggjapan.com Thu Feb 19 02:50:01 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Thu, 19 Feb 2009 11:50:01 +0900 Subject: [Biojava-l] Saving back chromatogram? Message-ID: <499CC8D9.4090906@biggjapan.com> Greetings all, I was able to display chromatogram using ABIFChromatogram class . Now what I want to implement is a chromatogram Editor,where user can edit base calls and save aback to AB1 files. Is there a way to do this in BioJava? 
Best Regards, umanga From ayates at ebi.ac.uk Thu Feb 19 09:23:00 2009 From: ayates at ebi.ac.uk (Andy Yates) Date: Thu, 19 Feb 2009 09:23:00 +0000 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499CC8D9.4090906@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> Message-ID: <499D24F4.8040607@ebi.ac.uk> Hi Umanga, Unfortunately there is no write feature available in the BioJava API. My advice would be to store these new basecalls in a separate file & look into using the Staden IO package which does support write functions (not sure if it will write into AB1 but it will do SCF & ZTR). Sadly this is written in C so you will have to write either some glue code with JNI to get it working or write a small C program to munge an AB1 trace & your new file together. Sorry I can't be of any more help, Andy Ashika Umanga Umagiliya wrote: > Greetings all, > > I was able to display chromatogram using ABIFChromatogram class . > Now what I want to implement is a chromatogram Editor,where user can > edit base calls and save aback to AB1 files. > Is there a way to do this in BioJava? > > Best Regards, > umanga > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From holland at eaglegenomics.com Thu Feb 19 08:23:31 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Thu, 19 Feb 2009 08:23:31 +0000 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499CC8D9.4090906@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> Message-ID: <499D1703.4020803@eaglegenomics.com> No, BioJava does not include the ability to write ABI files. Technically you could write your own code to do it though because the file format is fully understood by the parser and is formally described by a paper linked to from the Javadoc for ABIFChromatogram. 
By combining the information from the paper with the code from the parser, it should be possible to create a writer. Richard Ashika Umanga Umagiliya wrote: > Greetings all, > > I was able to display chromatogram using ABIFChromatogram class . > Now what I want to implement is a chromatogram Editor,where user can > edit base calls and save aback to AB1 files. > Is there a way to do this in BioJava? > > Best Regards, > umanga > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From anantpossible at gmail.com Fri Feb 20 05:40:17 2009 From: anantpossible at gmail.com (Anant Jain) Date: Fri, 20 Feb 2009 11:10:17 +0530 Subject: [Biojava-l] Regarding BioJava Message-ID: Greetings all, Do we have any BioJava runtime eviorment like jre coz if i give a s/w to anybody then he also include all the jar files in his class path. If i am compile and run my biojava program from my editplus then its working but if i run it from command prompt then its giving error like unable to load PDBFilereader.class. plz tel me how to run our program through cmd From aumanga at biggjapan.com Fri Feb 20 05:48:45 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Fri, 20 Feb 2009 14:48:45 +0900 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499D24F4.8040607@ebi.ac.uk> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> Message-ID: <499E443D.30206@biggjapan.com> Greetings all, Thank you for all your answers. What I want to do is ,after modifying callbases, I need to use the sequences with 'phrad' foro DNA assembly.For 'phrad', I need to give Fasta files and Quality files. 
I can modify callbases using my chromatogram editor , and save the new sequence in Fasta file.But my problem is , If I change the original callbases from AB1 file ,does it effect the Qualitiy files also? Or can I use the original Quality files generated by 'phred' with the new callbases. Here is a image demonstrating my scenario: http://img3.imageshack.us/img3/3564/f92df0815fab523f3c72aa3qx7.png Step (2) Ab1 files are passed into 'phred' and Fasta,Quality files are generated. Step (3) If user want to edit callbases , he can use the chromatogram editor.Then the Fasta files generated by 'phred' are replaced with new onces generated by editor. My problem is can I use the same Quality files ,with the modified callbases ? Sorry if this is off the topic. Thanks in advance, Umanga Andy Yates wrote: > Hi Umanga, > > Unfortunately there is no write feature available in the BioJava API. My > advice would be to store these new basecalls in a separate file & look > into using the Staden IO package which does support write functions (not > sure if it will write into AB1 but it will do SCF & ZTR). Sadly this is > written in C so you will have to write either some glue code with JNI to > get it working or write a small C program to munge an AB1 trace & your > new file together. > > Sorry I can't be of any more help, > > Andy > > Ashika Umanga Umagiliya wrote: > >> Greetings all, >> >> I was able to display chromatogram using ABIFChromatogram class . >> Now what I want to implement is a chromatogram Editor,where user can >> edit base calls and save aback to AB1 files. >> Is there a way to do this in BioJava? 
>> >> Best Regards, >> umanga >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> > > From russ at kepler-eng.com Fri Feb 20 06:16:46 2009 From: russ at kepler-eng.com (Russ Kepler) Date: Thu, 19 Feb 2009 23:16:46 -0700 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499E443D.30206@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> <499E443D.30206@biggjapan.com> Message-ID: <200902192316.46735.russ@kepler-eng.com> On Thursday 19 February 2009 22:48:45 Ashika Umanga Umagiliya wrote: > My problem is can I use the same Quality files ,with the modified > callbases ? Edit the quality data in parallel with the chromatogram. I would assume that the editor is relatively sure of their edits, so I would give them a high confidence level when I generate the trace fasta and quality file. From aumanga at biggjapan.com Fri Feb 20 06:54:15 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Fri, 20 Feb 2009 15:54:15 +0900 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <200902192316.46735.russ@kepler-eng.com> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> <499E443D.30206@biggjapan.com> <200902192316.46735.russ@kepler-eng.com> Message-ID: <499E5397.4060502@biggjapan.com> Cheers. I still haven't implemented the editor.I assume for basecall editing, I can use 'Edit' class. Like: ABIFChromatogram a; .. .. a.getBaseCalls().edit(new Edit(.......)) And for Quality values, I am hoping to read and store Qualitly files generated by 'phred'.And when user edit the basecall, I am planing to edit stored quality values accordinly. Finally the fasta file is generated using ABIFChromatogram.getBaseCalls()... and Quality file will be generated using the structure I used above. Any issues with this approach? 
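The plan described above (keep the phred quality values next to the base calls and change both together, giving edited positions a high confidence) can be sketched in plain Java. This is only an illustration under stated assumptions: EditableRead, its method names and the EDITED_QUALITY value of 99 are hypothetical and are not part of BioJava, and the sketch does not touch ABIFChromatogram or the Edit class at all.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: a plain-Java container, not a BioJava class.
 * Keeps one quality value per base call so that an edit always touches both.
 */
final class EditableRead {

    /** Assumed quality for hand-edited bases; use whatever your assembler expects. */
    static final int EDITED_QUALITY = 99;

    private final StringBuilder bases;
    private final List<Integer> qualities = new ArrayList<>();

    EditableRead(String bases, int[] phredQualities) {
        if (bases.length() != phredQualities.length) {
            throw new IllegalArgumentException("need exactly one quality value per base");
        }
        this.bases = new StringBuilder(bases);
        for (int q : phredQualities) {
            qualities.add(q);
        }
    }

    /** Overwrite the base call at a 0-based index and mark it as edited. */
    void replaceBase(int index, char newBase) {
        bases.setCharAt(index, newBase);
        qualities.set(index, EDITED_QUALITY);
    }

    /** Insert a new base call before the 0-based index, with the edited quality. */
    void insertBase(int index, char newBase) {
        bases.insert(index, newBase);
        qualities.add(index, EDITED_QUALITY);
    }

    /** Remove a base call together with its quality value. */
    void deleteBase(int index) {
        bases.deleteCharAt(index);
        qualities.remove(index); // int argument, so this removes by position
    }

    /** The edited base calls, ready to be written as the FASTA record body. */
    String basesLine() {
        return bases.toString();
    }

    /** Space-separated quality values, one per base, for the quality record body. */
    String qualityLine() {
        StringBuilder sb = new StringBuilder();
        for (int q : qualities) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(q);
        }
        return sb.toString();
    }
}
```

Because every edit goes through one of the three methods, the base string and the quality list cannot drift out of step, which is the property the FASTA and quality files need when they are handed to the assembler together.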
Many thanks, Umanga Russ Kepler wrote: > On Thursday 19 February 2009 22:48:45 Ashika Umanga Umagiliya wrote: > > >> My problem is can I use the same Quality files ,with the modified >> callbases ? >> > > Edit the quality data in parallel with the chromatogram. I would assume that > the editor is relatively sure of their edits, so I would give them a high > confidence level when I generate the trace fasta and quality file. > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > > From holland at eaglegenomics.com Fri Feb 20 08:43:23 2009 From: holland at eaglegenomics.com (Richard Holland) Date: Fri, 20 Feb 2009 08:43:23 +0000 Subject: [Biojava-l] Regarding BioJava In-Reply-To: References: Message-ID: <499E6D2B.90904@eaglegenomics.com> Like any other Java software, you need to have all the BioJava jars on your classpath to run it from the command line. If you want to avoid having to do that, you can copy the jars into $JAVA_HOME/ext. Please read Sun's documentation on the Java programming language to learn more about the classpath. Richard Anant Jain wrote: > Greetings all, > > Do we have any BioJava runtime eviorment like jre coz > if i give a s/w to anybody then he also include all the jar files in his > class path. > > If i am compile and run my biojava program from my editplus then its working > but if i run it from command prompt then its giving error like unable to > load PDBFilereader.class. 
plz tel me how to run our program through cmd > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l > -- Richard Holland, BSc MBCS Finance Director, Eagle Genomics Ltd T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com http://www.eaglegenomics.com/ From ayates at ebi.ac.uk Fri Feb 20 11:02:52 2009 From: ayates at ebi.ac.uk (Andy Yates) Date: Fri, 20 Feb 2009 11:02:52 +0000 Subject: [Biojava-l] Saving back chromatogram? In-Reply-To: <499E5397.4060502@biggjapan.com> References: <499CC8D9.4090906@biggjapan.com> <499D24F4.8040607@ebi.ac.uk> <499E443D.30206@biggjapan.com> <200902192316.46735.russ@kepler-eng.com> <499E5397.4060502@biggjapan.com> Message-ID: <499E8DDC.7090100@ebi.ac.uk> As far as I'm aware no there shouldn't be a problem with this solution. However have you investigated any other mechanisms for doing this kind of contig editing/assembly work? I know that people at the Sanger Centre use a program called gap4 which has powered just about every assembly they have done. It's available from: http://www.sanger.ac.uk/Software/production/staden/ (bit of info here) http://staden.sourceforge.net/staden_home.html Have a look at it as I feel that this is exactly what you are attempting to do. Andy Ashika Umanga Umagiliya wrote: > Cheers. > > > I still haven't implemented the editor.I assume for basecall editing, I > can use 'Edit' class. > > Like: > > ABIFChromatogram a; > .. > .. > a.getBaseCalls().edit(new Edit(.......)) > > > And for Quality values, I am hoping to read and store Qualitly files > generated by 'phred'.And when user edit the basecall, I am planing to > edit stored quality values accordinly. > Finally the fasta file is generated using > ABIFChromatogram.getBaseCalls()... > and Quality file will be generated using the structure I used above. > > Any issues with this approach? 
> > Many thanks, > Umanga > > > Russ Kepler wrote: >> On Thursday 19 February 2009 22:48:45 Ashika Umanga Umagiliya wrote: >> >> >>> My problem is can I use the same Quality files ,with the modified >>> callbases ? >>> >> >> Edit the quality data in parallel with the chromatogram. I would >> assume that the editor is relatively sure of their edits, so I would >> give them a high confidence level when I generate the trace fasta and >> quality file. >> _______________________________________________ >> Biojava-l mailing list - Biojava-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biojava-l >> >> > > _______________________________________________ > Biojava-l mailing list - Biojava-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biojava-l From jeedward at yahoo.com Fri Feb 20 15:27:09 2009 From: jeedward at yahoo.com (John Edward) Date: Fri, 20 Feb 2009 07:27:09 -0800 (PST) Subject: [Biojava-l] Draft paper submission deadline extended: BCBGC-09 Message-ID: <483752.12181.qm@web45903.mail.sp1.yahoo.com> Draft paper submission deadline extended: BCBGC-09 The deadline for draft paper submission at the 2009 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-09) (website: http://www.PromoteResearch.org ) is extended due to numerous requests from the authors. The conference will be held during July 13-16 2009 in Orlando, FL, USA. We invite draft paper submissions. The conference will take place at the same time and venue where several other international conferences are taking place. The other conferences include: * International Conference on Artificial Intelligence and Pattern Recognition (AIPR-09) * International Conference on Automation, Robotics and Control Systems (ARCS-09) * International Conference on Enterprise Information Systems and Web Technologies (EISWT-09) *
International Conference on High Performance Computing, Networking and Communication Systems (HPCNCS-09) * International Conference on Information Security and Privacy (ISP-09) * International Conference on Recent Advances in Information Technology and Applications (RAITA-09) * International Conference on Software Engineering Theory and Practice (SETP-09) * International Conference on Theory and Applications of Computational Science (TACS-09) * International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-09) The website http://www.PromoteResearch.org contains more details. Sincerely John Edward Publicity committee From anantpossible at gmail.com Mon Feb 23 09:45:45 2009 From: anantpossible at gmail.com (Anant Jain) Date: Mon, 23 Feb 2009 15:15:45 +0530 Subject: [Biojava-l] problem in alignment Message-ID: Greetings all, I am aligning two protein sequences using the BLOSUM62 substitution matrix, but a runtime error (an IllegalSymbolException) occurs: " this tokenization does not contain character ' * ' ". Please suggest a solution or another substitution matrix, or do I have to change my sequence format? Please reply. From andreas.draeger at uni-tuebingen.de Mon Feb 23 10:48:20 2009 From: andreas.draeger at uni-tuebingen.de (Andreas Dräger) Date: Mon, 23 Feb 2009 11:48:20 +0100 Subject: [Biojava-l] problem in alignment In-Reply-To: References: Message-ID: <49A27EF4.1050109@uni-tuebingen.de> Dear Anant Jain, > I am aligning two protein sequences using the BLOSUM62 > substitution matrix, but a runtime error (an IllegalSymbolException) occurs: " this > tokenization does not contain character ' * ' ". > > Please suggest a solution or another substitution matrix, > > or do I have to change my sequence format? This problem occurs very frequently because in BioJava the * symbol (the termination symbol) belongs to the alphabet PROTEIN_TERM and not to the alphabet PROTEIN.
Please use the correct alphabet and you'll be fine. Cheers Andreas From andreas.draeger at uni-tuebingen.de Mon Feb 23 20:48:54 2009 From: andreas.draeger at uni-tuebingen.de (Andreas Dräger) Date: Mon, 23 Feb 2009 21:48:54 +0100 Subject: [Biojava-l] problem in alignment In-Reply-To: References: <49A27EF4.1050109@uni-tuebingen.de> Message-ID: <20090223214854.959059s53gtvecvq@webmail.uni-tuebingen.de> Dear Anant Jain, Yeah, the problem is that the substitution matrices, e.g., BLOSUM 50, contain the *-symbol. So you'll definitely need the PROTEIN_TERM alphabet when doing protein alignments. Just try and let me know if it works. Cheers Andreas Dräger Dipl.-Bioinform. Andreas Dräger Eberhard Karls University Tübingen Center for Bioinformatics (ZBIT) Sand 1 72076 Tübingen Germany Phone: +49-7071-29-70436 Fax: +49-7071-29-5091 From andreas.draeger at uni-tuebingen.de Tue Feb 24 07:34:28 2009 From: andreas.draeger at uni-tuebingen.de (Andreas Dräger) Date: Tue, 24 Feb 2009 08:34:28 +0100 Subject: [Biojava-l] Regarding problen in Alignment In-Reply-To: References: Message-ID: <49A3A304.4010004@uni-tuebingen.de> Dear Anant Jain, Sorry for this misunderstanding. There is no substitution matrix "PROTEIN_TERM". The constructor of the BioJava SubstitutionMatrix class requires the following arguments: FiniteAlphabet alpha and File matrixFile. As the alphabet you should use the alphabet "PROTEIN_TERM" for matrices of protein sequences like BLOSUM etc. In the BioJava repository there is a newer version of this class that provides a function to parse a matrix without explicitly providing an alphabet (it guesses the alphabet of the matrix). I hope this helps.
Cheers Andreas Dräger Anant Jain wrote: > Thank You Sir, > I have searched through ftp/...blast/matrices, > but did not find any substitution matrix file like PROTEIN_TERM. If you > have it, can you send it to me or tell me a link where I can download it? > > Thank You > Anant jain > DYPBI > Pune, India. > From sauloal at gmail.com Tue Feb 24 15:53:10 2009 From: sauloal at gmail.com (Saulo Alves) Date: Tue, 24 Feb 2009 16:53:10 +0100 Subject: [Biojava-l] Sequence Annotation on Gui Message-ID: Hello, I'm new here and to BioJava, and I have been struggling with this problem. I have a chromosome with its gene annotations and I want to plot it in a GUI. I used the tutorial to make the basic setup and I get a good picture of the chromosome and its gene positions. The problem is: how do I plot the name of each gene along the arrow that represents it? Thanks in advance, S. From andreas.draeger at uni-tuebingen.de Wed Feb 25 06:54:29 2009 From: andreas.draeger at uni-tuebingen.de (Andreas Dräger) Date: Wed, 25 Feb 2009 07:54:29 +0100 Subject: [Biojava-l] Regarding problen in Alignment In-Reply-To: References: <49A3A304.4010004@uni-tuebingen.de> Message-ID: <49A4EB25.5070207@uni-tuebingen.de> Anant Jain wrote: > Thank you Sir, > > So should I use this? > > SubstitutionMatrix matrix = new SubstitutionMatrix("PROTEIN_TERM", new > File("BLOSUM62.50")); > > > Anant Jain > DYPBBI, > Pune,India Dear Anant Jain, "PROTEIN_TERM" is a String and not a FiniteAlphabet. Please have a look at the following example, where the alphabet "DNA" is used. Just replace "DNA" by "PROTEIN_TERM" and it will work for you: http://biojava.org/wiki/BioJava:CookBook:DP:PairWise2 Cheers Andreas From aumanga at biggjapan.com Wed Feb 25 07:44:24 2009 From: aumanga at biggjapan.com (Ashika Umanga Umagiliya) Date: Wed, 25 Feb 2009 16:44:24 +0900 Subject: [Biojava-l] Biojava Parsers : Apply quality values for contig ?
Message-ID: <49A4F6D8.9010307@biggjapan.com> Greetings all, I am using 'phred/phrap' to assemble DNA sequences ,and 'phrap' generates contig file and a contig-quality files for an assembly. Now I want to parse these two files and generate final contig , by removing Bases with '0' quality values. For example : CGACTATG + 0 42 54 59 48 0 0 0 > _GACT____ Why I want to do this is; because only this "masking" will give the similar contig that of which generated by ChromasPro. I can use Fasta-parser to parse contig file.But I wonder whether theres anyway to handle parsing of Quality file in BioJava. Below I have give the structures of two file types: thanks in advance, Umanga contig file: ------------ >seqs_fasta.Contig1 TTGGAGAGTTTGATCCTGGCTCAGATTGAACGCTGGCGGCAGGCCTAACA CATGCAAGTCGAACGGTAACAGGAAGCAGCTTGCTGCTTTGCTGACGAGT GGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAAC TACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGG GACCTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTA GGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGA TGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCA GCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGC GTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAG GGAGTAAAGTTAATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCAC CGGCTAATTCCGTGCCAGCAGCCGCGGTAATATTNTTATTCTTTATGTAT ACATATTCTTTTTACTTTATTCTATTAAATTTATTCTTTCATAATTAAAC CTTCCCTTACACCCATTCCACCTCCCATCCCTCTTCCCCTCCCACTCTCC ATCTCATATGGCGTTCGCGCCTCTCTCTTCATCTCCTCCTATATTTATTC TAACTTCTTTCATCTCAATCATTTCTTCTGTCTCATCCTTCCATTCTTTC CATGATCTCCCCCATTGTCATGTCTTCAAAAAACCACACAAAACACTAGA ATCTTTTCTTATTACACACAAGTATATACAATTTTTAACAATCCATTAAA ACACACACAACACCTAGCAATCAACAACGCTACCATCCCCAATATTCTCT GTTCTCCTCTCTTTCTCCGCGTGCATCTGCGCACTACTCTCTAATTTCAT CTCTATTATCTTTTTTTCTTAACTCATCCGCATACATCCAAGACTCTAGA CCCATTTCTCGCCTCTTTCATTTACTGCCGATACAGAGCTTATAAATTCT ATATCATTTATCCACACTCATTATTAAATAGGCTGACACCTCTAACCGTC CACTACACCACCTTTCCCATGCCATCTCCCTAACACTGCACTCATCCGTA ACTTCCTACTCTACCCTCTCTTTCTTTCCTTACTTTCTTTTCTTTCTCTT ACATTTTTATTTAAAATTCCTCTTTTAGCCTCTATTTTCTGTTATCTACT 
TTTCTCCTAAATTCCCCCTATTCTTCACGTCCCATACCTATCCCTACCAC CACCACTACCACCCCTCTCTTCATTCTACTCGCTCTAAACCCTCCACCCT CCCCTCCTTGCTCTTATGTATCTCCTCATCTTTTAAT quality file ------------ >seqs_fasta.Contig1 0 23 23 33 33 33 33 33 31 41 47 47 47 47 47 47 47 50 47 47 57 59 59 59 42 42 35 42 42 54 59 48 48 48 48 48 48 54 57 57 57 57 57 54 54 57 54 54 54 74 74 74 74 59 57 57 57 57 72 72 84 76 73 72 72 72 79 81 74 74 62 50 50 50 59 39 43 32 35 32 43 58 44 48 70 70 58 73 55 69 67 87 87 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 89 90 90 90 90 90 90 90 85 87 87 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 77 77 77 81 81 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 87 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 74 74 85 90 90 90 90 90 90 90 90 90 90 90 90 83 83 90 90 90 90 90 90 90 75 83 83 89 89 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 72 72 72 57 57 43 37 37 43 72 72 72 72 72 72 90 90 90 90 90 90 90 86 90 90 90 90 90 90 90 79 85 83 90 90 90 89 87 87 90 90 90 90 67 67 79 78 90 86 88 82 73 68 65 61 59 63 62 68 71 72 59 56 41 35 30 30 28 32 41 47 40 56 49 42 49 51 50 37 37 39 39 37 52 54 51 46 20 20 27 24 32 24 20 20 21 24 16 19 19 33 29 22 23 12 11 11 12 20 23 40 32 31 28 22 13 13 18 26 28 28 34 28 25 24 28 23 26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -- ??? ???? ????? ???????????????????BiGG) ?140-0001 ?????????3-6-9 ??????8F TEL:03-6679-8763 FAX:03-6679-8764 From watson at ebi.ac.uk Wed Feb 25 11:07:49 2009 From: watson at ebi.ac.uk (James Watson) Date: Wed, 25 Feb 2009 11:07:49 +0000 Subject: [Biojava-l] Java programmatic access course at the EMBL-EBI Message-ID: <49A52685.7090201@ebi.ac.uk> We have a hands-on training course that might be of interest being run here at the European Bioinformatics Institute. 
Further details can be found at http://www.ebi.ac.uk/training/handson/ Title: Programmatic Access: to biological databases (Java) Date: 27-29 April 2009 Venue: EMBL-EBI, Hinxton, Nr Cambridge, CB10 1SD, UK Organisers: Samuel Patient & James Watson Registration Deadline: 30th March 2009 - 12 noon (GMT) Cost: £50.00 (no travel or accommodation included) James Watson -- James D Watson Scientific Training Officer EMBL-EBI Wellcome Trust Genome Campus Hinxton Tel: +44(0)1223 492541 http://www.ebi.ac.uk/training/ Upcoming hands on training courses (http://www.ebi.ac.uk/training/handson/): 16-18 March 2009: Sequence to Genes - Genome Informatics 27-29 April 2009: Programmatic access to biological databases 11-15 May 2009: A Walkthrough EBI Bioinformatics Resources From koen.bruynseels at cropdesign.com Wed Feb 25 11:18:32 2009 From: koen.bruynseels at cropdesign.com (koen.bruynseels at cropdesign.com) Date: Wed, 25 Feb 2009 12:18:32 +0100 Subject: [Biojava-l] Koen Bruynseels is out of the office. Message-ID: I will be out of the office starting 02/24/2009 and will not return until 03/02/2009. I will respond to your message when I return. From andreas at sdsc.edu Wed Feb 25 17:54:06 2009 From: andreas at sdsc.edu (Andreas Prlic) Date: Wed, 25 Feb 2009 09:54:06 -0800 Subject: [Biojava-l] biojava 1.7 release schedule Message-ID: <59a41c430902250954k4696f868hb5b8b50fa247a03a@mail.gmail.com> Hi I would like to propose the following release plan for biojava 1.7: * the next couple of weeks: commit missing patches, write junit tests and improve documentation (everybody with write access) * Wed. April 8th: code freeze (declared on biojava-dev by me) and final checks. At this point all unit tests should pass without problems * Sat.
April 11th : I will branch the svn, copy the release files on the biojava site, and write the announce email Andreas From andreas.prlic at gmail.com Thu Feb 26 16:16:56 2009 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Thu, 26 Feb 2009 08:16:56 -0800 Subject: [Biojava-l] BioJava and backbone amino acids In-Reply-To: <49A63722.40002@wp.pl> References: <49A63722.40002@wp.pl> Message-ID: <59a41c430902260816t1280caa1tb770a174d32fcc03@mail.gmail.com> Hi Michal, You could use the Calc class to calculate all the distances of Atoms that are in proximity of a the ligand. Andreas On Wed, Feb 25, 2009 at 10:30 PM, Michal Lorenc wrote: > Dear Andreas, > do you know how it is possible to find backbone amino acids around the > ligand with BioJava or do you know another software? > > Thank you in advance. > > Best regards, > > Michal > From bopfannkuche at gmx.de Fri Feb 27 10:36:38 2009 From: bopfannkuche at gmx.de (=?ISO-8859-15?Q?Bj=F6rn_Ole_Pfannkuche?=) Date: Fri, 27 Feb 2009 11:36:38 +0100 Subject: [Biojava-l] Zuker Algorithm Message-ID: <49A7C236.6090807@gmx.de> Hello! I' m reading this mailinglist for quite a while and I often learned very nice tricks on different stuff *G* Now I have a little problem/question of my own: Has anyone implemented the Zuker-Algorithm for RNA Folding so far? I need a Java Version for a software project and due to the fact that all of you are developing software I hope that I do not have to implement it again in the case one of you have already done. If someone can help me I would be very delighted, thanks in advance! 
Björn From andreas.prlic at gmail.com Sat Feb 28 03:15:44 2009 From: andreas.prlic at gmail.com (Andreas Prlic) Date: Fri, 27 Feb 2009 19:15:44 -0800 Subject: [Biojava-l] BioJava and backbone amino acids In-Reply-To: <49A79EAD.7030000@wp.pl> References: <49A63722.40002@wp.pl> <59a41c430902260816t1280caa1tb770a174d32fcc03@mail.gmail.com> <49A79EAD.7030000@wp.pl> Message-ID: <59a41c430902271915l73a05533lc77d5dc69e439480@mail.gmail.com> Hi Michal, you can get the backbone atoms e.g. by using the StructureTools class: StructureTools.getBackboneAtomArray(Structure s) http://www.biojava.org/docs/api/org/biojava/bio/structure/StructureTools.html Andreas On Fri, Feb 27, 2009 at 12:05 AM, Michal Lorenc wrote: > Hi Andreas, > Thank you for your email. But how can I find backbone amino acids? > > Thank you in advance. > > Best regards, > > Michal > > Andreas Prlic wrote: >> >> Hi Michal, >> >> You could use the Calc class to calculate all the distances of Atoms >> that are in proximity of the ligand. >> >> Andreas >> >> >> On Wed, Feb 25, 2009 at 10:30 PM, Michal Lorenc wrote: >>> >>> Dear Andreas, >>> do you know how it is possible to find backbone amino acids around the >>> ligand with BioJava or do you know another software? >>> >>> Thank you in advance. >>> >>> Best regards, >>> >>> Michal >>> >> >> >
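The two suggestions in this thread (take the backbone atoms, then keep those whose distance to a ligand atom is small) amount to a simple distance filter. The sketch below shows that logic in plain Java and makes several assumptions: SimpleAtom, backboneNear and the 4.0 Angstrom cutoff in the usage note are hypothetical stand-ins chosen for illustration, not BioJava's Atom, Calc or StructureTools classes, and the backbone is taken to be the atoms named N, CA, C and O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Sketch only: plain-Java stand-ins, not BioJava classes. */
final class LigandNeighbours {

    /** Hypothetical minimal atom: PDB atom name, residue number, coordinates in Angstrom. */
    static final class SimpleAtom {
        final String name;
        final int residueNumber;
        final double x, y, z;

        SimpleAtom(String name, int residueNumber, double x, double y, double z) {
            this.name = name;
            this.residueNumber = residueNumber;
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Atom names that make up the protein backbone in a PDB file. */
    private static final Set<String> BACKBONE_NAMES = Set.of("N", "CA", "C", "O");

    /** Euclidean distance between two atoms. */
    static double distance(SimpleAtom a, SimpleAtom b) {
        double dx = a.x - b.x;
        double dy = a.y - b.y;
        double dz = a.z - b.z;
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    /** Backbone atoms of the protein that lie within the cutoff of any ligand atom. */
    static List<SimpleAtom> backboneNear(List<SimpleAtom> protein,
                                         List<SimpleAtom> ligand,
                                         double cutoff) {
        List<SimpleAtom> hits = new ArrayList<>();
        for (SimpleAtom p : protein) {
            if (!BACKBONE_NAMES.contains(p.name)) {
                continue; // side-chain atom, not part of the backbone
            }
            for (SimpleAtom l : ligand) {
                if (distance(p, l) <= cutoff) {
                    hits.add(p);
                    break; // one close ligand atom is enough
                }
            }
        }
        return hits;
    }
}
```

Calling backboneNear(proteinAtoms, ligandAtoms, 4.0) and collecting the residueNumber of each hit gives the residues whose backbone is in contact with the ligand. The all-against-all loop is fine for a single binding site; a spatial grid would only be worth the extra code for very large structures.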