From ilari.scheinin at helsinki.fi  Thu Nov  3 07:28:49 2005
From: ilari.scheinin at helsinki.fi (Ilari Scheinin)
Date: Thu Nov  3 08:29:48 2005
Subject: [Biojava-l] Dazzle and Ensembl 34
Message-ID: <BBB1FC98-6BEA-4FB4-880F-DAB8A187C7E4@helsinki.fi>

Hello.

I am trying to install the DAS server Dazzle to serve data from a  
local copy of Ensembl, but it is not working. Does Dazzle not support  
the newest releases of Ensembl? I can get it to work with Ensembl 30  
from ensembldb.ensembl.org, but not with the 34 version we have  
mirrored.

I have set up three datasources in dazzlecfg.xml: "local" (local  
Ensembl 34), "new" (34 from ensembl.org), and "old" (30 from  
ensembl.org). The weird part is that datasources "local" and "new"  
give me different error messages even though the only difference in  
the config file is the host name. Getting the entry_points works for  
both of them, and so does getting the features for the first kb of  
chromosome 1:

$ curl http://localhost:8090/das/local/features?segment=1:1,1000
gives 3 features of the type "contig", and so does:
$ curl http://localhost:8090/das/new/features?segment=1:1,1000

But when I try to get features for the first 10 kb, I get two  
different error messages from the local and remote Ensembl 34 databases:

$ curl http://localhost:8090/das/local/features?segment=1:1,10000
...
javax.servlet.ServletException:  
org.biojava.bio.seq.db.ensembl.spi.AdaptorException:  
java.sql.SQLException: null,  message from server: "Unknown  
column 'exon.exon_id' in 'on clause'"
...

$ curl http://localhost:8090/das/new/features?segment=1:1,10000
...
javax.servlet.ServletException:  
org.biojava.bio.seq.db.ensembl.spi.AdaptorException:  
java.sql.SQLException: Base table or view not found,  message from  
server: "Table 'homo_sapiens_core_34_35g.gene_description'  
doesn't exist"
...

The first error message is really weird, because that column really  
does exist:
mysql> desc exon;
+-------------------+------------------+------+-----+---------+----------------+
| Field             | Type             | Null | Key | Default | Extra          |
+-------------------+------------------+------+-----+---------+----------------+
| exon_id           | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
...

The second error message occurs because the table gene_description is
missing. Recent Ensembl releases no longer seem to have it; the newest
release in which I could find it was version 30. That is why I also
configured a datasource for that version from ensembldb.ensembl.org,
which seems to work.

So, is Dazzle not compatible with the newest Ensembl releases? Any  
ideas why I get different error messages from a local mirror and  
ensembldb.ensembl.org?

I'm running J2SE 5.0, Tomcat 5.0.28 and the ensembl-das-webapp-1.4.30  
version of Dazzle.

Thanks,
Ilari


-- 
Ilari Scheinin, BSc.
Biomedicum Bioinformatics Unit
National Public Health Institute
Helsinki, Finland
ilari.scheinin@helsinki.fi
http://www.bioinfo.helsinki.fi/


From mark.schreiber at novartis.com  Fri Nov  4 05:29:00 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Fri Nov  4 05:34:29 2005
Subject: [Biojava-l] BioJavaX ready for testing
Message-ID: <OF47BFDBEF.1EF06871-ON482570AF.00397BF3-482570AF.00399705@EU.novartis.net>

Richard has done a really excellent job of making some pretty 
comprehensive docs here with lots of examples. You should be able to use 
it to take biojavax out for a spin!

- Mark


"Richard HOLLAND" <hollandr@gis.a-star.edu.sg>
Sent by: biojava-l-bounces@portal.open-bio.org
10/31/2005 05:28 PM

 
        To:     <biojava-l@biojava.org>
        cc:     Biosql <biosql-l@open-bio.org>, (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] BioJavaX ready for testing


Hello people!

Mark is away so I'm taking the liberty of sneaking this one out... :)

I've cross-posted this to both BioJava and BioSQL as much of what is new 
in BioJavaX will probably be of interest to BioSQL users too.

We've been doing a lot of work recently on creating some extensions to 
BioJava called BioJavaX. Primarily the purpose of these extensions is to 
provide better interaction with BioSQL databases, which has been achieved 
using Hibernate (www.hibernate.org). You can now fully interact with every 
column of every table in BioSQL, using Hibernate's own HQL language to 
construct queries that result in sets of BioJavaX objects. Selects, 
inserts, updates, primary key assignment, foreign key relations, and 
deletes are all handled transparently by Hibernate, removing the need for 
any SQL at all to be included in BioJavaX.

As a side effect of constructing a Hibernate-compatible extension to the 
BioJava object model, we were required to define objects that hold much 
more detailed information about themselves. For instance, a Sequence 
object cannot tell you what namespace it lives in in the BioSQL database, 
but our extension to it, RichSequence, can. As RichSequence extends 
Sequence and doesn't replace it, this means you can use the new objects 
with your existing code without any hassle casting them.

To be able to load information from files into these new RichSequence 
objects in a meaningful way, we had to create a more detailed 
SeqIOListener, called RichSeqIOListener. Then, we had to create new file 
parsers for the common file formats which were able to extract more 
detailed information than before in order to satisfy the 
RichSeqIOListener. 

It's pretty safe to say that the file parsers in BioJavaX are leagues 
ahead of the existing ones in BioJava, even if I do say so myself. :P The 
downside of this extra detail though is that the parsers are much more 
sensitive and will not play well at all with incomplete or incorrectly 
formed files. If someone can edit them to be less sensitive whilst still 
retaining the level of detail required, that'd be great.

We've included parsers for FASTA, GenBank, EMBL, UniProt, INSDseq, 
EMBLxml, UniProtXML, and an extra one for parsing NCBI Taxonomy data.

Do note that BioJavaX cannot fully convert sequences created using the old 
BioJava model into the new BioJavaX model. It'll do its best, but the 
RichSequence object you'll end up with will have lots of properties set to 
null and a tonne of annotations instead, pretty much the same as the 
original Sequence object I suppose. So it's best to try to avoid 
conversions and deal with RichSequence objects from the ground up. This is 
particularly important to consider when converting a BioSQL database 
previously used with BioJava into one for use with BioJavaX. You'll also 
find that if you pass a converted old-style Sequence object to one of the 
new file parsers for writing it may fail or produce output with lots of 
missing fields, as it will not find the information it is looking for in 
the places it expects. 

The whole lot is specifically designed to mimic and be compatible with 
BioSQL, but you don't need to have a BioSQL database to use it. Everything 
is standalone and will work just fine without a backing data source. Also 
there is no reason why you couldn't create a new set of Hibernate mappings 
that map the BioJavaX object model to some other relational database 
schema of your choice.

The upshot of it all is the org.biojavax package, which you can find in 
biojava-live branch on CVS. Development is pretty much complete, and it 
now needs some serious testing.

We need volunteers to:

                 a) test the BioSQL interaction via Hibernate with the 
various database flavours supported (HSQL, Oracle, MySQL, PostgreSQL)
                 b) test the various file formats, particularly looking 
for special-case exceptions which the parsers may not be aware of yet
                 c) do some load-testing and help us find ways to improve 
it if it turns out to be too slow when under pressure

Documentation of the new features can be found in DocBook XML format in 
docs/docbook/BioJavaX.xml in the biojava-live branch of CVS. It's as 
detailed as I could make it without getting bored to death writing it. 
I've never been the world's best documentation writer, so if anyone would 
like to help improve it you're more than welcome.

Our plan is to make all this an official part of BioJava come the 1.5 
release, whenever that may be. For now though it is very very much a 
testing-stage thing, not even an alpha release.

Questions on a postcard to either Mark or myself. Feedback most welcome.

cheers,
Richard


Richard Holland
Bioinformatics Specialist
Genome Institute of Singapore
60 Biopolis Street, #02-01 Genome, Singapore 138672
Tel: (65) 6478 8000   DID: (65) 6478 8199
Email: hollandr@gis.a-star.edu.sg
---------------------------------------------
This email is confidential and may be privileged. If you are not the 
intended recipient, please delete it and notify us immediately. Please do 
not copy or use it for any purpose, or disclose its content to any other 
person. Thank you.
---------------------------------------------


_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From andreas.scheucher at embl.de  Thu Nov 10 05:49:15 2005
From: andreas.scheucher at embl.de (Andreas Scheucher)
Date: Thu Nov 10 05:54:28 2005
Subject: [Biojava-l] Extract accession number out of xml blast result
Message-ID: <437325AB.70404@embl.de>

Hi,

I am parsing a BLAST result file for a multi-FASTA search with BioJava.
Now I am wondering whether there really is no way to get the
accession number out of a BLAST hit. The XML tag with the information
is there, but where is the corresponding method?

Thanks for your effort.

Regards,
Andreas
From hollandr at gis.a-star.edu.sg  Thu Nov 10 21:15:27 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Thu Nov 10 21:26:35 2005
Subject: [Biojava-l] Extract accession number out of xml blast result
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D56026568FE@BIONIC.biopolis.one-north.com>

As documented at BioJava in Anger, the subject's accession can be
obtained from the SeqSimilaritySearchHit using getSubjectID(). 

By reading the API, the query's accession can be obtained from
SeqSimilaritySearchResult using getQuerySequence().getName().

However... unfortunately, the query accession method above does not work
if you follow the BioJava in Anger example code!

BlastLikeSearchBuilder requires a SequenceDB and a
SequenceDBInstallation. The former should contain all sequences used in
the query, and the latter should be able to provide SequenceDB instances
corresponding to the databases used in the blast. For instance, if you
blasted query "A12345" vs. database "nr", then the SequenceDB instance
should return a meaningful value for getSequence("A12345"), and the
SequenceDBInstallation instance should return a meaningful value for
getSequenceDB("nr").

The example at BioJava in Anger uses a DummySequenceDB and
DummySequenceDBInstallation to pass to the BlastLikeSearchBuilder. Both
these instances generate the exact same response no matter what values
you pass to getSequence() and getSequenceDB() - they return a Sequence
or SequenceDB with the name of "dummy".

If you are really interested in the actual query accession, you would
need to provide your own SequenceDB which returned appropriately named
sequences. If your queries all come from an existing SequenceDB object,
you can just pass this straight in. Likewise, if you are really
interested in the target database name, you can construct or use some
other SequenceDBInstallation to provide the appropriate SequenceDB
instances.

BUT... you can get round all this object overkill by knowing a few
things about your query data before trying to parse it. First, when you
run BLAST on multiple query sequences in a single input file, the report
generated will contain the query sequences in the same order as the
input file. Second, the SeqSimilaritySearchResult objects are returned
in the same order as the results appear in the BLAST report, and there
will be one SeqSimilaritySearchResult object per query sequence. So, if
you have a list of your query sequence accessions in the order they
appear in the input file to BLAST, you can then maintain a counter which
increments each time you obtain the next SeqSimilaritySearchResult, and
that counter will provide a direct pointer into your list to tell you
which query accession you are currently working with. Likewise, you
should know already what blast database you blasted against, so you
shouldn't really need to get this information from the results.
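
The counter approach described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the class name QueryAccessionCounter and the method labelResults are hypothetical, and the type parameter R stands in for whatever result object the parser returns (for example a SeqSimilaritySearchResult). It relies on the ordering described above, namely one result per query, in input-file order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava: pairs results with query
// accessions purely by position.
final class QueryAccessionCounter {

    /**
     * Labels each result with the accession at the same position in the
     * query list. Assumes accessions are unique; a repeated accession
     * would overwrite the earlier entry in the map.
     */
    static <R> Map<String, R> labelResults(List<String> queryAccessions,
                                           List<R> resultsInReportOrder) {
        if (queryAccessions.size() != resultsInReportOrder.size()) {
            throw new IllegalArgumentException("expected one result per query, got "
                    + resultsInReportOrder.size() + " results for "
                    + queryAccessions.size() + " queries");
        }
        Map<String, R> labelled = new LinkedHashMap<>();
        int counter = 0; // index of the query the current result belongs to
        for (R result : resultsInReportOrder) {
            labelled.put(queryAccessions.get(counter), result);
            counter++;
        }
        return labelled;
    }

    public static void main(String[] args) {
        // Accessions in the order they appear in the multi-FASTA input file.
        List<String> accessions = Arrays.asList("A12345", "B67890");
        // Placeholder strings standing in for parsed result objects.
        List<String> results = Arrays.asList("first parsed result", "second parsed result");
        for (Map.Entry<String, String> e : labelResults(accessions, results).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The size check is there because the whole scheme breaks silently if a result is missing, so it seems safer to fail loudly than to shift every label by one.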

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199



From mark.schreiber at novartis.com  Fri Nov 11 00:19:32 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Fri Nov 11 00:18:00 2005
Subject: [Biojava-l] Extract accession number out of xml blast result
Message-ID: <OFE7AEE1BB.3EECAC4B-ON482570B6.001D053E-482570B6.001D4247@EU.novartis.net>

Another way to parse the results without using the full blown object model 
and SequenceDB is to extend SearchContentAdapter and listen for the events 
of interest. The event that gives you the query id is a callback to 
setQueryID(String id) on the adapter.

Take a look at http://www.biojava.org/docs/bj_in_anger/blastecho.htm for 
some hints.
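
The shape of such a listener might look like the sketch below. To keep it runnable without the BioJava jar, it extends a stand-in class rather than the real SearchContentAdapter: StandInSearchContentAdapter and QueryIdCollector are hypothetical names, and only the setQueryID(String) callback mentioned above is modelled. With BioJava on the classpath one would extend the real adapter instead and let the parser fire the events.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener: remembers every query ID it is told about.
final class QueryIdCollector extends StandInSearchContentAdapter {

    private final List<String> queryIds = new ArrayList<>();

    @Override
    public void setQueryID(String queryID) {
        // Called once per query in the report; just record the ID.
        queryIds.add(queryID);
    }

    List<String> getQueryIds() {
        return queryIds;
    }

    public static void main(String[] args) {
        QueryIdCollector collector = new QueryIdCollector();
        // In real use the parser would make these calls while reading
        // the report; here they are simulated by hand.
        collector.setQueryID("A12345");
        collector.setQueryID("B67890");
        System.out.println(collector.getQueryIds());
    }
}

// Stand-in for the adapter base class (an assumption for illustration):
// every callback is a no-op, so subclasses override only what they need.
class StandInSearchContentAdapter {
    public void setQueryID(String queryID) {
        // ignore by default
    }
}
```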

- Mark




From matthew.pocock at ncl.ac.uk  Fri Nov 18 06:49:36 2005
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Fri Nov 18 06:55:41 2005
Subject: [Biojava-l] biojava workshop
Message-ID: <200511181149.37238.matthew.pocock@ncl.ac.uk>

Hi,

We are going to apply to one of the research councils for some BioJava 
funding. There are now some grants available for supporting tools that 
support research. The plan would be to apply for:

Funding for a workshop:

Bring together people who use and develop BioJava (and other Bio* projects, if 
relevant) to discuss:
  * what they use BioJava for
  * the things it is good for
  * any things that could be improved

This will result in a jointly authored BioJava paper discussing what we have 
learned and where we are going next.


Funding for full-time biojava developers:

We would be looking for two people. One post would concentrate on the core 
coding, and one to work on documentation, the biojava-in-anger and useful 
example applications. Depending on what came out of the workshop, the thrust 
of the developer time may well go into adding support for post-genomics, 
metabolomics and the rest.


It would be really helpful if people could let me know if you are interested in 
any of this. In particular, letters of support for BioJava would be awesome. 
Let me know if you would like to attend the workshop, or have strong views 
about the future direction of biojava.

Thanks,

Matthew Pocock

West Suite, 8th Floor
Claremont Tower
School of Computing Science
University of Newcastle upon Tyne
NE1 7RU, United Kingdom
From toddri at eden.rutgers.edu  Mon Nov 21 02:13:52 2005
From: toddri at eden.rutgers.edu (Todd Riley)
Date: Mon Nov 21 02:28:11 2005
Subject: [Biojava-l] BaumWelchTrainer Broken??!!!  (please help)
Message-ID: <438173B0.9050200@eden.rutgers.edu>

I have built a profile HMM.  I hand-trained it (setting the emission 
and transition distributions by hand) and was able to generate nice 
viterbi scores of fasta sequences.  However, when I tried to perform 
Expectation Maximization using the BaumWelchTrainer and a training set, 
things did not go well at all.  After the iterations are done, all of 
the emission and transition distributions of the now trained model are 
all full of NaN's!!!  (Needless to say, viterbi scoring is now 
impossible. Any attempt to do so generates a NullPointerException on 
line 650 of SingleDP.java in the SingleDP.viterbi() method.)

I looked into the mail archives and found that Fabian Schreiber had the 
exact same problem when he wrote a BaumWelchTrainer program exactly like 
the one from Biojava in Anger: "How do I make a ProfileHMM?".  His 
message is from March 25th of this year (with no replies).

I then decided to download the BioJava 1.4 sources and found 2 
additional (dp) demos that use the BaumWelchTrainer:
    demos/dp/PatternFinder.java
    demos/dp/SearchProfile.java

I compiled and ran both of these demos and found very discouraging 
results.  The iteration scores quickly go to NaN, no matter what 
sequences I train on (including the demos/dp/fake.fasta file).

Is there something that I am missing here?  Is the BaumWelchTrainer 
broken?  Why are all the emission and transition distributions now full 
of all NaN's after training?

Any insight or investigation here would be greatly appreciated.

Thanks,
Todd Riley

I am re-posting Fabian Schreiber's code because it is shorter than 
mine......

// Create the Markov model - the method createCasino generates an Alphabet
// and sets the probabilities for the transitions and emissions
MarkovModel casino = createCasino();

DP dp=DPFactory.DEFAULT.createDP(casino);


BaumWelchTrainer bwtrainer = new BaumWelchTrainer(dp);


SequenceDB seqDB = new HashSequenceDB("hashdb");
// here the DB is filled with the sequences --> this works

//Set the stopper
  StoppingCriteria stopper= new StoppingCriteria()
             {public boolean isTrainingComplete(TrainingAlgorithm ta)
             {return (ta.getCycle() > 10);}};
//Train the model
bwtrainer.train(seqDB, 1.0, stopper);


From mark.schreiber at novartis.com  Mon Nov 21 02:43:32 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Nov 21 02:48:57 2005
Subject: [Biojava-l] BaumWelchTrainer Broken??!!!  (please help)
Message-ID: <OF82B46BF9.50800E2E-ON482570C0.002A5349-482570C0.002A70EE@EU.novartis.net>

Can you try the code in 
http://www.biojava.org/docs/bj_in_anger/profileHMM.htm

I have found in the past that you need to set some initial weights before 
starting the BW trainer. If this example doesn't work please repost to the 
list.
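
As a generic illustration of why starting weights can matter in this kind of re-estimation: normalising a set of counts that are all zero divides 0.0 by 0.0, which is NaN in floating-point arithmetic, while adding a small pseudocount to every entry keeps the result a proper distribution. The sketch below is plain Java with hypothetical names (PseudocountDemo, normalise); it does not use BioJava, and it is not a claim about what BaumWelchTrainer does internally.

```java
// Hypothetical demo class, independent of BioJava.
final class PseudocountDemo {

    /** Turns counts into probabilities after adding the same pseudocount to every entry. */
    static double[] normalise(double[] counts, double pseudocount) {
        double total = 0.0;
        for (double c : counts) {
            total += c + pseudocount;
        }
        double[] probs = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            probs[i] = (counts[i] + pseudocount) / total;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] unseen = {0.0, 0.0, 0.0, 0.0}; // a state that never collected counts

        // No pseudocount: every entry is 0.0 / 0.0, which is NaN.
        System.out.println(Double.isNaN(normalise(unseen, 0.0)[0])); // prints true

        // Pseudocount of 1.0: the distribution falls back to uniform.
        System.out.println(normalise(unseen, 1.0)[0]); // prints 0.25
    }
}
```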

- Mark




From ola.spjuth at farmbio.uu.se  Mon Nov 21 17:09:12 2005
From: ola.spjuth at farmbio.uu.se (Ola Spjuth)
Date: Mon Nov 21 18:06:19 2005
Subject: [Biojava-l] Announcement: Bioclipse
Message-ID: <1132610952.3003.418.camel@zidane>


I would hereby like to announce the development of a new project
entitled Bioclipse[1], aimed at creating a Java-based, open source,
visual platform for chemo- and bioinformatics based on the Eclipse Rich
Client Platform (RCP).

Bioclipse will provide functionality for chemo- and bioinformatics
together with extension points that can easily be extended by plugins to
provide added functionality. Version 1.0 of Bioclipse will consist of a
CDK-plugin based on CDK[2] to provide a chemoinformatic backend, a
JChemPaint-plugin based on JChemPaint[3] for 2D-editing of molecules, a
Jmol-plugin based on Jmol[4] for 3D-visualization, and a BioJava-plugin
based on BioJava[5] to account for sequence management. The goal is to
create a fully integrated workbench with advanced editing and
visualization features for molecules and sequences.

Bioclipse will be released under Eclipse Public License (EPL), setting
no constraints on choice of backend and/or license for third party
extensions; it is totally open for both open source plugins as well as
commercial.

Please visit http://www.bioclipse.net for screenshots and more extensive
documentation. Note that version 0.1.1 that is available for download is
only for demonstration purposes and is in no way a functional release.
On the homepage you can also sign up for one of the mailing-lists to
stay updated on the project.

If you would like to contribute (maybe you have some
algorithm/code/existing app that you would like to integrate or would
like to become a developer) please introduce yourself on the mailing list
bioclipse-devel. There will soon be docs on the Bioclipse homepage on
how to create your own plugins, and how to integrate existing
applications into the framework. 

   .../Ola Spjuth


[1] http://www.bioclipse.net
[2] http://almost.cubic.uni-koeln.de/cdk
[3] http://almost.cubic.uni-koeln.de/cdk/jcp
[4] http://jmol.org/
[5] http://www.biojava.org/

From toddri at eden.rutgers.edu  Mon Nov 21 18:11:18 2005
From: toddri at eden.rutgers.edu (Todd Riley)
Date: Mon Nov 21 18:08:09 2005
Subject: [Biojava-l] BaumWelchTrainer Broken??!!!  (please help)
In-Reply-To: <OF82B46BF9.50800E2E-ON482570C0.002A5349-482570C0.002A70EE@EU.novartis.net>
References: <OF82B46BF9.50800E2E-ON482570C0.002A5349-482570C0.002A70EE@EU.novartis.net>
Message-ID: <43825416.1040909@eden.rutgers.edu>

Thanks for your response.  Yes, I did set some initial weights before 
starting the BW trainer.  I copied the snippet that uses 
SimpleModelTrainer directly from the profileHMM.htm page.

I have compiled and run the code from 
http://www.biojava.org/docs/bj_in_anger/profileHMM.htm and I get the 
same results as with the other demos and my own code (same result = all 
the distributions are all full of NaN's after BW training.)
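
While tracking this down, one debugging aid is a stopping criterion that ends training as soon as the score becomes NaN instead of running the remaining cycles. The sketch below is self-contained and uses stand-in interfaces that mirror only the methods used elsewhere in this message (getCycle(), getCurrentScore(), isTrainingComplete()); the StandIn names and NaNAwareStopper are hypothetical. Whether the trainer has already written NaN parameters into the model by the time the criterion is consulted is not established here.

```java
// Hypothetical stopping criterion: stops on NaN or after maxCycles.
final class NaNAwareStopper implements StandInStoppingCriteria {

    private final int maxCycles;

    NaNAwareStopper(int maxCycles) {
        this.maxCycles = maxCycles;
    }

    @Override
    public boolean isTrainingComplete(StandInTrainingAlgorithm ta) {
        if (Double.isNaN(ta.getCurrentScore())) {
            System.err.println("Score became NaN at cycle " + ta.getCycle() + "; stopping early");
            return true;
        }
        return ta.getCycle() > maxCycles;
    }

    public static void main(String[] args) {
        // Simulated per-cycle scores; the third one is NaN.
        double[] scores = {-1105.96, -1000.30, Double.NaN};
        NaNAwareStopper stopper = new NaNAwareStopper(20);
        for (int i = 0; i < scores.length; i++) {
            final int cycle = i + 1;
            final double score = scores[i];
            StandInTrainingAlgorithm ta = new StandInTrainingAlgorithm() {
                public int getCycle() { return cycle; }
                public double getCurrentScore() { return score; }
            };
            System.out.println("cycle " + cycle + " complete? " + stopper.isTrainingComplete(ta));
        }
    }
}

// Stand-ins (assumptions for illustration) for the two trainer interfaces.
interface StandInTrainingAlgorithm {
    int getCycle();
    double getCurrentScore();
}

interface StandInStoppingCriteria {
    boolean isTrainingComplete(StandInTrainingAlgorithm ta);
}
```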

This code copied directly from the profileHMM.htm page crashes for me 
(see my output below the code).

Thanks for your assistance,
Todd

*******************************************************************
My file that contains the code from the demo profileHMM.htm found in 
"Biojava In Anger" starts here:
*******************************************************************

/*
 * DemoPHMM.java - Directly from
 * http://www.biojava.org/docs/bj_in_anger/profileHMM.htm
 *
 */

import java.util.*;
import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;
import java.io.File;
import javax.swing.JFrame;
import java.awt.event.*;

//import biojava.*;
//import biojava.BaumWelchTrainer;
//import biojava.TrainingAlgorithm;
import org.biojava.bio.*;
import org.biojava.bio.dist.*;
import org.biojava.bio.dp.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.seq.db.*;
import org.biojava.bio.seq.io.*;
import org.biojava.bio.symbol.*;
import org.biojava.utils.*;

public class DemoPHMM {

    public static void main(String[] args) throws IOException {
    DemoPHMM hmm = new DemoPHMM();
    hmm.letsDoThis(args);
    }


    public void letsDoThis(String[] args) throws IOException {
    if (args.length < 1 || args[0].equals("-help") || 
args[0].equals("-?")) {
        System.out.println("\n Usage: DemoPHMM <Fasta-Training-Set-File>");
        System.exit(-1);
    }

    String trainingSet=args[0];

    try {
        /*
         * Make a profile HMM over the DNA Alphabet with 20 'columns'
         * and default DistributionFactories to construct the transition
         * and emission Distributions
         */
        ProfileHMM hmm = new ProfileHMM(DNATools.getDNA(),
                        20,
                        DistributionFactory.DEFAULT,
                        DistributionFactory.DEFAULT,
                        "my profilehmm");

        //create the Dynamic Programming matrix for the model.
        DP dp = DPFactory.DEFAULT.createDP(hmm);

        //Database to hold the training set
        //SequenceDB db = new HashSequenceDB();
        //code here to load the training set
        SequenceDB db = 
IOUtility.readSequenceDB(trainingSet,DNATools.getDNA());

        //train the model to have uniform parameters
        ModelTrainer mt = new SimpleModelTrainer();
        //register the model to train
        mt.registerModel(hmm);
        //as no other counts are being used the null weight will cause
        //everything to be uniform
        mt.setNullModelWeight(1.0);
        mt.train();

        //create a BW trainer for the dp matrix generated from the HMM
        BaumWelchTrainer bwt = new BaumWelchTrainer(dp);

        //anonymous implementation of the stopping criteria interface to
        //stop after 20 iterations
        StoppingCriteria stopper = new StoppingCriteria(){
            public boolean isTrainingComplete(TrainingAlgorithm ta){
            System.out.println("\t\tCycle: " + ta.getCycle() + " score: 
" + ta.getCurrentScore() + " " + (ta.getCurrentScore() - 
ta.getLastScore()) );
            return (ta.getCycle() > 20);
            }
        };
   
        /*
         * optimize the dp matrix to reflect the training set in db 
using a null model
         * weight of 1.0 and the Stopping criteria defined above.
         */
        bwt.train(db,1.0,stopper);

        //SymbolList test = null;
        //code here to initialize the test sequence
        Sequence test = 
DNATools.createDNASequence("tacaGAACATGTCTAAGCATGCTGggga", "mySeq");
   
        /*
         * put the test sequence in an array, an array is used because 
for pairwise
         * alignments using an HMM there would need to be two 
SymbolLists in the
         * array
         */
   
        SymbolList[] sla = {(SymbolList)test};
   
        //decode the most likely state path and produce an 'odds' score
        StatePath path = dp.viterbi(sla, ScoreType.ODDS);
        System.out.println("Log Odds = "+path.getScore());

        //print state path
        for(int i = 1; i <= path.length(); i++){
        System.out.println(path.symbolAt(StatePath.STATES, i).getName());
        }
    }
    catch (Exception ex) {
            ex.printStackTrace();
            //System.err.println("symbol is "+symbol);
            //System.err.println("distribution is 
"+StringUtility.distributionToString(emissionDist));
            System.exit(-1);
    }

    }

}

*******************************************************************
My output from running this code above starts here:
*******************************************************************
        Cycle: 1 score: -1105.9598698420707 -Infinity
        Cycle: 2 score: -1000.3026011513825 105.65726869068817
        Cycle: 3 score: NaN NaN
        Cycle: 4 score: NaN NaN
        Cycle: 5 score: NaN NaN
        Cycle: 6 score: NaN NaN
        Cycle: 7 score: NaN NaN
        Cycle: 8 score: NaN NaN
        Cycle: 9 score: NaN NaN
        Cycle: 10 score: NaN NaN
        Cycle: 11 score: NaN NaN
        Cycle: 12 score: NaN NaN
        Cycle: 13 score: NaN NaN
        Cycle: 14 score: NaN NaN
        Cycle: 15 score: NaN NaN
        Cycle: 16 score: NaN NaN
        Cycle: 17 score: NaN NaN
        Cycle: 18 score: NaN NaN
        Cycle: 19 score: NaN NaN
        Cycle: 20 score: NaN NaN
        Cycle: 21 score: NaN NaN
java.lang.NullPointerException
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:650)
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:513)
    at DemoPHMM.letsDoThis(DemoPHMM.java:103)
    at DemoPHMM.main(DemoPHMM.java:33)
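
In the output above the score is already NaN at cycle 3, yet the stopping
criterion lets training run on until cycle 21. A minimal sketch of a
criterion that also stops on a non-finite score or on a very small
improvement follows. So that it runs without BioJava, it is written against
a hypothetical ScoreSource interface that only mirrors the three methods
called in the code above (getCycle, getCurrentScore, getLastScore). The
class names, the threshold and the cycle limit are assumptions made for
illustration, not part of the BioJava API.

```java
/** Hypothetical stand-in exposing only the three methods used in the post. */
interface ScoreSource {
    int getCycle();
    double getCurrentScore();
    double getLastScore();
}

/** Stops on a non-finite score, on convergence, or at a cycle limit. */
class GuardedStopper {
    private final int maxCycles;
    private final double minImprovement;

    GuardedStopper(int maxCycles, double minImprovement) {
        this.maxCycles = maxCycles;
        this.minImprovement = minImprovement;
    }

    boolean isTrainingComplete(ScoreSource ta) {
        double current = ta.getCurrentScore();
        // A NaN or infinite score cannot improve any further; stop at once.
        if (Double.isNaN(current) || Double.isInfinite(current)) {
            return true;
        }
        double last = ta.getLastScore();
        // Only test convergence when the previous score is a real number.
        boolean lastIsFinite = !Double.isNaN(last) && !Double.isInfinite(last);
        if (lastIsFinite && Math.abs(current - last) < minImprovement) {
            return true;
        }
        return ta.getCycle() >= maxCycles;
    }

    /** Replays a series of scores and returns the cycle at which training stops, or -1. */
    static int stopCycle(double[] scores, GuardedStopper stopper) {
        for (int i = 0; i < scores.length; i++) {
            final int cycle = i + 1;
            final double current = scores[i];
            // Assumption: no previous score exists in cycle 1, modelled as +Infinity,
            // which matches the "-Infinity" difference printed for cycle 1 above.
            final double last = (i == 0) ? Double.POSITIVE_INFINITY : scores[i - 1];
            ScoreSource ta = new ScoreSource() {
                public int getCycle() { return cycle; }
                public double getCurrentScore() { return current; }
                public double getLastScore() { return last; }
            };
            if (stopper.isTrainingComplete(ta)) {
                return cycle;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        double[] scores = { -1105.9598698420707, -1000.3026011513825, Double.NaN, Double.NaN };
        System.out.println("Stopped at cycle " + stopCycle(scores, new GuardedStopper(20, 1e-4)));
    }
}
```

With the scores from the run above, this sketch stops at cycle 3 instead of
cycling through 18 more NaN iterations. It does not address why the score
becomes NaN in the first place.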

*******************************************************************
My fasta training sequence file here:
*******************************************************************
 >Funk_Sequence_1
GGACATGCCCGGGCATGTT
 >Funk_Sequence_2
GAACATGCCCGGGCATGTCT
 >Funk_Sequence_3
GGACATGCCCGGGCATGTCG
 >Funk_Sequence_4
GGGCATGCCCGGGCATGTCT
 >Funk_Sequence_5
GAACATGCCCGGGCATGTCC
 >Funk_Sequence_6
AAACATGCCCGGGCATGTTC
 >Funk_Sequence_7
GGACATGCCCGGGCATGTCT
 >Funk_Sequence_8
GGACATGCCCGGGCATGTCG
 >Funk_Sequence_9
AAACATGCCCGGGCATGCCC
 >Funk_Sequence_10
GGGCATGCCCGGGCATGTTC
 >Funk_Sequence_11
AGACATGCCCGGGCATGTCT
 >Funk_Sequence_12
GGACATGCCCGGGCATGTCT
 >Funk_Sequence_13
GGACATGCCCGGGCATGCCC
 >Funk_Sequence_14
GGACATGTCCGGACATGTTC
 >Funk_Sequence_15
GGACATGTCCGGACATGTCT
 >Funk_Sequence_16
AAACATGTCCGGGCATGTCC
 >Funk_Sequence_17
GGACATGTCCGGGCATGTCT

 >ElnDeiry_Sequence_1
GGGCCTGTCACAGCATGCCT
 >ElnDeiry_Sequence_2
CTGCATGTCTAGGCAAGTCA
 >ElnDeiry_Sequence_3
AAACATGCCCAGACTTGTCT
 >ElnDeiry_Sequence_4
AGGCATGCCTTTGCCT
 >ElnDeiry_Sequence_5
GGGCATGTTTAGGCAAGCTT
 >ElnDeiry_Sequence_6
AGACATGTTATAACAAGTCA
 >ElnDeiry_Sequence_7
TGACATGTCCCGACGTGTTT
 >ElnDeiry_Sequence_8
AGGCATGTTCGGGCTGTCT
 >ElnDeiry_Sequence_9
TGACTTGCCTTGACATGTTC
 >ElnDeiry_Sequence_10
CAGCTGCCAAGGCATGCAG
 >ElnDeiry_Sequence_11
CAACTTGTCTGGACATGTTC
 >ElnDeiry_Sequence_12
AGACAAGCCTGGGCAGGTCC
 >ElnDeiry_Sequence_13
AAACAAGCCCGGATGTGCCC
 >ElnDeiry_Sequence_14
ACACTTGTCTATACCTGCCT
 >ElnDeiry_Sequence_15
AAACATGCTTTGACATGTTC
 >ElnDeiry_Sequence_16
GGACTTGCCCTGGCCAGCCC
 >ElnDeiry_Sequence_17
AGGTTTGCCGGGCTTGTTC
 >ElnDeiry_Sequence_18
TGACTTGCCCAGACATGTTT
 >ElnDeiry_Sequence_19
AAGCATGCCTTGACTTGTTC
 >ElnDeiry_Sequence_20
TGCCTTGCCTGGACTTGCCT


mark.schreiber@novartis.com wrote:

>Can you try the code in 
>http://www.biojava.org/docs/bj_in_anger/profileHMM.htm
>
>I have found in the past that you need to set some initial weights before 
>starting the BW trainer. If this example doesn't work please repost to the 
>list.
>
>- Mark
>
>
>
>  
>
From koeberle at mpiib-berlin.mpg.de  Tue Nov 22 04:58:04 2005
From: koeberle at mpiib-berlin.mpg.de (=?ISO-8859-1?Q?Christian_K=F6berle?=)
Date: Tue Nov 22 05:02:50 2005
Subject: [Biojava-l] Annotation
Message-ID: <4382EBAC.8000301@mpiib-berlin.mpg.de>

Hi,

I have a problem with Annotation. If I try to add a new Property to an 
Annotation I get a ChangeVetoException. What can I do?

-- 
Christian Köberle

Max Planck Institute for Infection Biology
Department: Immunology
Schumannstr. 21/22
10117 Berlin

Tel: +49 30 28 460 562
e-mail: koeberle@mpiib-berlin.mpg.de

From koeberle at mpiib-berlin.mpg.de  Tue Nov 22 08:35:02 2005
From: koeberle at mpiib-berlin.mpg.de (=?ISO-8859-1?Q?Christian_K=F6berle?=)
Date: Tue Nov 22 08:33:17 2005
Subject: [Biojava-l] Annotation
In-Reply-To: <4382EBAC.8000301@mpiib-berlin.mpg.de>
References: <4382EBAC.8000301@mpiib-berlin.mpg.de>
Message-ID: <43831E86.2030007@mpiib-berlin.mpg.de>

Christian Köberle wrote:

> Hi,
>
> I have a problem with Annotation. If I try to add a new Property to an 
> Annotation I get a ChangeVetoException. What can I do?
>
The problem is solved:
I had initialized my object with EMPTY_ANNOTATION, which cannot be modified.
Now I use new SimpleAnnotation() instead.
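
The same pattern exists in the Java standard library, which makes it easy
to try without BioJava: a shared empty constant is read-only, while a
freshly constructed object accepts changes. The sketch below uses
Collections.emptyMap() and HashMap purely as an analogy for
EMPTY_ANNOTATION and new SimpleAnnotation(). The standard-library map
throws UnsupportedOperationException where BioJava reports a
ChangeVetoException, and the helper name tryPut is made up for this
illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class EmptyVersusNew {

    /** Returns true if the map accepted the new key/value pair. */
    static boolean tryPut(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return true;
        } catch (UnsupportedOperationException ex) {
            // A shared, immutable empty instance rejects every modification.
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> sharedEmpty = Collections.emptyMap();
        Map<String, String> fresh = new HashMap<String, String>();

        System.out.println("shared empty accepts put: " + tryPut(sharedEmpty, "source", "EMBL"));
        System.out.println("fresh map accepts put:    " + tryPut(fresh, "source", "EMBL"));
    }
}
```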

-- 
Christian Köberle

Max Planck Institute for Infection Biology
Department: Immunology
Schumannstr. 21/22
10117 Berlin

Tel: +49 30 28 460 562
e-mail: koeberle@mpiib-berlin.mpg.de

From toddri at eden.rutgers.edu  Tue Nov 22 16:03:48 2005
From: toddri at eden.rutgers.edu (Todd Riley)
Date: Tue Nov 22 16:01:16 2005
Subject: [Biojava-l] BaumWelchTrainer Broken??!!!  (please help)
In-Reply-To: <43825416.1040909@eden.rutgers.edu>
References: <OF82B46BF9.50800E2E-ON482570C0.002A5349-482570C0.002A70EE@EU.novartis.net>
	<43825416.1040909@eden.rutgers.edu>
Message-ID: <438387B4.5080401@eden.rutgers.edu>

I have heard from at least three other people who have had the same 
problem with the BaumWelchTrainer class.  All three eventually gave up 
and switched to other software in order to perform Baum-Welch EM on 
their models.

There definitely is a problem with the BaumWelchTrainer class.  It's 
either a documentation bug or coding bug.  The demos shipped in the V1.4 
source (demos/dp/PatternFinder.java , demos/dp/SearchProfile.java) don't 
work, and the source code from 
http://www.biojava.org/docs/bj_in_anger/profileHMM.htm doesn't work (it 
crashes).
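
Whatever the root cause turns out to be, the pattern in the output (two
finite scores, then NaN for every later cycle) is how NaN behaves in
floating-point arithmetic in general: once one intermediate value is NaN,
every sum that includes it is NaN as well. The sketch below only
illustrates that general behaviour with plain Java doubles. It is not a
diagnosis of BaumWelchTrainer, and the class and method names are made up
for the example.

```java
class NaNPropagation {

    /** Normalises counts to probabilities; an all-zero row yields 0.0 / 0.0 = NaN. */
    static double[] normalise(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        double[] probs = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            probs[i] = counts[i] / total;
        }
        return probs;
    }

    /** Sums log-probabilities, the way a log-likelihood score is accumulated. */
    static double logScore(double[] probs) {
        double score = 0.0;
        for (double p : probs) {
            score += Math.log(p);
        }
        return score;
    }

    public static void main(String[] args) {
        double[] withCounts = normalise(new double[] { 3.0, 1.0 });
        double[] withoutCounts = normalise(new double[] { 0.0, 0.0 });
        System.out.println("score with counts:    " + logScore(withCounts));
        System.out.println("score without counts: " + logScore(withoutCounts));
    }
}
```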

If someone who has worked with the BaumWelchTrainer object and knows how 
to get it to work could test the code from my original post (taken 
almost entirely from profileHMM.htm above) on the current release (1.4), 
it would be greatly appreciated.

Thanks in advance!
-Todd

From chyte374 at student.otago.ac.nz  Tue Nov 22 23:37:51 2005
From: chyte374 at student.otago.ac.nz (Te-yuan Chyou)
Date: Tue Nov 22 23:45:35 2005
Subject: [Biojava-l] install Biojava
Message-ID: <1132720671.4383f21f4dd66@www.studentmail.otago.ac.nz>

Hi:

I'm trying to install Biojava on Mac OS, following the information in the
"Getting Started" section of the Biojava website, by putting the following
jar files:

biojava.jar
commons-cli.jar
commons-collections-2.1.jar
commons-dbcp-1.1.jar
commons-pool-1.1.jar
bytecode-0.92.jar

into the directory

Macintosh HD/System/Library/Java/Extensions

Then, when I try to compile the demo code "TestEmbl.java", I get 15
errors, and it looks like the compiler cannot find the Biojava packages.

At that time I hadn't set the CLASSPATH, because the website says "It is
also possible to "install" JAR files onto your system by copying them...".

Then I put all the jar files in the directory:
/Users/davidchyou/Desktop/david/
and set the CLASSPATH by typing:

 export CLASSPATH=/Users/davidchyou/Desktop/david/biojava.jar:/Users/davidchyou/Desktop/commons-cli.jar:/Users/davidchyou/Desktop/david/commons-collections-2.1.jar:/Users/davidchyou/Desktop/david/commons-dbcp-1.1.jar:/Users/davidchyou/Desktop/david/commons-pool-1.1.jar:

on a single line, and the command works. BUT when I compile the demos
again afterwards, the compiler still produces the same set of error
messages (package does not exist).

Could anyone tell me what else I have missed? A step-by-step installation
guide would also be helpful.

Thanks

Te-yuan (David) Chyou

The error messages I get are below:

ou003153:~ davidchyou$ javac Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:5: package org.biojava.bio does not exist
import org.biojava.bio.*;
^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:6: package org.biojava.bio.symbol does not exist
import org.biojava.bio.symbol.*;
^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:7: package org.biojava.bio.seq does not exist
import org.biojava.bio.seq.*;
^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:8: package org.biojava.bio.seq.io does not exist
import org.biojava.bio.seq.io.*;
^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:18: cannot resolve symbol
symbol  : class SequenceFormat
location: class seq.TestEmbl
      SequenceFormat eFormat = new EmblLikeFormat();
      ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:18: cannot resolve symbol
symbol  : class EmblLikeFormat
location: class seq.TestEmbl
      SequenceFormat eFormat = new EmblLikeFormat();
                                   ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:21: cannot resolve symbol
symbol  : class SequenceBuilderFactory
location: class seq.TestEmbl
      SequenceBuilderFactory sFact = new EmblProcessor.Factory(SimpleSequenceBuilder.FACTORY);
      ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:21: package EmblProcessor does not exist
      SequenceBuilderFactory sFact = new EmblProcessor.Factory(SimpleSequenceBuilder.FACTORY);
                                                      ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:21: cannot resolve symbol
symbol  : variable SimpleSequenceBuilder
location: class seq.TestEmbl
      SequenceBuilderFactory sFact = new EmblProcessor.Factory(SimpleSequenceBuilder.FACTORY);
                                                               ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:22: cannot resolve symbol
symbol  : class Alphabet
location: class seq.TestEmbl
      Alphabet alpha = DNATools.getDNA();
      ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:22: cannot resolve symbol
symbol  : variable DNATools
location: class seq.TestEmbl
      Alphabet alpha = DNATools.getDNA();
                       ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:23: cannot resolve symbol
symbol  : class SymbolTokenization
location: class seq.TestEmbl
      SymbolTokenization rParser = alpha.getTokenization("token");
      ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:24: cannot resolve symbol
symbol  : class SequenceIterator
location: class seq.TestEmbl
      SequenceIterator seqI =
      ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:25: cannot resolve symbol
symbol  : class StreamReader
location: class seq.TestEmbl
        new StreamReader(eReader, eFormat, rParser, sFact);
            ^
Desktop/Biojava-1.4.1/demos/seq/TestEmbl.java:28: cannot resolve symbol
symbol  : class Sequence
location: class seq.TestEmbl
        Sequence seq = seqI.nextSequence();
        ^
15 errors
ou003153:~ davidchyou$

From hollandr at gis.a-star.edu.sg  Tue Nov 22 23:59:35 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Wed Nov 23 00:10:21 2005
Subject: [Biojava-l] install Biojava
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602656F1C@BIONIC.biopolis.one-north.com>

This is a general Java problem, not a BioJava one. There is no such concept of "installing" a Java library, just the process of adding it to your classpath whenever you need it.

However... It sounds like you are logging out of your shell between each execution, hence it works for your first javac that is run in the same shell as the export command but not when you start another shell and run javac again without doing the export first. The classpath is only valid within your current shell session. If you want it to persist across all shell sessions, you have to add it to your user profile that is run when you start a shell, or to the system profile that is run for everyone when they start a shell. Or, write a wrapper script for javac that sets the classpath each time before calling the real javac.
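
Because the classpath is just a list of paths, one quick way to see what a
given shell session is passing to Java is to print each entry and check
that it exists. The sketch below is a hypothetical helper: the class name
ClasspathCheck and its method are not part of BioJava or the JDK tools. It
reads the standard java.class.path system property, which reflects the
CLASSPATH variable or the -cp option in effect for that JVM. Empty entries
are simply skipped in this sketch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ClasspathCheck {

    /** Returns the entries of the given classpath string that do not exist on disk. */
    static List<String> missingEntries(String classpath, String separator) {
        List<String> missing = new ArrayList<String>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (entry.length() == 0) {
                continue; // skip empty entries
            }
            if (!new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        System.out.println("Classpath seen by this JVM: " + classpath);
        for (String entry : missingEntries(classpath, File.pathSeparator)) {
            System.out.println("Missing: " + entry);
        }
    }
}
```

Compile it and run "java ClasspathCheck" in the same shell session as
javac. Any line starting with "Missing:" names a classpath entry that does
not exist. For instance, in the export line of the original message the
commons-cli.jar entry lacks the david/ directory that the other entries
have, which is the kind of slip this check would flag, although that alone
would not explain the org.biojava errors.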

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199



From mark.schreiber at novartis.com  Wed Nov 23 08:35:30 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Nov 23 08:43:47 2005
Subject: [Biojava-l] BaumWelchTrainer Broken??!!!  (please help)
Message-ID: <OF596DE8CA.D9F022F8-ON482570C2.004A7DF5-482570C2.004AAC32@EU.novartis.net>

Admittedly, it's been a very long time since I tried any of these. I'm 
pretty sure they worked way back then.

Matthew, do you have any insights? These are your babies, right?

- Mark


Todd Riley <toddri@eden.rutgers.edu>
Sent by: biojava-l-bounces@portal.open-bio.org
11/23/2005 05:03 AM

 
        To:     Mark Schreiber/GP/Novartis@PH, biojava-l@biojava.org
        cc: 
        Subject:        Re: [Biojava-l] BaumWelchTrainer Broken??!!!  (please help)


I have received info (from at least 3 other people) that have had the 
same problem with the BaumWelchTrainer class.  All three of these 
individuals eventually gave up and went elsewhere (other software) in 
order to perform Baum Welch EM on their models.

There definitely is a problem with the BaumWelchTrainer class.  It's 
either a documentation bug or coding bug.  The demos shipped in the V1.4 
source (demos/dp/PatternFinder.java , demos/dp/SearchProfile.java) don't 
work, and the source code from 
http://www.biojava.org/docs/bj_in_anger/profileHMM.htm doesn't work (it 
crashes).

If someone, who has worked with and knows how to get  the 
BaumWelchTrainer object to work, can test the following code (taken 
almost entirely from profileHMM.htm above) on the current release (1.4), 
it would be greatly appreciated.

Thanks in advance!
-Todd

Todd Riley wrote:

> *******************************************************************
> My file that contains the code from the demo profileHMM.htm found in 
> "Biojava In Anger" starts here:
> *******************************************************************
>
> /*
> * DemoPHMM.java - Directly from 
> http://www.biojava.org/docs/bj_in_anger/profileHMM.htm
> *
> */
>
> import java.util.*;
> import java.io.BufferedReader;
> import java.io.FileOutputStream;
> import java.io.PrintStream;
> import java.io.FileReader;
> import java.io.IOException;
> import java.util.StringTokenizer;
> import java.io.File;
> import javax.swing.JFrame;
> import java.awt.event.*;
>
> //import biojava.*;
> //import biojava.BaumWelchTrainer;
> //import biojava.TrainingAlgorithm;
> import org.biojava.bio.*;
> import org.biojava.bio.dist.*;
> import org.biojava.bio.dp.*;
> import org.biojava.bio.seq.*;
> import org.biojava.bio.seq.db.*;
> import org.biojava.bio.seq.io.*;
> import org.biojava.bio.symbol.*;
> import org.biojava.utils.*;
>
> public class DemoPHMM {
>
>    public static void main(String[] args) throws IOException {
>    DemoPHMM hmm = new DemoPHMM();
>    hmm.letsDoThis(args);
>    }
>
>
>    public void letsDoThis(String[] args) throws IOException {
>    if (args.length < 1 || args[0].equals("-help") || 
> args[0].equals("-?")) {
>        System.out.println("\n Usage: DemoPHMM 
> <Fasta-Training-Set-File>");
>        System.exit(-1);
>    }
>
>    String trainingSet=args[0];
>
>    try {
>        /*
>         * Make a profile HMM over the DNA Alphabet with 12 'columns' 
> and default
>         * DistributionFactories to construct the transition and emission
>         * Distributions
>         */
>        ProfileHMM hmm = new ProfileHMM(DNATools.getDNA(),
>                        20,
>                        DistributionFactory.DEFAULT,
>                        DistributionFactory.DEFAULT,
>                        "my profilehmm");
>
>        //create the Dynamic Programming matrix for the model.
>        DP dp = DPFactory.DEFAULT.createDP(hmm);
>
>        //Database to hold the training set
>        //SequenceDB db = new HashSequenceDB();
>        //code here to load the training set
>        SequenceDB db = 
> IOUtility.readSequenceDB(trainingSet,DNATools.getDNA());
>
>        //train the model to have uniform parameters
>        ModelTrainer mt = new SimpleModelTrainer();
>        //register the model to train
>        mt.registerModel(hmm);
>        //as no other counts are being used the null weight will cause 
> everything to be uniform
>        mt.setNullModelWeight(1.0);
>        mt.train();
>
>        //create a BW trainer for the dp matrix generated from the HMM
>        BaumWelchTrainer bwt = new BaumWelchTrainer(dp);
>
>        //anonymous implementation of the stopping criteria interface 
> to stop after 20 iterations
>        StoppingCriteria stopper = new StoppingCriteria(){
>            public boolean isTrainingComplete(TrainingAlgorithm ta){
>            System.out.println("\t\tCycle: " + ta.getCycle() + " score: 
> " + ta.getCurrentScore() + " " + (ta.getCurrentScore() - 
> ta.getLastScore()) );
>            return (ta.getCycle() > 20);
>            }
>        };
>          /*
>         * optimize the dp matrix to reflect the training set in db 
> using a null model
>         * weight of 1.0 and the Stopping criteria defined above.
>         */
>        bwt.train(db,1.0,stopper);
>
>        //SymbolList test = null;
>        //code here to initialize the test sequence
>        Sequence test = 
> DNATools.createDNASequence("tacaGAACATGTCTAAGCATGCTGggga", "mySeq");
>          /*
>         * put the test sequence in an array, an array is used because 
> for pairwise
>         * alignments using an HMM there would need to be two 
> SymbolLists in the
>         * array
>         */
>          SymbolList[] sla = {(SymbolList)test};
>          //decode the most likely state path and produce an 'odds' score
>        StatePath path = dp.viterbi(sla, ScoreType.ODDS);
>        System.out.println("Log Odds = "+path.getScore());
>
>        //print state path
>        for(int i = 1; i <= path.length(); i++){
>        System.out.println(path.symbolAt(StatePath.STATES, i).getName());
>        }
>    }
>    catch (Exception ex) {
>            ex.printStackTrace();
>            //System.err.println("symbol is "+symbol);
>            //System.err.println("distribution is "
>            //    +StringUtility.distributionToString(emissionDist));
>            System.exit(-1);
>    }
>
>    }
>
> }
>
> *******************************************************************
> My output from running this code above starts here:
> *******************************************************************
>        Cycle: 1 score: -1105.9598698420707 -Infinity
>        Cycle: 2 score: -1000.3026011513825 105.65726869068817
>        Cycle: 3 score: NaN NaN
>        Cycle: 4 score: NaN NaN
>        Cycle: 5 score: NaN NaN
>        Cycle: 6 score: NaN NaN
>        Cycle: 7 score: NaN NaN
>        Cycle: 8 score: NaN NaN
>        Cycle: 9 score: NaN NaN
>        Cycle: 10 score: NaN NaN
>        Cycle: 11 score: NaN NaN
>        Cycle: 12 score: NaN NaN
>        Cycle: 13 score: NaN NaN
>        Cycle: 14 score: NaN NaN
>        Cycle: 15 score: NaN NaN
>        Cycle: 16 score: NaN NaN
>        Cycle: 17 score: NaN NaN
>        Cycle: 18 score: NaN NaN
>        Cycle: 19 score: NaN NaN
>        Cycle: 20 score: NaN NaN
>        Cycle: 21 score: NaN NaN
> java.lang.NullPointerException
>    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:650)
>    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:513)
>    at DemoPHMM.letsDoThis(DemoPHMM.java:103)
>    at DemoPHMM.main(DemoPHMM.java:33)
>
> *******************************************************************
> My fasta training sequence file starts here:
> *******************************************************************
> >Funk_Sequence_1
> GGACATGCCCGGGCATGTT
> >Funk_Sequence_2
> GAACATGCCCGGGCATGTCT
> >Funk_Sequence_3
> GGACATGCCCGGGCATGTCG
> >Funk_Sequence_4
> GGGCATGCCCGGGCATGTCT
> >Funk_Sequence_5
> GAACATGCCCGGGCATGTCC
> >Funk_Sequence_6
> AAACATGCCCGGGCATGTTC
> >Funk_Sequence_7
> GGACATGCCCGGGCATGTCT
> >Funk_Sequence_8
> GGACATGCCCGGGCATGTCG
> >Funk_Sequence_9
> AAACATGCCCGGGCATGCCC
> >Funk_Sequence_10
> GGGCATGCCCGGGCATGTTC
> >Funk_Sequence_11
> AGACATGCCCGGGCATGTCT
> >Funk_Sequence_12
> GGACATGCCCGGGCATGTCT
> >Funk_Sequence_13
> GGACATGCCCGGGCATGCCC
> >Funk_Sequence_14
> GGACATGTCCGGACATGTTC
> >Funk_Sequence_15
> GGACATGTCCGGACATGTCT
> >Funk_Sequence_16
> AAACATGTCCGGGCATGTCC
> >Funk_Sequence_17
> GGACATGTCCGGGCATGTCT
>
> >ElnDeiry_Sequence_1
> GGGCCTGTCACAGCATGCCT
> >ElnDeiry_Sequence_2
> CTGCATGTCTAGGCAAGTCA
> >ElnDeiry_Sequence_3
> AAACATGCCCAGACTTGTCT
> >ElnDeiry_Sequence_4
> AGGCATGCCTTTGCCT
> >ElnDeiry_Sequence_5
> GGGCATGTTTAGGCAAGCTT
> >ElnDeiry_Sequence_6
> AGACATGTTATAACAAGTCA
> >ElnDeiry_Sequence_7
> TGACATGTCCCGACGTGTTT
> >ElnDeiry_Sequence_8
> AGGCATGTTCGGGCTGTCT
> >ElnDeiry_Sequence_9
> TGACTTGCCTTGACATGTTC
> >ElnDeiry_Sequence_10
> CAGCTGCCAAGGCATGCAG
> >ElnDeiry_Sequence_11
> CAACTTGTCTGGACATGTTC
> >ElnDeiry_Sequence_12
> AGACAAGCCTGGGCAGGTCC
> >ElnDeiry_Sequence_13
> AAACAAGCCCGGATGTGCCC
> >ElnDeiry_Sequence_14
> ACACTTGTCTATACCTGCCT
> >ElnDeiry_Sequence_15
> AAACATGCTTTGACATGTTC
> >ElnDeiry_Sequence_16
> GGACTTGCCCTGGCCAGCCC
> >ElnDeiry_Sequence_17
> AGGTTTGCCGGGCTTGTTC
> >ElnDeiry_Sequence_18
> TGACTTGCCCAGACATGTTT
> >ElnDeiry_Sequence_19
> AAGCATGCCTTGACTTGTTC
> >ElnDeiry_Sequence_20
> TGCCTTGCCTGGACTTGCCT
>
>
> mark.schreiber@novartis.com wrote:
>
>> Can you try the code in 
>> http://www.biojava.org/docs/bj_in_anger/profileHMM.htm
>>
>> I have found in the past that you need to set some intial weights 
>> before starting the BW trainer. If this example doesn't work please 
>> repost to the list.
>>
>> - Mark
>>
>>
>>
>> 
>>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l



From mark.schreiber at novartis.com  Wed Nov 23 08:45:37 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Nov 23 08:43:53 2005
Subject: [Biojava-l] Annotation
Message-ID: <OFACA76EA9.7845508D-ON482570C2.004B6D79-482570C2.004B992C@EU.novartis.net>

OK, that explains it.

EMPTY_ANNOTATION is a placeholder that avoids a null pointer when 
you ask for the Annotation. It will not let you add properties, as it would 
then no longer be EMPTY. Best not to use it unless you don't plan to do 
anything with the Annotation.
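
To make the difference concrete, here is a minimal, self-contained sketch of
the pattern. It does not use BioJava at all: Anno, SimpleAnno, EMPTY and
VetoException are hypothetical stand-ins for Annotation, SimpleAnnotation,
Annotation.EMPTY_ANNOTATION and ChangeVetoException, reduced to the single
behaviour discussed here.

```java
import java.util.HashMap;
import java.util.Map;

public class EmptyAnnotationSketch {

    /** Hypothetical stand-in for ChangeVetoException (not the BioJava class). */
    static class VetoException extends Exception {
        VetoException(String message) { super(message); }
    }

    /** Hypothetical, heavily reduced stand-in for the Annotation interface. */
    interface Anno {
        void setProperty(String key, Object value) throws VetoException;
        boolean containsProperty(String key);
    }

    /** Shared immutable placeholder: every attempt to write is vetoed. */
    static final Anno EMPTY = new Anno() {
        public void setProperty(String key, Object value) throws VetoException {
            throw new VetoException("cannot add '" + key + "' to the empty placeholder");
        }
        public boolean containsProperty(String key) { return false; }
    };

    /** Mutable implementation backed by a map. */
    static class SimpleAnno implements Anno {
        private final Map<String, Object> props = new HashMap<String, Object>();
        public void setProperty(String key, Object value) { props.put(key, value); }
        public boolean containsProperty(String key) { return props.containsKey(key); }
    }

    /** Returns true if the property could be set, false if the change was vetoed. */
    static boolean trySet(Anno anno, String key, Object value) {
        try {
            anno.setProperty(key, value);
            return true;
        } catch (VetoException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("placeholder accepts property: "
                + trySet(EMPTY, "source", "EMBL"));
        System.out.println("fresh object accepts property: "
                + trySet(new SimpleAnno(), "source", "EMBL"));
    }
}
```

The design point is that the placeholder is one shared object, so letting
anyone write to it would leak properties into every holder of the placeholder.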

- Mark


"Christian Köberle" <koeberle@mpiib-berlin.mpg.de>
Sent by: biojava-l-bounces@portal.open-bio.org
11/22/2005 09:35 PM

 
        To:     "Christian Köberle" <koeberle@mpiib-berlin.mpg.de>
        cc:     biojava-l@biojava.org, (bcc: Mark Schreiber/GP/Novartis)
        Subject:        Re: [Biojava-l] Annotation


Christian Köberle wrote:

> Hi,
>
> I have a problem with Annotation. If I try to add a new Property to an 
> Annotation I get a ChangeVetoException. What can I do?
>
The problem is solved:
I had initialized my object with EMPTY_ANNOTATION.
Now I use new SimpleAnnotation().

-- 
Christian Köberle

Max Planck Institute for Infection Biology
Department: Immunology
Schumannstr. 21/22
10117 Berlin

Tel: +49 30 28 460 562
e-mail: koeberle@mpiib-berlin.mpg.de



From mark.schreiber at novartis.com  Wed Nov 23 08:43:14 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Nov 23 08:47:56 2005
Subject: [Biojava-l] Annotation
Message-ID: <OF20EC84EF.57EE4CE7-ON482570C2.004B1FAF-482570C2.004B6146@EU.novartis.net>

Hello -

Some Annotations allow new properties. Some (like 
Annotation.EMPTY_ANNOTATION) do not and will throw the exception you get. 
Alternatively you may have added a ChangeListener which is vetoing your 
change, but it seems unlikely you would do this without meaning to.

Can you determine the type of Annotation you have?

seq.getAnnotation().getClass().getName()

This will help us to figure out where the problem is.
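
The same check works on any reference whose declared type is an interface.
A small sketch using JDK collections as a stand-in (no BioJava involved;
implementationOf is a helper name made up for this example):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class RuntimeTypeSketch {

    /** Returns the name of the concrete class behind an interface-typed reference. */
    static String implementationOf(Object o) {
        return o.getClass().getName();
    }

    public static void main(String[] args) {
        // Both variables have the declared type Map, but different implementations.
        Map<String, String> shared = Collections.emptyMap();        // immutable placeholder
        Map<String, String> fresh = new HashMap<String, String>();  // ordinary mutable map

        System.out.println("shared is a " + implementationOf(shared));
        System.out.println("fresh is a " + implementationOf(fresh));
    }
}
```

If the name printed for the Annotation is not the class you expected to have
constructed, the object was probably initialised somewhere else.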

- Mark


"Christian Köberle" <koeberle@mpiib-berlin.mpg.de>
Sent by: biojava-l-bounces@portal.open-bio.org
11/22/2005 05:58 PM

 
        To:     biojava-l@biojava.org
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] Annotation


Hi,

I have a problem with Annotation. If I try to add a new Property to an 
Annotation I get a ChangeVetoException. What can I do?

-- 
Christian Köberle

Max Planck Institute for Infection Biology
Department: Immunology
Schumannstr. 21/22
10117 Berlin

Tel: +49 30 28 460 562
e-mail: koeberle@mpiib-berlin.mpg.de



From chyte374 at student.otago.ac.nz  Thu Nov 24 05:24:50 2005
From: chyte374 at student.otago.ac.nz (Te-yuan Chyou)
Date: Thu Nov 24 05:22:57 2005
Subject: [Biojava-l] Using the Meme class
Message-ID: <1132827890.438594f26de8a@www.studentmail.otago.ac.nz>

Hi,

I'm trying to use the Meme class in org.biojava.bio.program. I am currently
writing an application that will perform Meme analysis (to find conserved
motifs in DNA). I have written the application so that it reads a
fasta format file containing multiple fasta sequences. It runs fine, but
"public List getSeqIDs()" and "public List getMotifs()" both return
empty lists.

The API documentation of the Meme class does not contain much information
(sorry about the complaint), so I'm not sure how to use the "InputStream is"
parameter of the Meme constructor. Currently I use

InputStream is = new BufferedInputStream(new
FileInputStream("trialSeqs.fa"));

Is it the correct InputStream object?

Example code that performs Meme analysis would also be appreciated
(currently there is none in the tutorial or the cookbook).

Thanks

Te-yuan (David) Chyou

From martin.eklund at farmbio.uu.se  Thu Nov 24 06:21:59 2005
From: martin.eklund at farmbio.uu.se (Martin Eklund)
Date: Thu Nov 24 07:45:20 2005
Subject: [Biojava-l] BaumWelchTrainer Broken??!!! (please help)
In-Reply-To: <1132827890.438594f26de8a@www.studentmail.otago.ac.nz>
References: <1132827890.438594f26de8a@www.studentmail.otago.ac.nz>
Message-ID: <1132831320.7230.13.camel@localhost>

Hi,

I looked to Biojava to train an HMM given a FASTA alignment file and
then use the HMM to generate 'new' sequences similar to the ones in the
FASTA file. However, I run into exactly the same problem as Todd Riley
(quoted below). Also, if I - despite the training problems - try to
generate a sequence from the HMM I get an uncaught ClassCastException
according to:

Exception in thread "main" java.lang.ClassCastException:
org.biojava.bio.symbol.SimpleBasisSymbol
	at org.biojava.bio.dp.DP.generate(DP.java:594)
	at testHMM.main(testHMM.java:95)

Any ideas about how to resolve these issues?

Thank you!

Martin.


==============================================================

Admittedly it's been a very long time since I tried any of these. I'm 
pretty sure they worked way back then.

Matthew, do you have any insights? These are your babies right?

- Mark


Todd Riley <toddri at eden.rutgers.edu>
Sent by: biojava-l-bounces at portal.open-bio.org
11/23/2005 05:03 AM

 
        To:     Mark Schreiber/GP/Novartis at PH, biojava-l at biojava.org
        cc: 
        Subject:        Re: [Biojava-l] BaumWelchTrainer Broken??!!!  (please help)


I have heard from at least 3 other people who have had the 
same problem with the BaumWelchTrainer class.  All three of these 
individuals eventually gave up and went elsewhere (other software) in 
order to perform Baum Welch EM on their models.

There definitely is a problem with the BaumWelchTrainer class.  It's 
either a documentation bug or coding bug.  The demos shipped in the V1.4 
source (demos/dp/PatternFinder.java , demos/dp/SearchProfile.java) don't 
work, and the source code from 
http://www.biojava.org/docs/bj_in_anger/profileHMM.htm doesn't work (it 
crashes).
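
For anyone reproducing this, a stopping criterion that also checks the score
makes the failure visible at the cycle where it first occurs instead of
running the remaining cycles. The sketch below does not fix the underlying
problem and does not use BioJava: TrainingState and Stopper are hypothetical
stand-ins for TrainingAlgorithm and StoppingCriteria, keeping only the method
names used in the demo code, and guarded/snapshot are names made up here.

```java
public class NanGuardSketch {

    /** Stand-in for the getters the demo calls on TrainingAlgorithm. */
    interface TrainingState {
        int getCycle();
        double getCurrentScore();
    }

    /** Stand-in for the one-method StoppingCriteria interface. */
    interface Stopper {
        boolean isTrainingComplete(TrainingState state);
    }

    /** Stops after maxCycles cycles, or at once if the score is NaN or infinite. */
    static Stopper guarded(final int maxCycles) {
        return new Stopper() {
            public boolean isTrainingComplete(TrainingState state) {
                double score = state.getCurrentScore();
                if (Double.isNaN(score) || Double.isInfinite(score)) {
                    System.err.println("Cycle " + state.getCycle()
                            + ": score is " + score + ", stopping early");
                    return true;
                }
                return state.getCycle() > maxCycles;
            }
        };
    }

    /** Builds a fixed snapshot of one training cycle for the demonstration. */
    static TrainingState snapshot(final int cycle, final double score) {
        return new TrainingState() {
            public int getCycle() { return cycle; }
            public double getCurrentScore() { return score; }
        };
    }

    public static void main(String[] args) {
        Stopper stopper = guarded(20);
        // Scores taken from cycles 2 and 3 of the output quoted in this thread.
        System.out.println(stopper.isTrainingComplete(snapshot(2, -1000.3026011513825)));
        System.out.println(stopper.isTrainingComplete(snapshot(3, Double.NaN)));
    }
}
```

With the real interfaces, the same two-line check would go at the top of
isTrainingComplete in the anonymous StoppingCriteria of the demo.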

If someone who knows how to get the BaumWelchTrainer object to work 
could test the following code (taken almost entirely from 
profileHMM.htm above) on the current release (1.4), it would be 
greatly appreciated.

Thanks in advance!
-Todd

Todd Riley wrote:

> *******************************************************************
> My file that contains the code from the demo profileHMM.htm found in 
> "Biojava In Anger" starts here:
> *******************************************************************
>
> /*
> * DemoPHMM.java - Directly from 
> http://www.biojava.org/docs/bj_in_anger/profileHMM.htm
> *
> */
>
> import java.util.*;
> import java.io.BufferedReader;
> import java.io.FileOutputStream;
> import java.io.PrintStream;
> import java.io.FileReader;
> import java.io.IOException;
> import java.util.StringTokenizer;
> import java.io.File;
> import javax.swing.JFrame;
> import java.awt.event.*;
>
> //import biojava.*;
> //import biojava.BaumWelchTrainer;
> //import biojava.TrainingAlgorithm;
> import org.biojava.bio.*;
> import org.biojava.bio.dist.*;
> import org.biojava.bio.dp.*;
> import org.biojava.bio.seq.*;
> import org.biojava.bio.seq.db.*;
> import org.biojava.bio.seq.io.*;
> import org.biojava.bio.symbol.*;
> import org.biojava.utils.*;
>
> public class DemoPHMM {
>
>    public static void main(String[] args) throws IOException {
>    DemoPHMM hmm = new DemoPHMM();
>    hmm.letsDoThis(args);
>    }
>
>
>    public void letsDoThis(String[] args) throws IOException {
>    if (args.length < 1 || args[0].equals("-help") || 
> args[0].equals("-?")) {
>        System.out.println("\n Usage: DemoPHMM <Fasta-Training-Set-File>");
>        System.exit(-1);
>    }
>
>    String trainingSet=args[0];
>
>    try {
>        /*
>         * Make a profile HMM over the DNA Alphabet with 20 'columns'
>         * and default DistributionFactories to construct the transition
>         * and emission Distributions
>         */
>        ProfileHMM hmm = new ProfileHMM(DNATools.getDNA(),
>                        20,
>                        DistributionFactory.DEFAULT,
>                        DistributionFactory.DEFAULT,
>                        "my profilehmm");
>
>        //create the Dynamic Programming matrix for the model.
>        DP dp = DPFactory.DEFAULT.createDP(hmm);
>
>        //Database to hold the training set
>        //SequenceDB db = new HashSequenceDB();
>        //code here to load the training set
>        SequenceDB db = 
> IOUtility.readSequenceDB(trainingSet,DNATools.getDNA());
>
>        //train the model to have uniform parameters
>        ModelTrainer mt = new SimpleModelTrainer();
>        //register the model to train
>        mt.registerModel(hmm);
>        //as no other counts are being used the null weight will cause
>        //everything to be uniform
>        mt.setNullModelWeight(1.0);
>        mt.train();
>
>        //create a BW trainer for the dp matrix generated from the HMM
>        BaumWelchTrainer bwt = new BaumWelchTrainer(dp);
>
>        //anonymous implementation of the stopping criteria interface
>        //to stop after 20 iterations
>        StoppingCriteria stopper = new StoppingCriteria(){
>            public boolean isTrainingComplete(TrainingAlgorithm ta){
>            System.out.println("\t\tCycle: " + ta.getCycle() + " score: "
>                + ta.getCurrentScore() + " "
>                + (ta.getCurrentScore() - ta.getLastScore()) );
>            return (ta.getCycle() > 20);
>            }
>        };
>          /*
>         * optimize the dp matrix to reflect the training set in db 
> using a null model
>         * weight of 1.0 and the Stopping criteria defined above.
>         */
>        bwt.train(db,1.0,stopper);
>
>        //SymbolList test = null;
>        //code here to initialize the test sequence
>        Sequence test = 
> DNATools.createDNASequence("tacaGAACATGTCTAAGCATGCTGggga", "mySeq");
>          /*
>         * put the test sequence in an array, an array is used because 
> for pairwise
>         * alignments using an HMM there would need to be two 
> SymbolLists in the
>         * array
>         */
>          SymbolList[] sla = {(SymbolList)test};
>          //decode the most likely state path and produce an 'odds' score
>        StatePath path = dp.viterbi(sla, ScoreType.ODDS);
>        System.out.println("Log Odds = "+path.getScore());
>
>        //print state path
>        for(int i = 1; i <= path.length(); i++){
>        System.out.println(path.symbolAt(StatePath.STATES, i).getName());
>        }
>    }
>    catch (Exception ex) {
>            ex.printStackTrace();
>            //System.err.println("symbol is "+symbol);
>            //System.err.println("distribution is "
>            //    +StringUtility.distributionToString(emissionDist));
>            System.exit(-1);
>    }
>
>    }
>
> }
>
> *******************************************************************
> My output from running this code above starts here:
> *******************************************************************
>        Cycle: 1 score: -1105.9598698420707 -Infinity
>        Cycle: 2 score: -1000.3026011513825 105.65726869068817
>        Cycle: 3 score: NaN NaN
>        Cycle: 4 score: NaN NaN
>        Cycle: 5 score: NaN NaN
>        Cycle: 6 score: NaN NaN
>        Cycle: 7 score: NaN NaN
>        Cycle: 8 score: NaN NaN
>        Cycle: 9 score: NaN NaN
>        Cycle: 10 score: NaN NaN
>        Cycle: 11 score: NaN NaN
>        Cycle: 12 score: NaN NaN
>        Cycle: 13 score: NaN NaN
>        Cycle: 14 score: NaN NaN
>        Cycle: 15 score: NaN NaN
>        Cycle: 16 score: NaN NaN
>        Cycle: 17 score: NaN NaN
>        Cycle: 18 score: NaN NaN
>        Cycle: 19 score: NaN NaN
>        Cycle: 20 score: NaN NaN
>        Cycle: 21 score: NaN NaN
> java.lang.NullPointerException
>    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:650)
>    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:513)
>    at DemoPHMM.letsDoThis(DemoPHMM.java:103)
>    at DemoPHMM.main(DemoPHMM.java:33)
>
> *******************************************************************
> My fasta training sequence file starts here:
> *******************************************************************
> >Funk_Sequence_1
> GGACATGCCCGGGCATGTT
> >Funk_Sequence_2
> GAACATGCCCGGGCATGTCT
> >Funk_Sequence_3
> GGACATGCCCGGGCATGTCG
> >Funk_Sequence_4
> GGGCATGCCCGGGCATGTCT
> >Funk_Sequence_5
> GAACATGCCCGGGCATGTCC
> >Funk_Sequence_6
> AAACATGCCCGGGCATGTTC
> >Funk_Sequence_7
> GGACATGCCCGGGCATGTCT
> >Funk_Sequence_8
> GGACATGCCCGGGCATGTCG
> >Funk_Sequence_9
> AAACATGCCCGGGCATGCCC
> >Funk_Sequence_10
> GGGCATGCCCGGGCATGTTC
> >Funk_Sequence_11
> AGACATGCCCGGGCATGTCT
> >Funk_Sequence_12
> GGACATGCCCGGGCATGTCT
> >Funk_Sequence_13
> GGACATGCCCGGGCATGCCC
> >Funk_Sequence_14
> GGACATGTCCGGACATGTTC
> >Funk_Sequence_15
> GGACATGTCCGGACATGTCT
> >Funk_Sequence_16
> AAACATGTCCGGGCATGTCC
> >Funk_Sequence_17
> GGACATGTCCGGGCATGTCT
>
> >ElnDeiry_Sequence_1
> GGGCCTGTCACAGCATGCCT
> >ElnDeiry_Sequence_2
> CTGCATGTCTAGGCAAGTCA
> >ElnDeiry_Sequence_3
> AAACATGCCCAGACTTGTCT
> >ElnDeiry_Sequence_4
> AGGCATGCCTTTGCCT
> >ElnDeiry_Sequence_5
> GGGCATGTTTAGGCAAGCTT
> >ElnDeiry_Sequence_6
> AGACATGTTATAACAAGTCA
> >ElnDeiry_Sequence_7
> TGACATGTCCCGACGTGTTT
> >ElnDeiry_Sequence_8
> AGGCATGTTCGGGCTGTCT
> >ElnDeiry_Sequence_9
> TGACTTGCCTTGACATGTTC
> >ElnDeiry_Sequence_10
> CAGCTGCCAAGGCATGCAG
> >ElnDeiry_Sequence_11
> CAACTTGTCTGGACATGTTC
> >ElnDeiry_Sequence_12
> AGACAAGCCTGGGCAGGTCC
> >ElnDeiry_Sequence_13
> AAACAAGCCCGGATGTGCCC
> >ElnDeiry_Sequence_14
> ACACTTGTCTATACCTGCCT
> >ElnDeiry_Sequence_15
> AAACATGCTTTGACATGTTC
> >ElnDeiry_Sequence_16
> GGACTTGCCCTGGCCAGCCC
> >ElnDeiry_Sequence_17
> AGGTTTGCCGGGCTTGTTC
> >ElnDeiry_Sequence_18
> TGACTTGCCCAGACATGTTT
> >ElnDeiry_Sequence_19
> AAGCATGCCTTGACTTGTTC
> >ElnDeiry_Sequence_20
> TGCCTTGCCTGGACTTGCCT
>
>
> mark.schreiber at novartis.com wrote:
>
>> Can you try the code in 
>> http://www.biojava.org/docs/bj_in_anger/profileHMM.htm
>>
>> I have found in the past that you need to set some intial weights 
>> before starting the BW trainer. If this example doesn't work please 
>> repost to the list.
>>
>> - Mark
>>
>>
>>
>> 
>>

-- 
========================================
Martin Eklund
PhD Student
Department of Pharmaceutical Biosciences
Uppsala University, Sweden
Ph: +46-18-4714281
========================================

From dreher at mpiib-berlin.mpg.de  Thu Nov 24 08:51:06 2005
From: dreher at mpiib-berlin.mpg.de (Felix Dreher)
Date: Thu Nov 24 08:56:49 2005
Subject: [Biojava-l] Error loading ontology terms
Message-ID: <4385C54A.4080806@mpiib-berlin.mpg.de>

Hello,

we have a problem with the connection to a BioSQL Database.
When we start our application for the very first time and try to connect 
to the Database, it throws an SQLException "Connection is closed."
After that, when we simply repeat the process, everything works fine and 
one can store Sequences in the DB.
But when we stop the whole project (which is a JavaServerPages Project 
inside the Tomcat-Container) and restart it, a new error message 
appears: "Error loading ontology terms."

Maybe somebody has a hint....

Greetings,
Felix

From russ at kepler-eng.com  Thu Nov 24 09:54:35 2005
From: russ at kepler-eng.com (Russ Kepler)
Date: Thu Nov 24 10:24:58 2005
Subject: [Biojava-l] Issues with FlexibleAlignment
Message-ID: <200511240754.36096.russ@kepler-eng.com>

In working with FlexibleAlignment I've found a problem.  When an overlap is empty the
underlying code in AbstractULAlignment will return nulls for positions not containing 
symbols, and this confuses things later on.  An example program below will demonstrate 
a simple case of the problem (code stolen from TestSimpleAlignment.java and modified).

Is there a symbol that should be returned for this case?  It should be distinct from the 
gap symbol, as the area outside the aligning sequence isn't gap space; it just doesn't exist.
Some comments in the list archive suggest that a space was going to be returned, but
I can't see an attempt to implement that (it's not added to the alphabet, for example), 
and right now a null is returned.


package symbol;

import java.util.ArrayList;
import java.util.Iterator;

import org.biojava.bio.alignment.FlexibleAlignment;
import org.biojava.bio.alignment.SimpleAlignmentElement;
import org.biojava.bio.symbol.RangeLocation;
import org.biojava.bio.symbol.SymbolList;

public class TestFlexibleAlignment {
    public static void main(String[] args) {
        try {
          // make three random sequences
          SymbolList res1 = Tools.createSymbolList(10);
          SymbolList res2 = Tools.createSymbolList(10);
          SymbolList res3 = Tools.createSymbolList(10);

          // think of three names
          String name1 = "pigs";
          String name2 = "dogs";
          String name3 = "cats";

          // create list with reference sequence
          ArrayList list = new ArrayList(1);
          SymbolList refSeq = Tools.createSymbolList(30);
          list.add(new SimpleAlignmentElement("reference", refSeq, new RangeLocation(1, 30)));
          // create the alignment with the reference sequence
          FlexibleAlignment ali = new FlexibleAlignment(list);

          // add the sequences as alignments
          ali.addSequence(new SimpleAlignmentElement(name1, res1, new RangeLocation(1, 10)));
          ali.addSequence(new SimpleAlignmentElement(name2, res2, new RangeLocation(11, 20)));
          ali.addSequence(new SimpleAlignmentElement(name3, res3, new RangeLocation(21, 30)));

          // print out each row in the alignment
          System.out.println("Sequences in alignment");
          for(Iterator i = ali.getLabels().iterator(); i.hasNext(); ) {
            String label = (String) i.next();
            SymbolList rl = ali.symbolListForLabel(label);
            System.out.println(label + ":\t" + rl.seqString());
          }
          System.out.flush();

          // print out each column
          System.out.println("Columns");
          for(int i = 1; i <= ali.length(); i++) {
            System.out.println(i + ":\t" + ali.symbolAt(i).getName());
          }
        } catch (Exception ex) {
          ex.printStackTrace(System.err);
          System.exit(1);
        }
    }
}
From matthew.pocock at ncl.ac.uk  Thu Nov 24 10:29:39 2005
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Thu Nov 24 10:45:44 2005
Subject: [Biojava-l] Error loading ontology terms
In-Reply-To: <4385C54A.4080806@mpiib-berlin.mpg.de>
References: <4385C54A.4080806@mpiib-berlin.mpg.de>
Message-ID: <200511241529.39287.matthew.pocock@ncl.ac.uk>

We had the same problem. We fixed it in a copy of biojava, but my cvs account 
has been suspended. Is there somebody with commit privileges who will roll our 
fix back in?

Matthew

On Thursday 24 November 2005 13:51, Felix Dreher wrote:
> Hello,
>
> we have a problem with the connection to a BioSQL Database.
> When we start our application for the very first time and try to connect
> to the Database, it throws an SQLException "Connection is closed."
> After that, when we simply repeat the process, everything works fine and
> one can store Sequences in the DB.
> But when we stop the whole project (which is a JavaServerPages Project
> inside the Tomcat-Container) and restart it, a new error message
> appears: "Error loading ontology terms."
>
> Maybe somebody has a hint....
>
> Greetings,
> Felix
>
From hollandr at gis.a-star.edu.sg  Thu Nov 24 20:58:42 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Thu Nov 24 20:57:21 2005
Subject: [Biojava-l] Error loading ontology terms
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602657062@BIONIC.biopolis.one-north.com>

It was never removed, at least not that I'm aware of. Are you using
biojava-live or biojava-1.4, or even 1.3? It should work in 1.4 and
biojava-live, or at least that's the theory. If not, then could you post
complete stacktraces for each of the two kinds of error you mention?

Note that in release 1.5, the whole of org.biojava.bio.seq.db.biosql and
large parts of org.biojava.bio.seq.io will be deprecated, having been
superseded by various parts of org.biojavax. See the
docs/docbook/BioJavaX.xml DocBook document in biojava-live and the
JavaDocs for org.biojavax for details.

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199


> -----Original Message-----
> From: biojava-l-bounces@portal.open-bio.org 
> [mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of 
> Matthew Pocock
> Sent: Thursday, November 24, 2005 11:30 PM
> To: biojava-l@biojava.org
> Cc: Felix Dreher
> Subject: Re: [Biojava-l] Error loading ontology terms
> 
> 
> We had the same problem. We fixed it in a copy of biojava, 
> but my cvs account 
> has been suspended. Somebody there with commit privilages 
> that will roll our 
> fix back in?
> 
> Matthew
> 
> On Thursday 24 November 2005 13:51, Felix Dreher wrote:
> > Hello,
> >
> > we have a problem with the connection to a BioSQL Database.
> > When we start our application for the very first time and 
> try to connect
> > to the Database, it throws an SQLException "Connection is closed."
> > After that, when we simply repeat the process, everything 
> works fine and
> > one can store Sequences in the DB.
> > But when we stop the whole project (which is a 
> JavaServerPages Project
> > inside the Tomcat-Container) and restart it, a new error message
> > appears: "Error loading ontology terms."
> >
> > Maybe somebody has a hint....
> >
> > Greetings,
> > Felix
> >

From matthew.pocock at ncl.ac.uk  Fri Nov 25 06:39:12 2005
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Fri Nov 25 06:38:44 2005
Subject: [Biojava-l] Issues with FlexibleAlignment
In-Reply-To: <200511240754.36096.russ@kepler-eng.com>
References: <200511240754.36096.russ@kepler-eng.com>
Message-ID: <200511251139.12461.matthew.pocock@ncl.ac.uk>

Hi,

This should be returning the symbol AlphabetManager.getGap(), which should be 
the same as EMPTY_ALPHABET.getGap(). Could somebody fix this?

Matthew

On Thursday 24 November 2005 14:54, Russ Kepler wrote:
> In working with FlexibleAlignment I've found a problem.  When an overlap is
> empty the underlying code in AbstractULAlignment will return nulls for
> positions not containing symbols, and this confuses things later on.  An
> example program below will demonstrate a simple case of the problem (code
> stolen from TestSimpleAlignment.java and modified).
>
> Is there a symbol that should be returned for this case?  It should be
> distinct from the gap symbol as areas outside the aligning sequence isn't
> gap space, it just doesn't exist. Some comments in the list archive suggest
> that a space was going to be returned, but I can't see an attempt to
> implement that (it's not added to the alphabet, as an example) and right
> now a null is returned.
>
>
> package symbol;
>
> import java.util.ArrayList;
> import java.util.Iterator;
>
> import org.biojava.bio.alignment.FlexibleAlignment;
> import org.biojava.bio.alignment.SimpleAlignmentElement;
> import org.biojava.bio.symbol.RangeLocation;
> import org.biojava.bio.symbol.SymbolList;
>
> public class TestFlexibleAlignment {
>     public static void main(String[] args) {
>         try {
>           // make three random sequences
>           SymbolList res1 = Tools.createSymbolList(10);
>           SymbolList res2 = Tools.createSymbolList(10);
>           SymbolList res3 = Tools.createSymbolList(10);
>
>           // think of three names
>           String name1 = "pigs";
>           String name2 = "dogs";
>           String name3 = "cats";
>
>           // create list with reference sequence
>           ArrayList list = new ArrayList(1);
>           SymbolList refSeq = Tools.createSymbolList(30);
>           list.add(new SimpleAlignmentElement("reference", refSeq, new
> RangeLocation(1, 30))); // create the alignment with the reference sequence
>           FlexibleAlignment ali = new FlexibleAlignment(list);
>
>           // add the sequences as alignments
>           ali.addSequence(new SimpleAlignmentElement(name1, res1, new
> RangeLocation(1, 10))); ali.addSequence(new SimpleAlignmentElement(name2,
> res2, new RangeLocation(11, 20))); ali.addSequence(new
> SimpleAlignmentElement(name3, res3, new RangeLocation(21, 30)));
>
>           // print out each row in the alignment
>           System.out.println("Sequences in alignment");
>           for(Iterator i = ali.getLabels().iterator(); i.hasNext(); ) {
>             String label = (String) i.next();
>             SymbolList rl = ali.symbolListForLabel(label);
>             System.out.println(label + ":\t" + rl.seqString());
>           }
>           System.out.flush();
>
>           // print out each column
>           System.out.println("Columns");
>           for(int i = 1; i <= ali.length(); i++) {
>             System.out.println(i + ":\t" + ali.symbolAt(i).getName());
>           }
>         } catch (Exception ex) {
>           ex.printStackTrace(System.err);
>           System.exit(1);
>         }
>     }
> }
From russ at kepler-eng.com  Fri Nov 25 09:34:01 2005
From: russ at kepler-eng.com (Russ Kepler)
Date: Fri Nov 25 09:32:13 2005
Subject: [Biojava-l] Issues with FlexibleAlignment
In-Reply-To: <200511251139.12461.matthew.pocock@ncl.ac.uk>
References: <200511240754.36096.russ@kepler-eng.com>
	<200511251139.12461.matthew.pocock@ncl.ac.uk>
Message-ID: <200511250734.01497.russ@kepler-eng.com>

On Friday 25 November 2005 04:39 am, Matthew Pocock wrote:

> This should be returning the symbol AlphabetManager.getGap(), which should
> be the same as EMPTY_ALPHABET.getGap(). Could somebody fix this?

I wouldn't think that a gap symbol would be appropriate for the areas outside 
the sequence-to-sequence overlap.  Something that displays as a space 
would be a lot more appropriate, but my lack of experience in the 
Symbol package prevents me from finding it.
From matthew.pocock at ncl.ac.uk  Fri Nov 25 09:57:42 2005
From: matthew.pocock at ncl.ac.uk (Matthew Pocock)
Date: Fri Nov 25 10:10:26 2005
Subject: [Biojava-l] Issues with FlexibleAlignment
In-Reply-To: <200511250734.01497.russ@kepler-eng.com>
References: <200511240754.36096.russ@kepler-eng.com>
	<200511251139.12461.matthew.pocock@ncl.ac.uk>
	<200511250734.01497.russ@kepler-eng.com>
Message-ID: <200511251457.42598.matthew.pocock@ncl.ac.uk>

On Friday 25 November 2005 14:34, Russ Kepler wrote:
> On Friday 25 November 2005 04:39 am, Matthew Pocock wrote:
> > This should be returning the symbol AlphabetManager.getGap(), which
> > should be the same as EMPTY_ALPHABET.getGap(). Could somebody fix this?
>
> I wouldn't think that a gap symbol would be appropriate for the areas
> outside the sequence to sequence overlap.  Something that displays as a
> space or something would be a lot more appropriate, but my lack of
> experience in the Symbol package prevents me from finding it.

The EMPTY_ALPHABET.getGap() symbol is not the same as the gap symbol used 
within a sequence. It represents an unoccupiable position in the sequence. It 
is like the ~ symbol you see in multi-fasta at the beginning and end of 
sequences vs - in the middle of sequences. In an end-user application, you 
could choose to render ~ as a space, or an empty box or whatever. Kalle has 
been fixing the serialization of these guys, and I believe they get tokenized 
correctly to/from ascii.
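
To make the ~ versus - distinction concrete, here is a minimal sketch in plain
Java. It deliberately does not use the BioJava Symbol classes; the class and
method names (AlignmentRowSketch, row, display) are made up for illustration
only. It pads a gapped sequence with ~ outside its range and shows how a viewer
might choose to draw those positions as spaces while keeping internal gaps.

```java
import java.util.Arrays;

/** Illustration only: plain Java, not the BioJava Symbol API. */
class AlignmentRowSketch {
    static final char OUTSIDE = '~'; // position the sequence cannot occupy
    static final char GAP = '-';     // ordinary gap inside the aligned sequence

    /**
     * Places an aligned (possibly gapped) sequence at 1-based column start
     * in a row that is width columns wide, filling the rest with '~'.
     */
    static String row(String aligned, int start, int width) {
        if (start < 1 || start - 1 + aligned.length() > width) {
            throw new IllegalArgumentException("sequence does not fit in the row");
        }
        char[] cells = new char[width];
        Arrays.fill(cells, OUTSIDE);
        aligned.getChars(0, aligned.length(), cells, start - 1);
        return new String(cells);
    }

    /** One possible end-user rendering: '~' becomes a space, '-' is kept. */
    static String display(String row) {
        return row.replace(OUTSIDE, ' ');
    }

    public static void main(String[] args) {
        String r = row("AC" + GAP + "GT", 3, 10);
        System.out.println(r);
        System.out.println("[" + display(r) + "]");
    }
}
```

The point of keeping two characters is that the renderer, not the data model,
decides what an unoccupiable position looks like.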

Matthew

From fpepin at cs.mcgill.ca  Fri Nov 25 11:15:34 2005
From: fpepin at cs.mcgill.ca (Francois Pepin)
Date: Fri Nov 25 11:26:24 2005
Subject: [Biojava-l] Error loading ontology terms
In-Reply-To: <200511241529.39287.matthew.pocock@ncl.ac.uk>
References: <4385C54A.4080806@mpiib-berlin.mpg.de>
	<200511241529.39287.matthew.pocock@ncl.ac.uk>
Message-ID: <1132935334.9053.187.camel@elm.mcb.mcgill.ca>

I've applied the patch in the cvs. It compiles and runs the test but I
don't have a bioSQL database handy to test that particular bug.

Francois

On Thu, 2005-11-24 at 15:29 +0000, Matthew Pocock wrote:
> We had the same problem. We fixed it in a copy of biojava, but my cvs account 
> has been suspended. Is there somebody with commit privileges who will roll our 
> fix back in?
> 
> Matthew
> 
> On Thursday 24 November 2005 13:51, Felix Dreher wrote:
> > Hello,
> >
> > we have a problem with the connection to a BioSQL Database.
> > When we start our application for the very first time and try to connect
> > to the Database, it throws an SQLException "Connection is closed."
> > After that, when we simply repeat the process, everything works fine and
> > one can store Sequences in the DB.
> > But when we stop the whole project (which is a JavaServerPages Project
> > inside the Tomcat-Container) and restart it, a new error message
> > appears: "Error loading ontology terms."
> >
> > Maybe somebody has a hint....
> >
> > Greetings,
> > Felix
> >
> 

From mark.schreiber at novartis.com  Sun Nov 27 01:54:13 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Nov 27 01:52:33 2005
Subject: [Biojava-l] Using the Meme class
Message-ID: <OF460155DA.648EBCC1-ON482570C6.0025DC9B-482570C6.0025EFDB@EU.novartis.net>

I think the input stream should be reading the output file of Meme, not the
fasta file that was given to Meme as input.
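
A minimal sketch of a guard that could sit in front of the Meme constructor.
It is plain Java and makes no BioJava calls; the class name FastaPeek and the
4096-byte limit are my own assumptions. It peeks at the stream, reports
whether it starts like a fasta file (first non-blank character is '>'), and
rewinds, so a stream that passes the check can still be handed on unread.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustration only: detects a fasta file passed where MEME output is expected. */
class FastaPeek {
    private static final int LIMIT = 4096; // how far we are willing to look ahead

    /**
     * Returns true if the first non-whitespace byte is '>' (a fasta header).
     * The stream is reset to where it was, so nothing is consumed.
     */
    static boolean startsLikeFasta(BufferedInputStream in) {
        try {
            in.mark(LIMIT);
            try {
                for (int i = 0; i < LIMIT; i++) {
                    int b = in.read();
                    if (b == -1) {
                        return false;          // empty or whitespace-only input
                    }
                    if (!Character.isWhitespace(b)) {
                        return b == '>';
                    }
                }
                return false;
            } finally {
                in.reset();                    // rewind for the real parser
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BufferedInputStream in = new BufferedInputStream(
                new ByteArrayInputStream(">seq1\nACGT\n".getBytes()));
        if (startsLikeFasta(in)) {
            System.err.println("This looks like fasta input, not MEME output.");
        }
    }
}
```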

- Mark


Te-yuan Chyou <chyte374@student.otago.ac.nz>
Sent by: biojava-l-bounces@portal.open-bio.org
11/24/2005 06:24 PM

 
        To:     biojava-l@biojava.org
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] Using the Meme class


Hi,

I'm trying to use the Meme class in org.biojava.bio.program. I am
writing an application that will perform Meme analysis (to find conserved
motifs in DNA). I have written the application so that it reads a
fasta format file containing multiple fasta sequences. It runs fine but
"public List getSeqIDs()" and "public List getMotifs()" both return
empty lists.

The API documentation of the Meme class does not contain much information
(sorry for complaining), so I'm not sure how to use the "InputStream is"
parameter of the Meme constructor. Currently I use

InputStream is = new BufferedInputStream(new
FileInputStream("trialSeqs.fa"));

Is it the correct InputStream object?

Example code that performs Meme analysis would also be appreciated
(currently it is not in the tutorial or the cookbook).

Thanks

Te-yuan (David) Chyou

_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From ola.spjuth at farmbio.uu.se  Mon Nov 28 04:54:37 2005
From: ola.spjuth at farmbio.uu.se (Ola Spjuth)
Date: Mon Nov 28 05:51:38 2005
Subject: [Biojava-l] Multiple questions
Message-ID: <1133171676.2994.75.camel@zidane>

Hi,

I am investigating the usefulness of BioJava as a backend for sequence
management in Bioclipse (www.bioclipse.net). As a total newbie to
Biojava, I have read the tutorial, BIA examples, glanced at the API,
read my first FASTA-sequence and have come up with a few questions:

1) Is it possible to search the Biojava-l archives without having to
manually browse by month?

2) Is there a wrapper for SequenceIO.fileToBiojava(..) that
automatically detects file formats or is it necessary to distinguish
sequence formats externally, i.e. with different file-extensions? If so,
does anyone know of a complete list of file-extensions that could be
mapped to a format?

3) How robust are the I/O-classes for different formats? The
test-library provided is rather short in my opinion and my first test
broke since there was a space in the wrong position...

4) What are the capabilities for multiple sequence alignment in Biojava?
Is it limited to parse results into Biojava objects (as in BIA) or does
it contain any stable MSA-implementations? Due to BioJavas size it is
not easy to get an overview of the current capabilities and the standard
of different parts.

5) As a novice, has anyone implemented BLAST or CLUSTALW in Java? Any
public web-services running for this?

6) Is there some example-code on how to use DAS (as a client)?

7) How can I submit an RFE?

Sorry for so many questions in one post; I have a lot of catching up to
do and was hoping for some guidance. Some questions have probably already
been answered in earlier posts but I have not been able to search the
archives.

Cheers,

   .../Ola


From russ at kepler-eng.com  Mon Nov 28 11:43:20 2005
From: russ at kepler-eng.com (Russ Kepler)
Date: Mon Nov 28 11:41:08 2005
Subject: [Biojava-l] Multiple alignment display issues
Message-ID: <200511280943.20517.russ@kepler-eng.com>

Continuing my efforts to use the FlexibleAlignment class together with the 
sequence renderers to display alignments, I have created some test code, 
reproduced below.  The sequences all display, but not in their alignment 
range.  Am I missing something simple here, like a symbol renderer that 
handles a range?  Or should someone be rewriting the context before the 
renderer is called?  

(Apologies for the form - I've had to maintain this in a development 
environment to use my modified BioJava ).

import java.awt.*;
import java.util.*;
import java.util.List;

import javax.swing.*;

import org.biojava.bio.alignment.*;
import org.biojava.bio.gui.sequence.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;

public class MultiAlignViewerDemo {
  public static void main(String[] args) {
    //MultiAlignViewer multialignviewer = new MultiAlignViewer();
    try {

      MultiLineRenderer mlr = new MultiLineRenderer();
      SequenceRenderer subSeqRend = new SymbolSequenceRenderer();
      GappedRenderer seqRend = new GappedRenderer(subSeqRend);

      // make three random sequences
      SymbolList res1 = createSymbolList(10);
      SymbolList res2 = createSymbolList(10);
      SymbolList res3 = createSymbolList(10);

      // think of three names
      String name1 = "pigs";
      String name2 = "dogs";
      String name3 = "cats";

      // create list with reference sequence
      ArrayList list = new ArrayList(1);
      SymbolList refSeq = createSymbolList(30);
      list.add(new SimpleAlignmentElement("reference", refSeq,
          new RangeLocation(1, 30)));
      // create the alignment with the reference sequence
      FlexibleAlignment alignment = new FlexibleAlignment(list);

      list = new ArrayList(3);

      // add the sequences as alignments
      alignment.addSequence(new SimpleAlignmentElement(name1, res1,
          new RangeLocation(1, 10)));
      AlignmentRenderer ar1 = new AlignmentRenderer();
      ar1.setLabel("pigs");
      ar1.setRenderer(seqRend);
      mlr.addRenderer(ar1);
      alignment.addSequence(new SimpleAlignmentElement(name2, res2,
          new RangeLocation(11, 20)));
      AlignmentRenderer ar2 = new AlignmentRenderer();
      ar2.setLabel("dogs");
      ar2.setRenderer(seqRend);
      mlr.addRenderer(ar2);
      alignment.addSequence(new SimpleAlignmentElement(name3, res3,
          new RangeLocation(21, 30)));
      AlignmentRenderer ar3 = new AlignmentRenderer();
      ar3.setLabel("cats");
      ar3.setRenderer(seqRend);
      mlr.addRenderer(ar3);

      // add the reference sequence to the render list
      AlignmentRenderer arRef = new AlignmentRenderer();
      arRef.setLabel("reference");
      arRef.setRenderer(seqRend);
      mlr.addRenderer(arRef);
      mlr.addRenderer(new RulerRenderer());

      // we will display this in a window
      JFrame frame = new JFrame("Multialign Viewer Demo");
      frame.setSize(400, 300);

      TranslatedSequencePanel sp = new TranslatedSequencePanel();
      sp.setSequence(alignment);
      sp.setScale(7.0);
      sp.setDirection(TranslatedSequencePanel.HORIZONTAL);
      sp.setRenderer(mlr);
      frame.getContentPane().setLayout(new BorderLayout());
      frame.getContentPane().add(sp, BorderLayout.CENTER);

      frame.setVisible(true);
    } catch (Exception ex) {
      ex.printStackTrace(System.err);
      System.exit(1);
    }
  }
  /**
   * Common stuff that the demos rely on.
   *
   * @author Matthew Pocock
   */

    public static SymbolList createSymbolList(int length)
        throws IllegalSymbolException {
      List l = new ArrayList(length);
      for (int i = 0; i < length; i++) {
        l.add(DNATools.forIndex((int) (4.0 * Math.random())));
      }
      return new SimpleSymbolList(DNATools.getDNA(), l);
    }

}
From ola.spjuth at farmbio.uu.se  Mon Nov 28 12:55:51 2005
From: ola.spjuth at farmbio.uu.se (Ola Spjuth)
Date: Mon Nov 28 13:52:57 2005
Subject: [Biojava-l] DAS questions
Message-ID: <1133200551.2994.262.camel@zidane>

Hi,

Experimenting with the DAS client to retrieve proteins from uniprot I
have a few questions:

URL dbURL = new URL("http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/");
String seqName="Q12345";
DASSequenceDB dasDB = new DASSequenceDB(dbURL);
DASSequence dasSeq = (DASSequence) dasDB.getSequence(seqName);

gives the error:

org.biojava.bio.seq.db.IllegalIDException: Database does not contain
Q12345 as a top-level sequence

but with a browser issuing
http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/sequence?segment=Q12345
yields the result I am looking for.
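
For reference, the browser request above is just the data source URL followed
by the "sequence" command and a "segment" parameter. A minimal sketch of
building it in plain Java follows. The class and method names are made up for
illustration, this is not how the BioJava DAS client composes its requests, and
it assumes the segment ID contains only URL-safe characters.

```java
import java.net.URI;

/** Illustration only: composes a DAS "sequence" request by hand. */
class DasRequestSketch {
    /**
     * Returns dataSourceUrl + "sequence?segment=" + segmentId, adding the
     * trailing slash to the data source URL if it is missing.
     */
    static URI sequenceRequest(String dataSourceUrl, String segmentId) {
        String base = dataSourceUrl.endsWith("/") ? dataSourceUrl : dataSourceUrl + "/";
        // URI.create rejects illegal characters, so a malformed ID fails early.
        return URI.create(base + "sequence?segment=" + segmentId);
    }

    public static void main(String[] args) {
        System.out.println(sequenceRequest(
                "http://www.ebi.ac.uk/das-srv/uniprot/das/aristotle/", "Q12345"));
    }
}
```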

I assume I am not using DASSequenceDB and DASSequence correctly. I have
not found any JUnit tests so I have no reference code. Could someone
please assist?

On a general basis: Is the DAS client in BioJava mature? Can it handle
all the major DAS-servers for proteins/genes?

Thanks,

   .../Ola


From ftdgc1 at uaf.edu  Mon Nov 28 18:08:46 2005
From: ftdgc1 at uaf.edu (Dan Cardin)
Date: Mon Nov 28 18:17:21 2005
Subject: [Biojava-l] edit problems with SimpleGappedSequence
Message-ID: <49411.137.229.51.156.1133219326.squirrel@ftdgc1.email.uaf.edu>

Hello all,

This question refers to a previously posted question, "[Biojava-l] removeGap
problem with SimpleGappedSequence", from Feb 2004. I was curious whether
anyone has a workaround for the problem of editing already gapped
sequences using the SimpleGappedSequence class. I am
hitting a wall here and am just starting to learn the biojava
architecture. Any suggestions would be greatly appreciated. Thank you.
-dc


From mark.schreiber at novartis.com  Mon Nov 28 20:39:25 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Nov 28 20:37:27 2005
Subject: [Biojava-l] Multiple questions
Message-ID: <OF9391904E.098C41B1-ON482570C8.00052756-482570C8.000919F7@EU.novartis.net>

>I am investigating the usefulness of BioJava as a backend for sequence
>management in Bioclipse (www.bioclipse.net). As a total newbie to
>Biojava, I have read the tutorial, BIA examples, glanced at the API,
>read my first FASTA-sequence and have come up with a few questions:
>
>1) Is it possible to search the Biojava-l archives without having to
>manually browse by month?

This page is a search index for all of the open-bio hosted websites:
http://search.open-bio.org/cgi-bin/obf-search.cgi

This page is just for searching mailing list archives:
http://search.open-bio.org/cgi-bin/mail-search.cgi

>2) Is there a wrapper for SequenceIO.fileToBiojava(..) that
>automatically detects file formats or is it necessary to distinguish
>sequence formats externally, i.e. with different file-extensions? If so,
>does anyone know of a complete list of file-extensions that could be
>mapped to a format?

There used to be, but it is deprecated as there is no foolproof way to 
guess formats. I would suggest you adopt your own conventions for 
Bioclipse. There are no standard file extensions either; however, I try to 
use .gb for GenBank, .fna for Fasta DNA, .faa for Fasta amino acid, .fra for 
Fasta RNA etc. Again, for Bioclipse you could define your own 
expectations.
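
As a rough sketch of such a convention (plain Java, nothing from BioJava; the
class name and the format labels are invented for illustration), the four
extensions I mentioned could be mapped like this:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustration only: a project-local extension convention, not a standard. */
class ExtensionConvention {
    private static final Map<String, String> FORMAT_BY_EXTENSION = new HashMap<>();
    static {
        FORMAT_BY_EXTENSION.put("gb", "genbank");
        FORMAT_BY_EXTENSION.put("fna", "fasta-dna");
        FORMAT_BY_EXTENSION.put("faa", "fasta-protein");
        FORMAT_BY_EXTENSION.put("fra", "fasta-rna");
    }

    /** Returns the format label for a file name, or null if the extension is unknown. */
    static String formatFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return FORMAT_BY_EXTENSION.get(ext);
    }

    public static void main(String[] args) {
        System.out.println(formatFor("plasmid.gb"));
        System.out.println(formatFor("notes.txt"));
    }
}
```

Returning null for anything unknown lets the application ask the user rather
than guess.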

>3) How robust are the I/O-classes for different formats? The
>test-library provided is rather short in my opinion and my first test
>broke since there was a space in the wrong position...

In my opinion they are poor. The newer parsers in org.biojavax (available 
only in CVS at this stage) are much better and have survived some stress 
testing. Robustness is an issue: sometimes a file that claims to be in one 
format doesn't really follow the conventions, so technically it isn't in that 
format. Where these are found we try to allow them if we can. I have found 
a few GenBank files that don't really seem to follow GenBank's own 
conventions.

>4) What are the capabilities for multiple sequence alignment in Biojava?
>Is it limited to parse results into Biojava objects (as in BIA) or does
>it contain any stable MSA-implementations? Due to BioJavas size it is
>not easy to get an overview of the current capabilities and the standard
>of different parts.

Biojava cannot do anything other than pairwise alignments although there 
is no reason why you could not implement an algorithm.

>5) As a novice, has anyone implemented BLAST or CLUSTALW in Java? Any
>public web-services running for this?

Not to my knowledge. There are some similar programs in Java but not those 
ones. There are blast webservices at NCBI (e-utils) and probably EBI. Not 
sure about ClustalW

>6) Is there some example-code on how to use DAS (as a client)?

take a look at http://www.biojava.org/dazzle/

>7) How can I submit an RFE?

No formal procedure, just post to the list. Even better, code it yourself 
and post it to the list :)


- Mark


From mark.schreiber at novartis.com  Mon Nov 28 20:41:46 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Nov 28 20:39:38 2005
Subject: [Biojava-l] edit problems with SimpleGappedSequence
Message-ID: <OF2574CFD3.E966303F-ON482570C8.00093A55-482570C8.0009515A@EU.novartis.net>

Hi -

Could you post some example code that shows the problem, and the version of 
biojava you are using? Sorry for the tardy responses, everyone seems to be 
very busy this year.

- Mark


"Dan Cardin" <ftdgc1@uaf.edu>
Sent by: biojava-l-bounces@portal.open-bio.org
11/29/2005 07:08 AM

 
        To:     biojava-l@biojava.org
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] edit problems with SimpleGappedSequence


Hello all,

This question refers to a previous posted question  [Biojava-l] removeGap
problem with SimpleGappedSequence back in feb 2004. I was curious if
anyone had a workaround to the problem associated with trying to edit
already gapped sequences using the SimpleGappedSequence class. I am
hitting a wall here and am just starting to learn the biojava
architecture. Any suggestions would be greatly appreciated. Thank you

-dc


_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From hollandr at gis.a-star.edu.sg  Mon Nov 28 20:52:09 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Mon Nov 28 20:50:35 2005
Subject: [Biojava-l] Multiple questions
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602894B07@BIONIC.biopolis.one-north.com>

Hello Ola,

> 1) Is it possible to search the Biojava-l archives without having to
> manually browse by month?

Not that I know of. Google is a pretty good alternative as it has
indexed all the archive - just add +"biojava-l" to the beginning of your
query.
 
> 2) Is there a wrapper for SequenceIO.fileToBiojava(..) that
> automatically detects file formats or is it necessary to distinguish
> sequence formats externally, i.e. with different 
> file-extensions? If so,
> does anyone know of a complete list of file-extensions that could be
> mapped to a format?

Yes, there is, but it's not very good and will soon be deprecated.
Programs really should know what they are expecting to read and complain
if they don't find it.
 
> 3) How robust are the I/O-classes for different formats? The
> test-library provided is rather short in my opinion and my first test
> broke since there was a space in the wrong position...

Depends on the format. In the case of GenBank and EMBL, they are very
stringent about indent sizes because the parsers rely on indents to
identify sections of the files. Different size indents mean different
things. However elsewhere extra spaces should not be a problem. I
haven't really tried that hard yet to break them so don't know exactly.

Incidentally the org.biojavax packages provide a whole new set of
parsers and an updated object model for storing the information that has
been loaded. See the DocBook-format documentation under
"docs/docbook/BioJavaX.xml" in the biojava-live CVS repository. The
comments above refer to these new parsers. The old (1.4) ones really
should not be relied upon for detailed work.
 
> 4) What are the capabilities for multiple sequence alignment in Biojava?
> Is it limited to parse results into Biojava objects (as in BIA) or does
> it contain any stable MSA-implementations? Due to BioJavas size it is
> not easy to get an overview of the current capabilities and 
> the standard of different parts.

It can only read alignments. As yet it cannot generate any itself.

> 5) As a novice, has anyone implemented BLAST or CLUSTALW in Java? Any
> public web-services running for this?

Not that I know of. Blast works very well as an external process fired
off by Java and with standard output piped directly back into Java. I
can't see the reason why you'd want to re-implement it unless some
significant change to the algorithm made it necessary. ClustalW
likewise.
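
A minimal sketch of that pattern using only java.lang.ProcessBuilder. The
class and method names are made up, and the command list is left entirely to
the caller: I am not assuming any particular BLAST executable name or flags
here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustration only: run an external program and read its output in Java. */
class ExternalTool {
    /** Runs the command, returns its output lines, and fails on a non-zero exit code. */
    static List<String> run(List<String> command) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true) // merge stderr so neither pipe can fill up unread
                    .start();
            List<String> lines = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line); // a real caller would feed these to a parser instead
                }
            }
            int exit = process.waitFor();
            if (exit != 0) {
                throw new IllegalStateException("command exited with code " + exit);
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }
}
```

The output is drained before waitFor() is called; doing it the other way round
can hang if the program writes more than the pipe buffer holds.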
 
> 6) Is there some example-code on how to use DAS (as a client)?

Don't know.
 
> 7) How can I submit an RFE?

For now, send it to this list.
 
Thanks for the interest anyhow - sorry I couldn't answer all your
questions at once! I can definitely recommend investigating the newer
bits of BioJava though (BioJavaX), although they won't officially be
part of it until the 1.5 release, and no date has been set for that yet.

cheers,
Richard

From hollandr at gis.a-star.edu.sg  Mon Nov 28 21:04:19 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Mon Nov 28 21:02:47 2005
Subject: [Biojava-l] DAS questions
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602894B0A@BIONIC.biopolis.one-north.com>

If it's any help, you can see what IDs the DAS source is providing by
doing this:

	// construct your DASSequenceDB (dasDB) object as you did before
	Set ids = dasDB.ids();

	// Does the set contain Q12345?
	if (ids.contains("Q12345")) {
		System.out.println("Yes, it does!");
	} else {
		System.out.println("No, it doesn't!");
	}

	// Iterate through the set and print them all out
	for (Iterator i = ids.iterator(); i.hasNext(); ) {
		System.out.println((String)i.next());
	}

Hopefully you will be able to find out what's going on by investigating
the contents of this set.

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199



From ap3 at sanger.ac.uk  Tue Nov 29 04:35:47 2005
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Tue Nov 29 05:01:13 2005
Subject: [Biojava-l] DAS questions
In-Reply-To: <1133200551.2994.262.camel@zidane>
References: <1133200551.2994.262.camel@zidane>
Message-ID: <540fe1fe6bc19a67bc6189722f5a52f1@sanger.ac.uk>

Hi Ola!

you might also want to have a look at how the SPICE - browser does the  
das communication.

http://www.efamily.org.uk/software/dasclients/spice/devel_dasrequests.shtml

Cheers,
Andreas


-----------------------------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                               Hinxton, Cambridge CB10 1SA, UK
			 +44 (0) 1223 49 6891

From kalle.naslund at genpat.uu.se  Tue Nov 29 08:34:06 2005
From: kalle.naslund at genpat.uu.se (Kalle Näslund)
Date: Tue Nov 29 08:50:17 2005
Subject: [Biojava-l] Multiple questions
In-Reply-To: <1133171676.2994.75.camel@zidane>
References: <1133171676.2994.75.camel@zidane>
Message-ID: <438C58CE.80704@genpat.uu.se>

Ola Spjuth wrote:

>Hi,
>
>I am investigating the usefulness of BioJava as a backend for sequence
>management in Bioclipse (www.bioclipse.net). As a total newbie to
>Biojava, I have read the tutorial, BIA examples, glanced at the API,
>read my first FASTA-sequence and have come up with a few questions:
>
>1) Is it possible to search the Biojava-l archives without having to
>manually browse by month?
>
>2) Is there a wrapper for SequenceIO.fileToBiojava(..) that
>automatically detects file formats or is it necessary to distinguish
>sequence formats externally, i.e. with different file-extensions? If so,
>does anyone know of a complete list of file-extensions that could be
>mapped to a format?
>  
>
There is a deprecated piece of code available that quite a lot of people
actually still use in their code. Even though trying to auto-guess the file
format might not be the greatest idea, it is the desirable thing to do in
many cases. If I just look at people in my lab, they want to open the file;
they don't want to keep track of what file format that particular sequence
was in, and so on.

So, even if file format guessing is bad, people are going to write it, and
IMHO it's better to have one centralised, good, known-to-work file guesser
than several different implementations that differ in each person's own
application.

So, my suggestion is to start by using the deprecated version that's in
biojava; if it gets removed you can easily just copy that small part of the
code into your own application, or keep it as a small external jar file.
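
To give an idea of how small such a guesser can be, here is a sketch in plain
Java. It is not the deprecated BioJava code; the class, enum and method names
are invented, and it only looks at how the first line of the file begins.

```java
/** Illustration only: guess a flat-file format from its first line. */
class FormatGuessSketch {
    enum Format { FASTA, GENBANK, EMBL_LIKE, UNKNOWN }

    static Format guess(String firstLine) {
        if (firstLine == null) {
            return Format.UNKNOWN;
        }
        if (firstLine.startsWith(">")) {
            return Format.FASTA;        // fasta header line
        }
        if (firstLine.startsWith("LOCUS")) {
            return Format.GENBANK;      // GenBank entries open with a LOCUS line
        }
        if (firstLine.startsWith("ID   ")) {
            // EMBL and Swiss-Prot entries both open with an ID line, so this
            // check alone cannot tell them apart.
            return Format.EMBL_LIKE;
        }
        return Format.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(guess(">seq1 some description"));
        System.out.println(guess("hello"));
    }
}
```

The EMBL_LIKE case shows why guessing can never be completely reliable, and
why returning UNKNOWN is better than picking something at random.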

>3) How robust are the I/O-classes for different formats? The
>test-library provided is rather short in my opinion and my first test
>broke since there was a space in the wrong position...
>
>4) What are the capabilities for multiple sequence alignment in Biojava?
>Is it limited to parse results into Biojava objects (as in BIA) or does
>it contain any stable MSA-implementations? Due to BioJavas size it is
>not easy to get an overview of the current capabilities and the standard
>of different parts.
>  
>

There is some support for multiple alignments in biojava. The Alignment
interface and its implementations happily handle multiple alignments, and
you can choose how to interpret one: either as a SymbolList over a
cross-product alphabet, or as individual sequences accessible by some label.

There is a basic framework for handling multiple alignment formats in the
org.biojava.bio.seq.io package. It currently only implements two formats,
FASTA and MSF. Most programs seem to be able to write multiple alignment
output in either FASTA or MSF format, so you should be able to get the
results into biojava.
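
The two ways of looking at an alignment can be sketched in plain Java as
follows. This is only an illustration with made-up names (TwoViewsSketch,
rowFor, column) and plain strings; it is not the BioJava Alignment interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: one alignment, read by row (label) or by column. */
class TwoViewsSketch {
    private final Map<String, String> rows = new LinkedHashMap<>();

    /** Adds a gapped row; all rows must have the same length. */
    void add(String label, String gapped) {
        if (!rows.isEmpty() && gapped.length() != length()) {
            throw new IllegalArgumentException("row length mismatch");
        }
        rows.put(label, gapped);
    }

    /** Number of columns in the alignment. */
    int length() {
        return rows.isEmpty() ? 0 : rows.values().iterator().next().length();
    }

    /** Row view: the individual sequence stored under a label. */
    String rowFor(String label) {
        return rows.get(label);
    }

    /** Column view: one character per row at a 1-based position, in insertion order. */
    String column(int position) {
        StringBuilder sb = new StringBuilder();
        for (String row : rows.values()) {
            sb.append(row.charAt(position - 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TwoViewsSketch ali = new TwoViewsSketch();
        ali.add("pigs", "AC-T");
        ali.add("dogs", "A-GT");
        System.out.println(ali.rowFor("dogs"));
        System.out.println(ali.column(2));
    }
}
```

In the column view each position is a tuple with one entry per sequence, which
is the idea behind treating the alignment as a list over a cross-product
alphabet.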

>5) As a novice, has anyone implemented BLAST or CLUSTALW in Java? Any
>public web-services running for this?
>
>  
>
I have been told by greater deities that implementing BLAST in Java is
hard, because the BLAST algorithm makes heavy use of low-level data
structures, pointers(?) and similar things that are very hard to implement
and control in Java. So the resulting implementation would most likely run
pretty darn slow, and not do what you want.

Depending on what you want to do with BLAST, the biojava SSAHA
implementation might be something you can use instead (it works pretty well
on quite conserved sequences, but it's not really suited to more divergent
sequences).

When it comes to web services I just know of a few things. I have not used
any of these to a large extent, so I can't comment on how well they work for
large sequences, big jobs and so on.

http://www.ebi.ac.uk/Tools/webservices/services.html
http://xml.ddbj.nig.ac.jp/wsdl/index.jsp

Sadly they all use their own data encoding and service invocation setup, so
they are pretty darn annoying to use.


>6) Is there some example-code on how to use DAS (as a client)?
>
>7) How can I submit an RFE?
>
>Sorry for so many questions in one post; I have a lot of catching up to
>do and was hoping for some guidance. Some answers have probably already
>been answered in earlier posts but I have not been able to search the
>archives.
>
>Cheers,
>
>   .../Ola
>
>
>
>
>_______________________________________________
>Biojava-l mailing list  -  Biojava-l@biojava.org
>http://biojava.org/mailman/listinfo/biojava-l
>  
>

From ola.spjuth at farmbio.uu.se  Tue Nov 29 08:54:51 2005
From: ola.spjuth at farmbio.uu.se (Ola Spjuth)
Date: Tue Nov 29 09:09:35 2005
Subject: [Biojava-l] Multiple questions
In-Reply-To: <438C58CE.80704@genpat.uu.se>
References: <1133171676.2994.75.camel@zidane>  <438C58CE.80704@genpat.uu.se>
Message-ID: <1133272491.2844.43.camel@zidane>

Thanks for all the comments, it should get me started.
I do not intend to write a file-format guesser (out of my scope) but I
would like to make an RFE for it here and now. I think many people would
benefit from it.

Cheers,

   .../Ola


-- 
Ola Spjuth <ola.spjuth@farmbio.uu.se>


From dreher at mpiib-berlin.mpg.de  Tue Nov 29 11:37:12 2005
From: dreher at mpiib-berlin.mpg.de (Felix Dreher)
Date: Tue Nov 29 11:36:02 2005
Subject: [Biojava-l] "error loading ontology terms"
Message-ID: <438C83B8.20101@mpiib-berlin.mpg.de>

Hello,
I'm referring to the BioSQL-Database-Problem, which I described last 
week. I had a problem with the exception "error loading ontology terms" 
when trying to add a Sequence to the DB. At that time I used Biojava1.4.
As I was advised, I installed the latest CVS-update today and now the 
database-update works fine.
Greetings,
Felix


-- 
Felix Dreher
Max-Planck-Institute for Infection Biology
Campus Charité Mitte
Department of Immunology
Mailing address: Schumannstraße 21/22
Visitors: Virchowweg 12
10117 Berlin
Germany
Tel.: +49 (0)30 28460-254 / -494
Mobile: +49 (0)163 7542426

From mark.schreiber at novartis.com  Tue Nov 29 20:15:24 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Nov 29 20:20:03 2005
Subject: [Biojava-l] Multiple questions
Message-ID: <OF644526F3.CFD1EFE4-ON482570C9.0006BD77-482570C9.0006E757@EU.novartis.net>

Regarding the format-guessing function: it was deprecated because it cannot 
be guaranteed to work. However, deprecation might be a bit extreme, 
especially if many people use it. I would propose that we undeprecate it 
and just document a warning saying it may not work. Any objections?

- Mark


Kalle Näslund <kalle.naslund@genpat.uu.se>
Sent by: biojava-l-bounces@portal.open-bio.org
11/29/2005 09:34 PM

 
        To:     Ola Spjuth <ola.spjuth@farmbio.uu.se>
        cc:     biojava-l@biojava.org, (bcc: Mark Schreiber/GP/Novartis)
        Subject:        Re: [Biojava-l] Multiple questions


Ola Spjuth wrote:

>Hi,
>
>I am investigating the usefulness of BioJava as a backend for sequence
>management in Bioclipse (www.bioclipse.net). As a total newbie to
>Biojava, I have read the tutorial, BIA examples, glanced at the API,
>read my first FASTA-sequence and have come up with a few questions:
>
>1) Is it possible to search the Biojava-l archives without having to
>manually browse by month?
>
>2) Is there a wrapper for SequenceIO.fileToBiojava(..) that
>automatically detects file formats or is it necessary to distinguish
>sequence formats externally, i.e. with different file-extensions? If so,
>does anyone know of a complete list of file-extensions that could be
>mapped to a format?
> 
>
There is a deprecated piece of code available that quite a few people
actually still use in their code. Even though auto-guessing the file format
might not be the greatest thing to do, it is the desirable thing in many cases.
If I just look at people in my lab, they want to open the file; they don't
want to keep track of what file format that particular sequence was in, and so on.

So, even if file format guessing is bad, people are going to write it, and imho
it's better to have one centralised, good, known-to-work file guesser than several
different implementations that differ in each person's own application.

So, my suggestion is to start by using the deprecated version that's in biojava.
If it gets removed, you can easily just copy that small part of the code into
your own application, or into a small external jar file.
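
To illustrate what such a small, copyable guesser might look like, here is a minimal sketch that looks only at the first non-blank line of a file. It is not the deprecated BioJava code: the class name FormatSniffer and the method guessFormat are hypothetical, and the only assumptions are the usual leading markers, ">" for FASTA, "LOCUS" for GenBank and "ID" followed by three spaces for EMBL.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

/** Hypothetical helper (not part of BioJava): guesses a sequence format from one line. */
class FormatSniffer {

    /** Returns "fasta", "genbank", "embl" or "unknown" for the first non-blank line of a file. */
    static String guessFormat(String firstLine) {
        if (firstLine == null) {
            return "unknown"; // empty input
        }
        if (firstLine.startsWith(">")) {
            return "fasta";
        }
        if (firstLine.startsWith("LOCUS")) {
            return "genbank";
        }
        if (firstLine.startsWith("ID   ")) {
            return "embl"; // assumption: Swiss-Prot entries start the same way
        }
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new StringReader("\n>seq1 test\nACGT\n"));
        String line;
        // Skip leading blank lines, then sniff the first line that has content.
        while ((line = in.readLine()) != null && line.trim().length() == 0) {
            // nothing to do, keep reading
        }
        System.out.println(guessFormat(line));
    }
}
```

Because a Swiss-Prot entry also begins with an ID line, a guesser of this kind cannot be guaranteed to be right, which is the same concern that led to the deprecation.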

>3) How robust are the I/O-classes for different formats? The
>test-library provided is rather short in my opinion and my first test
>broke since there was a space in the wrong position...
>
>4) What are the capabilities for multiple sequence alignment in Biojava?
>Is it limited to parse results into Biojava objects (as in BIA) or does
>it contain any stable MSA-implementations? Due to BioJavas size it is
>not easy to get an overview of the current capabilities and the standard
>of different parts.
> 
>

There is some support for multiple alignments in biojava. The Alignment
interface and its implementations happily handle multiple alignments, and you
can choose how to interpret an alignment: either as a SymbolList over a
cross-product alphabet, or as individual sequences accessible by some label.

There is a basic framework for handling multiple alignment formats in the
biojava org.biojava.bio.seq.io package. It currently implements only two
formats, FASTA and MSF. Most programs seem to be able to write multiple
alignment output in either FASTA or MSF format, so you should be able to get
the results into biojava.

>5) As a novice, has anyone implemented BLAST or CLUSTALW in Java? Any
>public web-services running for this?
>
> 
>
I have been told by greater deities that implementing BLAST in java is hard,
because the blast algorithm makes heavy use of low-level data structures,
pointers, and similar things that are very hard to implement and control in
java. So the resulting implementation would most likely run pretty darn slow,
and not do what you want.

Depending on what you want to do with BLAST, the biojava SSAHA implementation
might be something you can use instead (it works pretty ok on quite conserved
sequences, but it's not really suited to more divergent sequences).

When it comes to web services I just know of a few things. I have not used any
of these to a large extent, so I can't comment on how well they work for large
sequences, big jobs and so on.

http://www.ebi.ac.uk/Tools/webservices/services.html
http://xml.ddbj.nig.ac.jp/wsdl/index.jsp

Sadly they all use their own data encoding and service invocation setup, so
they are pretty darn annoying to use.


>6) Is there some example-code on how to use DAS (as a client)?
>
>7) How can I submit an RFE?
>
>Sorry for so many questions in one post; I have a lot of catching up to
>do and was hoping for some guidance. Some answers have probably already
>been answered in earlier posts but I have not been able to search the
>archives.
>
>Cheers,
>
>   .../Ola
>
>
>
>
>_______________________________________________
>Biojava-l mailing list  -  Biojava-l@biojava.org
>http://biojava.org/mailman/listinfo/biojava-l
> 
>

_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From hollandr at gis.a-star.edu.sg  Wed Nov 30 00:48:50 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Wed Nov 30 00:47:23 2005
Subject: [Biojava-l] Multiple questions
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602894BF6@BIONIC.biopolis.one-north.com>

Hi there,

Seeing as people think it'd be quite useful, I've added some format-guessing functionality to BioJavaX (although I haven't touched the old one).

Here's how you would use it:

	// Load the classes that represent the selection of formats your program expects to receive
	Class.forName("org.biojavax.bio.seq.io.EMBLFormat");
	Class.forName("org.biojavax.bio.seq.io.GenbankFormat");
	Class.forName("org.biojavax.bio.seq.io.FastaFormat");

	// Obtain the default BioJavaX namespace.
	Namespace ns = RichObjectFactory.getDefaultNamespace();

	// Find the file
	File file = new File("myfile.seq");

	// Read the file (indicating that you want to load sequences into the default namespace).
	// BioJavaX will guess the format based only on the selection of format classes that have
	// previously been loaded either using Class.forName above or by instantiating them elsewhere.
	RichSequenceIterator seqs = RichSequence.IOTools.readFile(file, ns);

	// NB. If you do know the format in advance, you don't need to load the class first, and instead
	// you should just use one of the predefined methods in RichSequence.IOTools, eg.:
	// 	BufferedReader br = new BufferedReader(new FileReader(file));
	// 	RichSequenceIterator seqs = RichSequence.IOTools.readFastaDNA(br, ns);

	// Iterate over the sequences
	while (seqs.hasNext()) {
		RichSequence rs = seqs.nextRichSequence();
		// ... Do something with it here ...
	}

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199
---------------------------------------------
This email is confidential and may be privileged. If you are not the intended recipient, please delete it and notify us immediately. Please do not copy or use it for any purpose, or disclose its content to any other person. Thank you.
---------------------------------------------


> -----Original Message-----
> From: biojava-l-bounces@portal.open-bio.org 
> [mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of Ola Spjuth
> Sent: Tuesday, November 29, 2005 9:55 PM
> To: Kalle Näslund
> Cc: biojava-l@biojava.org
> Subject: Re: [Biojava-l] Multiple questions
> 
> 
> Thanks for all the comments, it should get me started.
> I do not intend to write a file-format guesser (out of my scope) but I
> would like to make an RFE for it here and now. I think many 
> people would
> benefit from it.
> 
> Cheers,
> 
>    .../Ola
> 
> 
> -- 
> Ola Spjuth <ola.spjuth@farmbio.uu.se>
> 
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 

From hotafin at gmail.com  Tue Nov 29 11:25:25 2005
From: hotafin at gmail.com (Tamas Horvath)
Date: Wed Nov 30 01:18:04 2005
Subject: [Biojava-l] sturcture.io
Message-ID: <c343d7080511290825j45dde49ei1e7e9a679e039c0b@mail.gmail.com>

I've got an ArrayList<String> object containing a PDB file's information. How
may I feed it to the structure parser? As far as I could see, it only accepts
a BufferedReader or an InputStream...

From hollandr at gis.a-star.edu.sg  Wed Nov 30 01:25:44 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Wed Nov 30 01:24:06 2005
Subject: [Biojava-l] sturcture.io
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602894C02@BIONIC.biopolis.one-north.com>

Combine into a single String separated with newline characters, then see
java.io.StringReader - this provides a Reader, which you can then wrap
in a BufferedReader.
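
A minimal sketch of that suggestion, using only the standard library. The class and method names (LinesToReader, joinLines, toReader) are made up for illustration, and the two sample lines are placeholders rather than a real PDB entry; the resulting reader would then be handed to whichever parser method accepts a BufferedReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns a list of lines into a BufferedReader. */
class LinesToReader {

    /** Joins the lines into one String, terminating each line with a newline. */
    static String joinLines(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    /** Wraps the joined text in a StringReader, then in a BufferedReader. */
    static BufferedReader toReader(List<String> lines) {
        return new BufferedReader(new StringReader(joinLines(lines)));
    }

    public static void main(String[] args) throws IOException {
        List<String> pdbLines = new ArrayList<String>();
        pdbLines.add("HEADER    PLACEHOLDER ENTRY");
        pdbLines.add("END");

        BufferedReader reader = toReader(pdbLines);
        // Here the reader is only echoed; in practice it would go to the parser.
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
    }
}
```

If the parser wants an InputStream instead, the same joined String can be converted to bytes and wrapped in a java.io.ByteArrayInputStream.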

Richard Holland
Bioinformatics Specialist
GIS extension 8199
---------------------------------------------
This email is confidential and may be privileged. If you are not the
intended recipient, please delete it and notify us immediately. Please
do not copy or use it for any purpose, or disclose its content to any
other person. Thank you.
---------------------------------------------


> -----Original Message-----
> From: biojava-l-bounces@portal.open-bio.org 
> [mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of 
> Tamas Horvath
> Sent: Wednesday, November 30, 2005 12:25 AM
> To: biojava-l@biojava.org
> Subject: [Biojava-l] sturcture.io
> 
> 
> I've got an ArrayList<String> object containing a PDB file's 
> information. How may I feed it to the structure parser? As far 
> as I could see, it only accepts a BufferedReader or an InputStream...
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 

From ap3 at sanger.ac.uk  Wed Nov 30 04:51:20 2005
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Wed Nov 30 04:48:06 2005
Subject: [Biojava-l] sturcture.io
In-Reply-To: <c343d7080511290825j45dde49ei1e7e9a679e039c0b@mail.gmail.com>
References: <c343d7080511290825j45dde49ei1e7e9a679e039c0b@mail.gmail.com>
Message-ID: <6e5a5ad25a37d12836f4e0059fac4c74@sanger.ac.uk>

Hi Tamas,

In case you are working with a PDB file that is located
somewhere on  your hard disk, you could use the code from below
to parse it in.

Cheers,
Andreas


String filename = "path/to/pdbfile.ent";

PDBFileReader pdbreader = new PDBFileReader();

try {
    Structure struc = pdbreader.getStructure(filename);
    System.out.println(struc);
} catch (Exception e) {
    e.printStackTrace();
}


On 29 Nov 2005, at 16:25, Tamas Horvath wrote:

> I've got an ArrayList<String> object containing a PDB file's 
> information. How may I feed it to the structure parser? As far as I 
> could see, it only accepts a BufferedReader or an InputStream...


> ----------------------------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                               Hinxton, Cambridge CB10 1SA, UK
			 +44 (0) 1223 49 6891

From k.parveen at gmail.com  Wed Nov 30 05:34:58 2005
From: k.parveen at gmail.com (Parveen k)
Date: Wed Nov 30 06:38:50 2005
Subject: [Biojava-l] help on blast
Message-ID: <1373ba70511300234t39c86bf0h6235564233437374@mail.gmail.com>

Hi
   I'm pretty new to bioinformatics. I have to incorporate BLAST in my
applet, so that when the client enters a sequence, it performs a BLAST
search against the database we have and returns the result. Can anyone
guide me in this regard?

--
Regards
Parveen K

YOU MAY SAY I AM A DREAMER, BUT  I AM NOT THE ONLY ONE.
I HOPE SOMEDAY YOU WILL JOIN US, AND THE WORLD WILL FOLLOW US.
                                  - JOHN LENNON

From hotafin at gmail.com  Wed Nov 30 08:47:49 2005
From: hotafin at gmail.com (Tamas Horvath)
Date: Wed Nov 30 08:45:39 2005
Subject: [Biojava-l] sturcture.io
In-Reply-To: <6e5a5ad25a37d12836f4e0059fac4c74@sanger.ac.uk>
References: <c343d7080511290825j45dde49ei1e7e9a679e039c0b@mail.gmail.com>
	<6e5a5ad25a37d12836f4e0059fac4c74@sanger.ac.uk>
Message-ID: <c343d7080511300547s69f16936vb6d2993bc4971dd3@mail.gmail.com>

In my case, I had a bunch of pdb files in a jar archive, so here's the code
I try to use:

   public static InputStream get_pdbinputstream(String fileName, String pdb_id) {
       System.out.println(fileName);
       InputStream returnstream = null;
       JarFile jarFile = null;
       try {
           jarFile = new JarFile(fileName);
           for (Enumeration e = jarFile.entries(); e.hasMoreElements();) {
               JarEntry jarEntry = (JarEntry) e.nextElement();
               if (jarEntry.isDirectory()) continue;
               String pdbid = jarEntry.getName();
               pdbid = pdbid.substring(0, 4).toUpperCase();
               System.out.println(pdbid);
               if (pdbid.intern() != pdb_id) continue;
               returnstream = jarFile.getInputStream(jarEntry);
           }
       } catch (IOException ioe) {
           System.out.println("An IOException occurred: " + ioe.getMessage());
       } finally {
           if (jarFile != null) {
               try { jarFile.close(); } catch (IOException ioe) {}
           }
       }
       return returnstream;
   }

   public static Object parse_pdb(String pdb_datafile, String pdb_id) {
       Structure structure = null;
       pdb_inputstream = Utils.get_pdbinputstream(pdb_datafile, pdb_id);
       if (pdb_inputstream != null)
           structure = org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(pdb_inputstream);
       return structure;
   }

The first function is supposed to get the desired InputStream from the
specified archive file, while the second is supposed to generate the structure.
However:

"Utils.java": non-static method parsePDBFile(java.io.InputStream) cannot be
referenced from a static context at line 1132, column 92

as it kindly says...

Is there a solution to this problem?

On 11/30/05, Andreas Prlic <ap3@sanger.ac.uk> wrote:
>
> Hi Tamas,
>
> In case you are working with a PDB file that is located
> somewhere on your hard disk, you could use the code from below
> to parse it in.
>
> Cheers,
> Andreas
>
>
> String filename = "path/to/pdbfile.ent";
>
> PDBFileReader pdbreader = new PDBFileReader();
>
> try {
>     Structure struc = pdbreader.getStructure(filename);
>     System.out.println(struc);
> } catch (Exception e) {
>     e.printStackTrace();
> }
>
>
> On 29 Nov 2005, at 16:25, Tamas Horvath wrote:
>
> > I've got an ArrayList<String> object containing a PDB file's
> > information. How may I feed it to the structure parser? As far as I
> > could see, it only accepts a BufferedReader or an InputStream...
>
>
> ----------------------------------------------------------------------
>
> Andreas Prlic      Wellcome Trust Sanger Institute
>                                Hinxton, Cambridge CB10 1SA, UK
>                          +44 (0) 1223 49 6891
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
>

From hotafin at gmail.com  Wed Nov 30 09:32:11 2005
From: hotafin at gmail.com (Tamas Horvath)
Date: Wed Nov 30 09:31:48 2005
Subject: [Biojava-l] sturcture.io
In-Reply-To: <c343d7080511300547s69f16936vb6d2993bc4971dd3@mail.gmail.com>
References: <c343d7080511290825j45dde49ei1e7e9a679e039c0b@mail.gmail.com>
	<6e5a5ad25a37d12836f4e0059fac4c74@sanger.ac.uk>
	<c343d7080511300547s69f16936vb6d2993bc4971dd3@mail.gmail.com>
Message-ID: <c343d7080511300632i60effd67tb2387bb2584761c2@mail.gmail.com>

Oh, I forgot I should instantiate a PDBFileParser, so the following code
works fine:

public static Object parse_pdb(String pdb_datafile, String pdb_id) {
    Structure structure = null;
    pdb_inputstream = Utils.get_pdbinputstream(pdb_datafile, pdb_id);
    PDBFileParser pdbfileparser = new PDBFileParser();
    try {
        if (pdb_inputstream != null)
            structure = pdbfileparser.parsePDBFile(pdb_inputstream);
    } catch (IOException ioe) {
        System.out.println(ioe);
    }
    return structure;
}

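
One caveat about the get_pdbinputstream helper from the previous message: according to the java.util.zip.ZipFile documentation, closing a JarFile also closes every stream obtained from its getInputStream method, so a stream returned after the finally block may no longer be readable. Comparing strings with != also works only when both are interned. The sketch below is one possible way around both points, assuming the entries are small enough to hold in memory: it copies the matching entry into a byte array before the jar is closed and compares the ID with equalsIgnoreCase. The class name JarEntryLoader and the method openEntry are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: loads one jar entry fully into memory. */
class JarEntryLoader {

    /**
     * Returns an in-memory stream over the first entry whose name starts with
     * the given four-character ID (case-insensitive), or null if none matches.
     */
    static InputStream openEntry(String jarPath, String pdbId) throws IOException {
        JarFile jar = new JarFile(jarPath);
        try {
            for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements();) {
                JarEntry entry = e.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || name.length() < 4) continue;
                if (!name.substring(0, 4).equalsIgnoreCase(pdbId)) continue;

                // Copy the entry while the jar is still open.
                InputStream in = jar.getInputStream(entry);
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return new ByteArrayInputStream(buffer.toByteArray());
            }
            return null;
        } finally {
            jar.close(); // safe: the returned stream no longer depends on the jar
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: JarEntryLoader <archive.jar> <pdb id>");
            return;
        }
        InputStream in = openEntry(args[0], args[1]);
        System.out.println(in == null ? "no matching entry" : in.available() + " bytes loaded");
    }
}
```

The returned stream can be passed to the parser in the same way as before, since it is an ordinary InputStream.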
On 11/30/05, Tamas Horvath <hotafin@gmail.com> wrote:
>
> In my case, I had a bunch of pdb files in a jar archive, so here's the
> code I try to use:
>
>    public static InputStream get_pdbinputstream(String fileName, String pdb_id) {
>        System.out.println(fileName);
>        InputStream returnstream = null;
>        JarFile jarFile = null;
>        try {
>            jarFile = new JarFile(fileName);
>            for (Enumeration e = jarFile.entries(); e.hasMoreElements();) {
>                JarEntry jarEntry = (JarEntry) e.nextElement();
>                if (jarEntry.isDirectory()) continue;
>                String pdbid = jarEntry.getName();
>                pdbid = pdbid.substring(0, 4).toUpperCase();
>                System.out.println(pdbid);
>                if (pdbid.intern() != pdb_id) continue;
>                returnstream = jarFile.getInputStream(jarEntry);
>            }
>        } catch (IOException ioe) {
>            System.out.println("An IOException occurred: " + ioe.getMessage());
>        } finally {
>            if (jarFile != null) {
>                try { jarFile.close(); } catch (IOException ioe) {}
>            }
>        }
>        return returnstream;
>    }
>
>    public static Object parse_pdb(String pdb_datafile, String pdb_id) {
>        Structure structure = null;
>        pdb_inputstream = Utils.get_pdbinputstream(pdb_datafile, pdb_id);
>        if (pdb_inputstream != null)
>            structure = org.biojava.bio.structure.io.PDBFileParser.parsePDBFile(pdb_inputstream);
>        return structure;
>    }
>
> The first function is supposed to get the desired InputStream from the
> specified archive file, while the second is supposed to generate the structure.
> However:
>
> "Utils.java": non-static method parsePDBFile(java.io.InputStream) cannot
> be referenced from a static context at line 1132, column 92
>
> as it kindly says...
>
> Is there a solution to this problem?
>
>
> On 11/30/05, Andreas Prlic <ap3@sanger.ac.uk> wrote:
> >
> > Hi Tamas,
> >
> > In case you are working with a PDB file that is located
> > somewhere on your hard disk, you could use the code from below
> > to parse it in.
> >
> > Cheers,
> > Andreas
> >
> >
> > String filename = "path/to/pdbfile.ent";
> >
> > PDBFileReader pdbreader = new PDBFileReader();
> >
> > try {
> >     Structure struc = pdbreader.getStructure(filename);
> >     System.out.println(struc);
> > } catch (Exception e) {
> >     e.printStackTrace();
> > }
> >
> >
> > On 29 Nov 2005, at 16:25, Tamas Horvath wrote:
> >
> > > I've got an ArrayList<String> object containing a PDB file's
> > > information. How may I feed it to the structure parser? As far as I
> > > could see, it only accepts a BufferedReader or an InputStream...
> >
> >
> > ----------------------------------------------------------------------
> >
> > Andreas Prlic      Wellcome Trust Sanger Institute
> >                                Hinxton, Cambridge CB10 1SA, UK
> >                          +44 (0) 1223 49 6891
> >
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> >

From dreher at mpiib-berlin.mpg.de  Wed Nov 30 10:06:54 2005
From: dreher at mpiib-berlin.mpg.de (Felix Dreher)
Date: Wed Nov 30 10:06:02 2005
Subject: [Biojava-l] Problem with downloading Genbank-sequence
Message-ID: <438DC00E.2080301@mpiib-berlin.mpg.de>

Hello,

I used to download DNA-sequences from Genbank with the 
'getSequence'-method from the class 'GenbankSequenceDB' (Biojava 1.4). 
After I installed the latest cvs-version, it seems to be not working 
anymore.

This is where I call the method:


public class GenbankDownload {
    static private GenbankSequenceDB gbDb = new GenbankSequenceDB();
    static public Sequence loadGenBankSequence(String id) {
        try {
            return gbDb.getSequence(id);
        } catch (Exception e) {
            return null;
        }
    }
}


This is the exception report:


java.lang.ExceptionInInitializerError

    org.biojava.bio.seq.FeatureFilter.<clinit>(FeatureFilter.java:1813)
    org.biojava.bio.seq.impl.SimpleSequence.getFeatureHolder(SimpleSequence.java:144)
    org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:224)
    org.biojava.bio.seq.io.SequenceBuilderBase.makeSequence(SequenceBuilderBase.java:175)
    org.biojava.bio.seq.io.SmartSequenceBuilder.makeSequence(SmartSequenceBuilder.java:103)
    org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:99)
    org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:102)
    org.biojava.bio.seq.db.GenbankSequenceDB.getSequence(GenbankSequenceDB.java:130)
    rnai.GenbankDownload.loadGenBankSequence(GenbankDownload.java:23)
    rnai.seq_input2.prerender(seq_input2.java:312)
    com.sun.web.ui.appbase.faces.ViewHandlerImpl.prerender(ViewHandlerImpl.java:788)
    com.sun.web.ui.appbase.faces.ViewHandlerImpl.renderView(ViewHandlerImpl.java:282)
    com.sun.faces.lifecycle.RenderResponsePhase.execute(RenderResponsePhase.java:87)
    com.sun.faces.lifecycle.LifecycleImpl.phase(LifecycleImpl.java:221)
    com.sun.faces.lifecycle.LifecycleImpl.render(LifecycleImpl.java:117)
    javax.faces.webapp.FacesServlet.service(FacesServlet.java:198)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:585)
    org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:249)
    java.security.AccessController.doPrivileged(Native Method)
    javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
    org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:282)
    org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:165)
    java.security.AccessController.doPrivileged(Native Method)
    com.sun.web.ui.util.UploadFilter.doFilter(UploadFilter.java:179)


Any help would be appreciated!
Thank you,

Felix


-- 
Felix Dreher
Max-Planck-Institute for Infection Biology
Campus Charité Mitte
Department of Immunology
Mailing address: Schumannstraße 21/22
Visitors: Virchowweg 12
10117 Berlin
Germany
Tel.: +49 (0)30 28460-254 / -494
Mobile: +49 (0)163 7542426

From td2 at sanger.ac.uk  Wed Nov 30 10:33:30 2005
From: td2 at sanger.ac.uk (Thomas Down)
Date: Wed Nov 30 10:54:59 2005
Subject: [Biojava-l] Problem with downloading Genbank-sequence
In-Reply-To: <438DC00E.2080301@mpiib-berlin.mpg.de>
References: <438DC00E.2080301@mpiib-berlin.mpg.de>
Message-ID: <F2C17091-29D2-4E31-9C48-EC540FC617AF@sanger.ac.uk>

Hi Felix,

This doesn't look like a problem specific to GenbankSequenceDB
itself, but a problem with initialising some core BioJava machinery.
From the exception you report, it looks like the static initializer
for the FeatureFilter.OnlyChildren class is failing. I assume the
real problem is occurring somewhere in WalkerFactory, but the exact
problem isn't being reported in your exception.

What version of Java are you using?

Do you have an up-to-date bytecode.jar somewhere where it will be
picked up? (I notice you're using JavaServer Faces. I'm afraid I don't
have any experience with this. Is there anyone on the list who does?)

What happens if you put a line like:

        WalkerFactory.getInstance().addTypeWithParent(FeatureFilter.OnlyChildren.class);

somewhere in your code?  I'd expect this to fail, too, but hopefully
it will give a more informative exception.
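
As general background on why such a report can be uninformative: an ExceptionInInitializerError wraps whatever was thrown inside a static initializer, and the original exception is available through getCause(). The sketch below uses a hypothetical class Broken as a stand-in for a class that fails to initialise; it does not use WalkerFactory or any other BioJava class, and the names InitFailureDemo, rootCause and describeInitFailure are made up for illustration.

```java
/** Hypothetical demo: recovering the real cause of a failed static initializer. */
class InitFailureDemo {

    /** Stand-in for a class whose static initializer throws. */
    static class Broken {
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("missing dependency");
        }
    }

    /** Walks the cause chain and returns the deepest throwable. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Touches Broken and reports the underlying exception if initialisation fails. */
    static String describeInitFailure() {
        try {
            return "loaded: " + Broken.VALUE;
        } catch (ExceptionInInitializerError e) {
            return rootCause(e).toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeInitFailure());
    }
}
```

Only the first failed use of a class throws ExceptionInInitializerError; later uses of the same class typically fail with NoClassDefFoundError, so in a long-running server the earliest stack trace in the log is usually the one worth reading.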

              Thomas.

On 30 Nov 2005, at 15:06, Felix Dreher wrote:

> Hello,
>
> I used to download DNA-sequences from Genbank with the  
> 'getSequence'-method from the class 'GenbankSequenceDB' (Biojava  
> 1.4). After I installed the latest cvs-version, it seems to be not  
> working anymore.
>
> This is where I call the method:
>
>
> public class GenbankDownload {
>    static private GenbankSequenceDB gbDb= new GenbankSequenceDB();
>    static public Sequence loadGenBankSequence(String id){
>        try{
>            return gbDb.getSequence(id);
>        }catch(Exception e){
>            return null;
>        }
>    }
>
>
> This is the exception report:
>
>
> java.lang.ExceptionInInitializerError
>
>    org.biojava.bio.seq.FeatureFilter.<clinit>(FeatureFilter.java:1813)
>    org.biojava.bio.seq.impl.SimpleSequence.getFeatureHolder 
> (SimpleSequence.java:144)
>    org.biojava.bio.seq.impl.SimpleSequence.createFeature 
> (SimpleSequence.java:224)
>    org.biojava.bio.seq.io.SequenceBuilderBase.makeSequence 
> (SequenceBuilderBase.java:175)
>    org.biojava.bio.seq.io.SmartSequenceBuilder.makeSequence 
> (SmartSequenceBuilder.java:103)
>    org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence 
> (SequenceBuilderFilter.java:99)
>    org.biojava.bio.seq.io.StreamReader.nextSequence 
> (StreamReader.java:102)
>    org.biojava.bio.seq.db.GenbankSequenceDB.getSequence 
> (GenbankSequenceDB.java:130)
>    rnai.GenbankDownload.loadGenBankSequence(GenbankDownload.java:23)
>    rnai.seq_input2.prerender(seq_input2.java:312)
>    com.sun.web.ui.appbase.faces.ViewHandlerImpl.prerender 
> (ViewHandlerImpl.java:788)
>    com.sun.web.ui.appbase.faces.ViewHandlerImpl.renderView 
> (ViewHandlerImpl.java:282)
>    com.sun.faces.lifecycle.RenderResponsePhase.execute 
> (RenderResponsePhase.java:87)
>    com.sun.faces.lifecycle.LifecycleImpl.phase(LifecycleImpl.java:221)
>    com.sun.faces.lifecycle.LifecycleImpl.render(LifecycleImpl.java: 
> 117)
>    javax.faces.webapp.FacesServlet.service(FacesServlet.java:198)
>    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    sun.reflect.NativeMethodAccessorImpl.invoke 
> (NativeMethodAccessorImpl.java:39)
>    sun.reflect.DelegatingMethodAccessorImpl.invoke 
> (DelegatingMethodAccessorImpl.java:25)
>    java.lang.reflect.Method.invoke(Method.java:585)
>    org.apache.catalina.security.SecurityUtil$1.run 
> (SecurityUtil.java:249)
>    java.security.AccessController.doPrivileged(Native Method)
>    javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
>    org.apache.catalina.security.SecurityUtil.execute 
> (SecurityUtil.java:282)
>    org.apache.catalina.security.SecurityUtil.doAsPrivilege 
> (SecurityUtil.java:165)
>    java.security.AccessController.doPrivileged(Native Method)
>    com.sun.web.ui.util.UploadFilter.doFilter(UploadFilter.java:179)
>
>
> Any help would be appreciated!
> Thank you,
>
> Felix
>
>
> -- 
> Felix Dreher
> Max-Planck-Institute for Infection Biology
> Campus Charité Mitte
> Department of Immunology
> Mailing address: Schumannstraße 21/22
> Visitors: Virchowweg 12
> 10117 Berlin
> Germany
> Tel.: +49 (0)30 28460-254 / -494
> Mobile: +49 (0)163 7542426
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l


From fpepin at cs.mcgill.ca  Wed Nov 30 10:44:29 2005
From: fpepin at cs.mcgill.ca (Francois Pepin)
Date: Wed Nov 30 11:13:54 2005
Subject: [Biojava-l] help on blast
In-Reply-To: <1373ba70511300234t39c86bf0h6235564233437374@mail.gmail.com>
References: <1373ba70511300234t39c86bf0h6235564233437374@mail.gmail.com>
Message-ID: <1133365469.9053.270.camel@elm.mcb.mcgill.ca>

Hi Parveen,

This might not be as easy as you might like.

The applet runs on the client, so you need the applet to communicate
remotely to the server to send the sequence. Then the easiest way would
be for the server to call blast on the command-line with the sequence
(which is pretty easy), parse the result and send it back to the client
applet.

I think RMI could do this, but I've never had to play with it.

Does anyone have a better way to do this?

Francois

On Wed, 2005-11-30 at 16:04 +0530, Parveen k wrote:
> Hi
>    I'm pretty new to bioinformatics.i have to incorparate balst in my
> applet.so that when the client enters the sequence ,it should perform the
> blast search against the database we have and return the result.can anyone
> guide me in this regard.
> 
> --
> Regards
> Parveen K
> 
> YOU MAY SAY I AM A DREAMER, BUT  I AM NOT THE ONLY ONE.
> I HOPE SOMEDAY YOU WILL JOIN US, AND THE WORLD WILL FOLLOW US.
>                                   - JOHN LENNON
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 

From dreher at mpiib-berlin.mpg.de  Wed Nov 30 11:20:13 2005
From: dreher at mpiib-berlin.mpg.de (Felix Dreher)
Date: Wed Nov 30 11:19:05 2005
Subject: [Biojava-l] Problem with downloading Genbank-sequence
In-Reply-To: <F2C17091-29D2-4E31-9C48-EC540FC617AF@sanger.ac.uk>
References: <438DC00E.2080301@mpiib-berlin.mpg.de>
	<F2C17091-29D2-4E31-9C48-EC540FC617AF@sanger.ac.uk>
Message-ID: <438DD13D.5080107@mpiib-berlin.mpg.de>

Hi Thomas,

I use "Java Studio Creator2 EarlyAccess2" as my IDE, which runs on
Java 1.5.
I also have an up-to-date bytecode.jar (it's in the CVS library I
installed a few days ago).
When I ran the application with the additional line you sent, I got
another exception. I hope this one is more useful:

java.lang.ExceptionInInitializerError

    org.biojava.bio.seq.FeatureFilter.<clinit>(FeatureFilter.java:1813)
     java.lang.Class.forName0(Native Method)
     java.lang.Class.forName(Class.java:164)
    org.biojava.utils.walker.WalkerFactory.class$(WalkerFactory.java:40)
    org.biojava.utils.walker.WalkerFactory.getInstance(WalkerFactory.java:40)

    rnai.seq_input2.prerender(seq_input2.java:283)
    com.sun.web.ui.appbase.faces.ViewHandlerImpl.prerender(ViewHandlerImpl.java:788)
    com.sun.web.ui.appbase.faces.ViewHandlerImpl.renderView(ViewHandlerImpl.java:282)
    com.sun.faces.lifecycle.RenderResponsePhase.execute(RenderResponsePhase.java:87)
    com.sun.faces.lifecycle.LifecycleImpl.phase(LifecycleImpl.java:221)
    com.sun.faces.lifecycle.LifecycleImpl.render(LifecycleImpl.java:117)
    javax.faces.webapp.FacesServlet.service(FacesServlet.java:198)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    java.lang.reflect.Method.invoke(Method.java:585)
    org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:249)
    java.security.AccessController.doPrivileged(Native Method)
    javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
    org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:282)
    org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:165)
    java.security.AccessController.doPrivileged(Native Method)
    com.sun.web.ui.util.UploadFilter.doFilter(UploadFilter.java:179)


Greetings,
Felix



-- 
Felix Dreher
Max-Planck-Institute for Infection Biology
Campus Charité Mitte
Department of Immunology
Mailing address: Schumannstraße 21/22
Visitors: Virchowweg 12
10117 Berlin
Germany
Tel.: +49 (0)30 28460-254 / -494
Mobile: +49 (0)163 7542426

From hotafin at gmail.com  Wed Nov 30 11:21:23 2005
From: hotafin at gmail.com (Tamas Horvath)
Date: Wed Nov 30 11:19:08 2005
Subject: [Biojava-l] modify structure
Message-ID: <c343d7080511300821y7258c6bcs86a1deb8c6affa56@mail.gmail.com>

Is there any way to modify a protein structure by modifying the contents of
the Structure object?
In short, I have a Structure object, parsed from a pdb file, and I want to
introduce point mutations to it, and save the modified structure to a pdb
file for further analysis... (I intend to use gromacs for instance if it
matters)...

From trodkey at rice.edu  Wed Nov 30 11:16:13 2005
From: trodkey at rice.edu (Travis Rodkey)
Date: Wed Nov 30 11:37:01 2005
Subject: [Biojava-l] Install question: classpath and use of jar files in
	compiling TestEmbl.java
Message-ID: <438DD04D.5060801@rice.edu>

Hi - I am new to biojava and am trying to get it set up on Red Hat EE 3 
running on a PC.   I have java 1.4.2 and jdk1.4.2.0 installed.

I downloaded the necessary jar files (biojava1.4.jar, bytecode-0.92.jar, 
commons-cli.jar, commons-collections-2.1.jar, commons-dbcp-1.1.jar, 
commons-pool-1.1.jar) and placed them in a /homes/trodkey/biojava directory 
as per the "Getting Started" section.

I then fixed the classpath variable to "setenv CLASSPATH 
/homes/trodkey/biojava/biojava-1.4.jar:/homes/trodkey/biojava/bytecode-0.92.jar:
/homes/trodkey/biojava/commons-cli.jar:/homes/trodkey/biojava/commons-collections-2.1.jar:
/homes/trodkey/biojava/commons-dbcp-1.1.jar:/homes/trodkey/biojava/commons-pool-1.1.jar:."
as per the instructions.

I then tried to compile TestEmbl.java as suggested in the "Getting 
Started" section by using "javac 
/homes/trodkey/biojava/demos/seq/TestEmbl.java", and got the error

"/homes/trodkey/biojava/demos/seq/TestEmbl.java:5: package 
homes.trodkey.biojava.org.biojava.bio does not exist", which was the 
first of a string of 15 errors.

Can anyone help me out? 

Thanks!

-Travis Rodkey

From heuermh at acm.org  Wed Nov 30 11:15:22 2005
From: heuermh at acm.org (Michael Heuer)
Date: Wed Nov 30 11:42:41 2005
Subject: [Biojava-l] Problem with downloading Genbank-sequence
In-Reply-To: <F2C17091-29D2-4E31-9C48-EC540FC617AF@sanger.ac.uk>
Message-ID: <Pine.GSO.4.44.0511301111210.27620-100000@shell3.shore.net>


Thomas Down wrote:

> Do you have an up-to-date bytecode.jar somewhere where it will be
> picked up ...

An off-topic but related question -- is the bytecode library used anywhere
other than in biojava proper?  It might make things one step easier to
merge the bytecode package (back?) into the main codebase.

   michael

From hotafin at gmail.com  Wed Nov 30 11:54:31 2005
From: hotafin at gmail.com (Tamas Horvath)
Date: Wed Nov 30 11:52:20 2005
Subject: [Biojava-l] Install question: classpath and use of jar files in
	compiling TestEmbl.java
In-Reply-To: <438DD04D.5060801@rice.edu>
References: <438DD04D.5060801@rice.edu>
Message-ID: <c343d7080511300854x4f9b980bm7b9ce2ce7d4d3a96@mail.gmail.com>

The easiest way is to put the jar files in jdkxxx/jre/lib/ext.
You may have to link them into the same directory of your JRE (should you
use a different JRE than the JDK's).

On 11/30/05, Travis Rodkey <trodkey@rice.edu> wrote:
>
> Hi - I am new to biojava and am trying to get it set up on Red Hat EE 3
> running on a PC.   I have java 1.4.2 and jdk1.4.2.0 installed.
>
> I downloaded the necessary jar files (biojava1.4.jar, bytecode-0.92.jar,
> commons-cli.jar, commons-collections-2.1.jar, commons-dbcp-1.1.jar,
> commons-pool-1.1.jar) and placed them a /homes/trodkey/biojava directory
> as per the "Getting Started" section.
>
> I then fixed the classpath variable to "setenv CLASSPATH
> /homes/trodkey/biojava/biojava-1.4.jar:/homes/trodkey/biojava/bytecode-0.92.jar:
> /homes/trodkey/biojava/commons-cli.jar:/homes/trodkey/biojava/commons-collections-2.1.jar:
> /homes/trodkey/biojava/commons-dbcp-1.1.jar:/homes/trodkey/biojava/commons-pool-1.1.jar:."
> as per the instructions.
>
> I then tried to compile TestEmbl.java as suggested in the "Getting
> Started" section by using "javac
> /homes/trodkey/biojava/demos/seq/TestEmbl.java", and got the error
>
> "/homes/trodkey/biojava/demos/seq/TestEmbl.java:5: package
> homes.trodkey.biojava.org.biojava.bio does not exist", which was the
> first of a string of 15 errors.
>
> Can anyone help me out?
>
> Thanks!
>
> -Travis Rodkey
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l

From wetrull at yahoo.com  Wed Nov 30 12:42:51 2005
From: wetrull at yahoo.com (W. Eric Trull)
Date: Wed Nov 30 12:47:15 2005
Subject: [Biojava-l] help on blast
In-Reply-To: <20051130170135.58326.qmail@web81407.mail.mud.yahoo.com>
Message-ID: <20051130174251.95303.qmail@web81405.mail.mud.yahoo.com>

I have the same situation where I work, except I have a Swing client instead
of an applet.

I decided to use NCBI's BLAST implementation
(http://www.ncbi.nlm.nih.gov/BLAST/download.shtml) invoked using a command to
org.biojava.utils.ExecRunner.  I then wrapped the whole thing in a Web
Service, which is easier and more flexible than using RMI IMHO.  NCBI's BLAST
toolkit also contains the executable for building the BLAST database 
from a FASTA sequence file (formatdb.exe).

Be sure to set the BLAST output option to XML (-m 7) and use a
org.biojava.bio.program.sax.blastxml.BlastXMLParserFacade to parse the
output.  I had trouble using the default output as it is different under
Windows and *nix.  Look at the BioJava in Anger example of parsing BLAST
output if you need help here.
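
As a rough sketch of the server-side call, the following builds a command
line and captures the XML that BLAST writes to standard output. It uses
the JDK's ProcessBuilder rather than org.biojava.utils.ExecRunner, and it
assumes the legacy NCBI "blastall" executable with its -p, -d, -i and -m
options. The executable path, database name and query file are
placeholders, not values from my setup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlastCommandSketch {

    /** Builds a legacy blastall command line that requests XML output (-m 7). */
    static List<String> buildCommand(String blastall, String program,
                                     String database, String queryFile) {
        return new ArrayList<>(Arrays.asList(
                blastall,
                "-p", program,    // e.g. blastn or blastp
                "-d", database,   // database built beforehand with formatdb
                "-i", queryFile,  // FASTA file holding the query sequence
                "-m", "7"));      // XML report on standard output
    }

    /** Runs the command and returns everything it wrote to standard output. */
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Send stderr to our own stderr so it cannot corrupt the XML
        // or fill an unread pipe and block the child process.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("command exited with status " + exitCode);
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.err.println("usage: BlastCommandSketch <blastall> <program> <db> <query.fasta>");
            return;
        }
        String xml = run(buildCommand(args[0], args[1], args[2], args[3]));
        System.out.println(xml);
    }
}
```

The returned string is what would then be handed to the XML parser.
Passing the arguments as a list avoids shell quoting problems when the
query file path contains spaces.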

The one twist here is that you are constrained by the applet security model
which, I believe by default, will not allow you to go to a different server
for a Web Service unless you sign the applet.  Something for you to dig into
if you decide to use a Web Service.  The rest of my comments assume that you
are going to go down the Web Service path.

For creation of the Web Service I'm using webMethods GLUE, but that requires
a $$ license.  I've used Apache's Axis/Tomcat to build web services before
and it is pretty easy to use.  Building a web service future-proofs you, IMO,
against any changes the powers that be may decide on for the client side
(e.g. "Now we want a .NET application", etc.).

If you want a quick prototype, look at IBM's Web Services for Life Sciences
(http://www.alphaworks.ibm.com/tech/ws4LS).  They have a BLAST web service
that is downloadable and configurable to run in a local environment.  However
their services are a bit dated (February 7, 2003).

One last thought.  I'm working under the constraint that I cannot send my
query sequence outside my local network.  If you DO NOT have this restriction
and are just querying public databases, both the NCBI and PDB have web
services.  The PDB provides a SOAP over HTTP web service (WSDL at
http://pdbbeta.rcsb.org/pdbws/rcsbWebService?wsdl) which is currently BETA
but will go production January 1, 2006.  Point Axis at this WSDL to generate
client side code and then look for the blastQuery() methods.  The NCBI's web
service does not use SOAP, but provides an HTTP interface.  See
http://www.ncbi.nlm.nih.gov/BLAST/developer.shtml for documentation and a
Perl example.

Good luck!

-Eric Trull

Thanks.

-W. Eric Trull
From ady at sanger.ac.uk  Wed Nov 30 12:15:29 2005
From: ady at sanger.ac.uk (Andy Yates)
Date: Wed Nov 30 12:54:42 2005
Subject: [Biojava-l] Install question: classpath and use of jar files
	in	compiling TestEmbl.java
In-Reply-To: <c343d7080511300854x4f9b980bm7b9ce2ce7d4d3a96@mail.gmail.com>
References: <438DD04D.5060801@rice.edu>
	<c343d7080511300854x4f9b980bm7b9ce2ce7d4d3a96@mail.gmail.com>
Message-ID: <438DDE31.2030209@sanger.ac.uk>

That isn't the problem, and it will actually cause more issues. By 
placing the library files in that directory you are exposing the 
libraries only to that one JRE. Anyway, the issue here is what you are 
trying to compile.

If you run javac /homes/trodkey/biojava/demos/seq/TestEmbl.java, you are 
telling the compiler that there is a class to compile which is located 
in the package homes.trodkey.biojava.org.biojava.bio. My suggestion is 
to change directory to the root of the biojava package and try the 
compilation from there.

Also check that your .java file's package declaration shows:

package demos.seq;

Otherwise this will fail again.
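
It may also be worth ruling out a classpath entry that does not point at
a real file: the original post lists the download as biojava1.4.jar but
the CLASSPATH entry as biojava-1.4.jar. The following is only a sketch
using plain JDK calls (the class name is made up) that reports entries
which do not exist on disk.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ClasspathCheck {

    /**
     * Returns the classpath entries that do not exist on disk.
     * Wildcard entries such as "lib/*" are not expanded here, so they
     * would be reported as missing.
     */
    static List<String> missingEntries(String classpath) {
        List<String> missing = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;  // skip empty segments from stray separators
            }
            if (!new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // java.class.path reflects CLASSPATH (or -cp) as the JVM sees it.
        String classpath = System.getProperty("java.class.path");
        for (String entry : missingEntries(classpath)) {
            System.out.println("missing: " + entry);
        }
    }
}
```

Run it with the same CLASSPATH setting used for javac. Any line it prints
is a jar or directory that the compiler cannot see either.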

Regards,

Andy Yates
~~~~~~~~~~
CancerIT - Cancer Genome Project
Wellcome Trust Sanger Institute

Tamas Horvath wrote:
> The easiest way is to put the jar files in jdkxxx/jre/lib/ext.
> You may have to link them into the same directory of your JRE (should you
> use a different JRE than the JDK's).
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
From christoph.gille at charite.de  Wed Nov 30 16:05:45 2005
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Wed Nov 30 16:13:22 2005
Subject: [Biojava-l] blast in an applet applet
Message-ID: <64385.84.190.31.14.1133384745.squirrel@webmail.charite.de>

If you installed blast locally you would need a cron job updating the
blast database regularly.

Perhaps it is easier to install a proxy server, e.g. tinyproxy, on the web
server where the applet is stored. The proxy can redirect the blast
request to any CGI server.


From hollandr at gis.a-star.edu.sg  Wed Nov 30 22:16:56 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Wed Nov 30 22:16:01 2005
Subject: [Biojava-l] modify structure
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5602894C96@BIONIC.biopolis.one-north.com>

See the addChain(), addModel() and setConnections() methods in the
Structure interface. For addChain() and addModel() also refer to the
Chain and Model interfaces in the same package.

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199
---------------------------------------------
This email is confidential and may be privileged. If you are not the
intended recipient, please delete it and notify us immediately. Please
do not copy or use it for any purpose, or disclose its content to any
other person. Thank you.
---------------------------------------------


> -----Original Message-----
> From: biojava-l-bounces@portal.open-bio.org 
> [mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of 
> Tamas Horvath
> Sent: Thursday, December 01, 2005 12:21 AM
> To: biojava-l@biojava.org
> Subject: [Biojava-l] modify structure
> 
> 
> Is there any way to modify a protein structure by modifying 
> the contents of the Structure object? In short, I have a 
> Structure object, parsed from a pdb file, and I want 
> to introduce point mutations to it, and save the modified 
> structure to a pdb file for further analysis... (I intend to 
> use gromacs for instance if it matters)...
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
>