From boehme at mpiib-berlin.mpg.de  Thu Jun  2 09:03:30 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Thu Jun  2 08:55:31 2005
Subject: [Biojava-l] Re: [BioSQL-l] How to add a feature?
In-Reply-To: <0C528E3670D8CE4B8E013F6749231AA606E7F7@ANTARESIA.be.devgen.com>
References: <0C528E3670D8CE4B8E013F6749231AA606E7F7@ANTARESIA.be.devgen.com>
Message-ID: <429F03A2.1090208@mpiib-berlin.mpg.de>

Thanks Marc,
but I don't know how to make a feature persistent in BioJava. Maybe 
someone from the BioJava list can help me?

Martina

Marc Logghe wrote:

> Hi Martina,
> I don't know how it goes in BioJava but in BioPerl the flow looks like
> this:
> 1) create your feature
> 2) make it persistent
> 3) add it to your (persistent) sequence object
> 4) store the sequence object in the database
> 5) commit if necessary
> 
> HTH,
> Marc
> 
> 
>>I'm wondering how to add a feature to a given sequence.
>>I know I can use createFeature, but that changes nothing in 
>>the database; only addSequence does. So is the proper way to 
>>retrieve the seq., get all its features, copy them to a new seq 
>>and add a feature, delete the seq in the database and store 
>>the new one?
>>There must be a simpler way? BioJava In Anger is rather 
>>sparse on things like that, I could do with a lot more examples ..
>>
>>Martina
>>_______________________________________________
>>BioSQL-l mailing list
>>BioSQL-l@open-bio.org
>>http://open-bio.org/mailman/listinfo/biosql-l
> 
> 
From jesse-t at chello.nl  Thu Jun  2 09:40:27 2005
From: jesse-t at chello.nl (Jesse)
Date: Thu Jun  2 09:34:08 2005
Subject: [Biojava-l] [1.4pre1] BioJava's-Regex with ambigous symbols
Message-ID: <20050602134016.9AD142E02A@rbox4.erasmusmc.nl>

Can someone tell me how I can perform a BioJava 1.4pre1 regex search using
ambiguous symbols?

I'm using the following ambiguous DNA symbols:
(http://rebase.neb.com/rebase/link_withrefm)
-R = G or A
-Y = C or T
-M = A or C
-K = G or T
-S = G or C
-W = A or T
-B = not A (C or G or T)
-D = not C (A or G or T)
-H = not G (A or C or T)
-V = not T (A or C or G)
-N = A or C or G or T

If I understand correctly, to run a BioJava regex I need to make a
PatternFactory using the following method:

FiniteAlphabet fa = DNATools.getDNA();
org.biojava.utils.regex.PatternFactory.makeFactory(fa);

So I need a FiniteAlphabet containing ambiguous symbols, right?
How can I make such a FiniteAlphabet?

My goal is to search for a pattern like "g[agr]cg[cty]c" in a SymbolList
like "ATGCGACGTCTTAANNNNNNATGCAAC".

Thanks.

-Jesse

From mark.schreiber at novartis.com  Thu Jun  2 21:02:57 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Thu Jun  2 20:55:02 2005
Subject: [Biojava-l] Re: [BioSQL-l] How to add a feature?
Message-ID: <OF4650721E.809C3517-ON48257015.00058B5F-48257015.0005C312@EU.novartis.net>

>There must be a simpler way? BioJava In Anger is rather 
>sparse on things like that, I could do with a lot more examples ..
>

All donations of examples are gratefully received. As you say it could do 
with more examples but hey, I'm only one man, with a day job that is 
rapidly turning into a night job too : )

- Mark


From boehme at mpiib-berlin.mpg.de  Mon Jun  6 05:34:50 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Mon Jun  6 05:27:28 2005
Subject: Bio Java (was: Re: [Biojava-l] Re: [BioSQL-l] How to add a feature?)
In-Reply-To: <OF4650721E.809C3517-ON48257015.00058B5F-48257015.0005C312@EU.novartis.net>
References: <OF4650721E.809C3517-ON48257015.00058B5F-48257015.0005C312@EU.novartis.net>
Message-ID: <42A418BA.8090407@mpiib-berlin.mpg.de>

Sorry - I didn't mean you personally! Because it is quite hard for me
to figure out how things are working just from the api and the
sources, I assumed it would be similar for others starting with
BioJava/BioSQL. There must be some working code around somewhere which
could be donated? Please do :-) It would increase the popularity of
BioJava/BioSQL, which it deserves, I would think.

Martina

From jesse-t at chello.nl  Mon Jun  6 05:49:17 2005
From: jesse-t at chello.nl (Jesse)
Date: Mon Jun  6 05:41:05 2005
Subject: [Biojava-l] [1.4pre1] BioJava's-Regex with ambigous symbols
In-Reply-To: <BEC4DFC9.5C0F%sylvain.foisy@bioneq.qc.ca>
Message-ID: <20050606094906.EA0142E02F@rbox4.erasmusmc.nl>

Hi,

Thanks for your reply.

I'm using regex on SymbolLists instead of Strings because I'm working with
large sequences held in memory, and I think SymbolLists are more memory
efficient than Strings.

But my problem is solved now. I removed ambiguous symbols from the regex
pattern.

Regards,

Jesse

-----Original Message-----
From: Sylvain
Subject: [Biojava-l] [1.4pre1] BioJava's-Regex with ambigous symbols

Hi,

Have a look at the MotifTools class. Its createRegex method takes a
SymbolList containing degenerate (ambiguous) symbols and creates a regex
String from it.

It works great. The returned String can then be used with the usual
Pattern/Matcher classes in Java.

Hope this helps

Best regards

Sylvain
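
The idea behind this suggestion can be sketched in plain Java, without
BioJava: expand each ambiguity code from the table earlier in this thread
into a character class and hand the result to java.util.regex. This is only
a minimal sketch and not MotifTools.createRegex itself; the class name
IupacRegex and the method toRegex are hypothetical, and it works on Strings
rather than SymbolLists.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava.
final class IupacRegex {
    // Ambiguity codes and their atomic expansions (DNA only).
    private static final Map<Character, String> CODES = Map.ofEntries(
        Map.entry('r', "[ag]"), Map.entry('y', "[ct]"),
        Map.entry('m', "[ac]"), Map.entry('k', "[gt]"),
        Map.entry('s', "[cg]"), Map.entry('w', "[at]"),
        Map.entry('b', "[cgt]"), Map.entry('d', "[agt]"),
        Map.entry('h', "[act]"), Map.entry('v', "[acg]"),
        Map.entry('n', "[acgt]"));

    // Turns a plain motif such as "gtakm" into a regex over a, c, g, t.
    // Symbols without an entry in CODES are copied through unchanged.
    static String toRegex(String motif) {
        StringBuilder sb = new StringBuilder();
        for (char c : motif.toLowerCase(Locale.ROOT).toCharArray()) {
            sb.append(CODES.getOrDefault(c, String.valueOf(c)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("gtakm");
        System.out.println(regex); // gta[gt][ac]
        Matcher m = Pattern.compile(regex).matcher("ccgtatacc");
        System.out.println(m.find()); // true
    }
}
```

Note that the input has to be a plain motif, not something that is already
a regex: a pattern like "g[agr]c" would get a character class nested inside
another one.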

From jesse-t at chello.nl  Mon Jun  6 06:08:03 2005
From: jesse-t at chello.nl (Jesse)
Date: Mon Jun  6 05:59:43 2005
Subject: [Biojava-l] [1.4pre1] BioJava's-Regex with ambigous symbols  
In-Reply-To: <200506052155.33599.c.lieftink@xs4all.nl>
Message-ID: <20050606100752.96D622E02A@rbox4.erasmusmc.nl>

Hi Cor,

Thanks for your reply.

I corrected the pattern as follows.

When BioJava's org.biojava.bio.molbio.RestrictionEnzyme.getForwardRegex()
returns the regex for a RestrictionEnzyme with site "gtakm", it returns
"gta[gtk][acm]", in which k (G or T) and m (A or C) are ambiguous.

So the ambiguous symbol "k" is expanded to the character class "[gtk]", but
the "k" itself is kept inside the brackets.

I solved it simply by removing all ambiguous symbols from the returned regex
string:

String searchPattern = re.getForwardRegex().replaceAll("[rymkswbdhvn]", "");

Regards,

Jesse
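
The effect of that replaceAll call can be checked with plain
java.util.regex, independent of BioJava. A small sketch, assuming every
ambiguous symbol in the regex sits inside a character class next to its
atomic expansions, as in "gta[gtk][acm]" (the class name StripAmbiguity is
hypothetical, and the search here runs on a String, not a SymbolList):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava.
final class StripAmbiguity {
    // Deletes every ambiguity code, leaving only a, c, g, t and regex syntax.
    static String strip(String regex) {
        return regex.replaceAll("[rymkswbdhvn]", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("gta[gtk][acm]")); // gta[gt][ac]

        String pattern = strip("g[agr]cg[cty]c");   // g[ag]cg[ct]c
        Matcher m = Pattern.compile(pattern)
                           .matcher("atgcgacgtcttaannnnnnatgcaac");
        if (m.find()) {
            System.out.println(m.start() + " " + m.group()); // 4 gacgtc
        }
    }
}
```

A bare ambiguous symbol outside brackets would simply be deleted by this
approach, which silently changes the pattern, so it is only safe when the
regex already contains the atomic alternatives.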


-----Original Message-----
From: Cor 
Subject: RE: [Biojava-l] [1.4pre1] BioJava's-Regex with ambigous symbols 

Hi Jesse, 

Although I am a newbie myself, I have written some example code based on
existing BioJava test code:

import org.biojava.bio.seq.DNATools;
import org.biojava.bio.symbol.SymbolList;
import org.biojava.utils.regex.Matcher;
import org.biojava.utils.regex.Pattern;
import org.biojava.utils.regex.PatternFactory;

// Inside a JUnit test method (fail() comes from JUnit)
String symbols = "atgcgacgtcttaannnnnnatgcaac";
SymbolList sl = DNATools.createDNA(symbols);
String patternString = "g[ag]cg[ct]c";
PatternFactory fact = PatternFactory.makeFactory(DNATools.getDNA());
Pattern pattern = fact.compile(patternString);
Matcher matcher = pattern.matcher(sl);
if (matcher.find()) {
    System.out.println("match found");
} else {
    fail("failed to find target");
}

In the pattern, you have to use [ag] instead of [agr]. Otherwise you will
get the error:
 org.biojava.utils.regex.RegexException: all variant symbols must be atomic.
 at org.biojava.utils.regex.PatternChecker.parseVariantSymbols(PatternChecker.java:363)


Regards,

Cor


From boehme at mpiib-berlin.mpg.de  Mon Jun  6 10:18:54 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Mon Jun  6 10:13:15 2005
Subject: [Biojava-l] Re: [BioSQL-l] How to add a feature?
In-Reply-To: <429F0AE6.6020806@nrc-cnrc.gc.ca>
References: <0C528E3670D8CE4B8E013F6749231AA606E7F7@ANTARESIA.be.devgen.com>
	<429F03A2.1090208@mpiib-berlin.mpg.de>
	<429F0AE6.6020806@nrc-cnrc.gc.ca>
Message-ID: <42A45B4E.5070906@mpiib-berlin.mpg.de>

Thanks - I knew it would be quite simple, as always with BioJava (once 
  I've figured out how to, that is)!
Martina

Simon Foote wrote:

> Hi Martina,
> 
> To add a feature to a sequence stored in a  BioSQL database, all you 
> have to do is retrieve the sequence and then add a feature to it.  The 
> following simplified code shows you the steps:
> 
> // Retrieve the sequence from BioSQLSequenceDB
> Sequence seq = bsd.getSequence(id);
> // Create new stranded feature
> StrandedFeature.Template templ = new StrandedFeature.Template();
> templ.location = ...
> templ.strand = ...
> templ.type = ...
> templ.source = ...
> templ.annotation = [A created SimpleAnnotation object]
> // Add feature to sequence
> seq.createFeature(templ);
> // Note: adding the feature like this will automatically persist the 
> // feature, so you don't have to worry about doing that.
> 
> Cheers,
> Simon Foote
> 
From corlieftink at hotmail.com  Mon Jun  6 14:36:45 2005
From: corlieftink at hotmail.com (Cor Lieftink)
Date: Mon Jun  6 14:28:49 2005
Subject: [Biojava-l] BioJava libraries for cell modelling wanted?
Message-ID: <BAY13-F170A25EC932421D905B1C9AFFB0@phx.gbl>

Hi all,

Is anyone working on cell modelling as, for example, described in the article 
below (1)? And if so, is she (also) using BioJava for this and/or other open 
source projects? And if so, what kind of libraries would be helpful for 
you?

Myself, I am a Java programmer, in daily life working for a bank, but 
recently, in my own time, I entered the field of bioinformatics.

Thanks for your reply in advance!

Regards,

Cor

(1) Cell Modeling Plays Role in Filling the Black Box
http://www.genpromag.com/ShowPR~PUBCODE~018~ACCT~1800000100~ISSUE~0504~RELTYPE~PR~ORIGRELTYPE~BIO~PRODCODE~00000000~PRODLETT~AG.html


From mark.schreiber at novartis.com  Mon Jun  6 21:35:52 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Jun  6 21:27:57 2005
Subject: [Biojava-l] BioJava libraries for cell modelling wanted?
Message-ID: <OF4DE402BE.7125B50C-ON48257019.00088A8E-48257019.0008C725@EU.novartis.net>

It ain't BioJava. Both of the screenshots look like commercial metabolic 
engineering and cell modelling software.

There is a nice open source project called Cellware that might be of 
interest though: www.bii.a-star.edu.sg/achievements/applications/cellware/index.asp

- Mark

_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From bader at cbio.mskcc.org  Tue Jun  7 20:22:23 2005
From: bader at cbio.mskcc.org (Gary Bader)
Date: Tue Jun  7 20:14:24 2005
Subject: [Biojava-l] Homologene parser update?
Message-ID: <42A63A3F.2080308@cbio.mskcc.org>

Hi,
	I just tried the Homologene parser in biojava 1.4pre1 and noticed that 
it only supports the deprecated Homologene file format and throws a 
number of exceptions parsing that file (likely because of updates to the 
old file format).  Is there an update for this parser available 
anywhere?  I think it would be very useful.

Thanks,
Gary

From mark.schreiber at novartis.com  Tue Jun  7 20:54:15 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun  7 20:46:10 2005
Subject: [Biojava-l] Homologene parser update?
Message-ID: <OF66387917.DB9754A8-ON4825701A.0004CEE9-4825701A.0004F7D5@EU.novartis.net>

Hello -

It seems most of this was contributed by David Huen. I'm not sure if he 
plans an update. If the changes are not large then you might want to 
consider contributing the changes yourself.

Best of Luck

- Mark



From bader at cbio.mskcc.org  Wed Jun  8 17:53:03 2005
From: bader at cbio.mskcc.org (Gary Bader)
Date: Wed Jun  8 17:44:38 2005
Subject: [Biojava-l] Homologene parser update?
In-Reply-To: <OF66387917.DB9754A8-ON4825701A.0004CEE9-4825701A.0004F7D5@EU.novartis.net>
References: <OF66387917.DB9754A8-ON4825701A.0004CEE9-4825701A.0004F7D5@EU.novartis.net>
Message-ID: <42A768BF.7070901@cbio.mskcc.org>

Hi Mark,
	Thanks.  The changes that NCBI made to their file formats are large, so 
I could write another builder/parser for the new file format, but it 
would not map completely to the existing parser (which is now broken 
because of NCBI format changes to even the deprecated file format).  If 
I want to contribute code, who decides that the code is worthy (e.g. 
does a design review)?  Ideally, this would happen before I start coding.
	I haven't figured out if I am going to use the existing framework for 
my current project, since the file format has become simpler, but I 
would like to contribute if possible.

Thanks,
Gary

From mark.schreiber at novartis.com  Wed Jun  8 20:55:03 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Jun  8 20:47:05 2005
Subject: [Biojava-l] Homologene parser update?
Message-ID: <OF67BE6966.95357E97-ON4825701B.0004D5D2-4825701B.00050A9C@EU.novartis.net>

Hello -

The best way to prove the code is worthy is to provide JUnit tests with 
the code that give good coverage of the functionality. If the tests 
pass, the code is worthy. Other, more esoteric things like good API design 
and efficiency are also nice, but if the unit tests don't pass, it doesn't 
work.

Let us know what you plan to do.

- Mark



From bader at cbio.mskcc.org  Wed Jun  8 21:15:00 2005
From: bader at cbio.mskcc.org (Gary Bader)
Date: Wed Jun  8 21:06:48 2005
Subject: [Biojava-l] Homologene parser update?
In-Reply-To: <OF67BE6966.95357E97-ON4825701B.0004D5D2-4825701B.00050A9C@EU.novartis.net>
References: <OF67BE6966.95357E97-ON4825701B.0004D5D2-4825701B.00050A9C@EU.novartis.net>
Message-ID: <42A79814.3050400@cbio.mskcc.org>

Hi,
	I just wrote a homologene parser that includes unit tests.  NCBI split 
homologene into simple + complex file formats - I only parse the simple 
format.  Should I send you the code?  I doubt you would want to 
integrate it now, since I can imagine a few more useful methods, but, 
with design pointers, I could extend the code to be more useful.  Should 
we take this discussion off the list?

Cheers,
Gary

From great_fred at yahoo.com  Fri Jun 10 10:18:29 2005
From: great_fred at yahoo.com (Sébastien PETIT)
Date: Fri Jun 10 10:10:16 2005
Subject: [Biojava-l] Parse a Blast
Message-ID: <20050610141829.41126.qmail@web32210.mail.mud.yahoo.com>

Hello

First of all, sorry for my English (I'm French..)
Now, my problem...
I have BLAST output in XML format. I want to parse it because I only want
the alignment.
But when I use a script found on the Net, the result is:
--> org.xml.sax.SAXException: Could not recognise the format of this
file as one supported by the framework.

Has anybody had the same problem?
Or does anybody have an idea how to resolve it?

Here is the script (comments translated from French):

   import java.io.*;
   import java.util.*;

   import org.biojava.bio.program.sax.*;
   import org.biojava.bio.program.ssbind.*;
   import org.biojava.bio.search.*;
   import org.biojava.bio.seq.db.*;
   import org.xml.sax.*;
   import org.biojava.bio.*;

   public class BlastParser {
      /**
       * args[0] is assumed to be the name of the BLAST output file
       */
      public static void main(String[] args) {
         try {
            // get the BLAST input as a stream
            InputStream is = new FileInputStream(args[0]);

            // construct a BlastLikeSAXParser
            BlastLikeSAXParser parser = new BlastLikeSAXParser();

            // construct an adapter for the SAX events that will pass
            // them on to a handler
            SeqSimilarityAdapter adapter = new SeqSimilarityAdapter();

            // set the adapter as the SAX event handler of the parser
            parser.setContentHandler(adapter);

            // the list that will hold the SeqSimilaritySearchResults
            List results = new ArrayList();

            // create the SearchContentHandler that will build the
            // SeqSimilaritySearchResults in the results list
            SearchContentHandler builder = new BlastLikeSearchBuilder(results,
               new DummySequenceDB("queries"), new DummySequenceDBInstallation());

            // register the builder with the adapter
            adapter.setSearchContentHandler(builder);

            // parse the file; afterwards the results list will contain
            // the SeqSimilaritySearchResults
            parser.parse(new InputSource(is));
            //formatResults(results);
         }
         catch (SAXException ex) {
            // XML problem
            ex.printStackTrace();
         }
         catch (IOException ex) {
            // IO problem, such as a file not found
            ex.printStackTrace();
         }
      }
   }

Thank you for any answer...

Great-Fred


From mark.schreiber at novartis.com  Sun Jun 12 22:05:58 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Jun 12 21:57:44 2005
Subject: [Biojava-l] Parse a Blast
Message-ID: <OF0E9ED6E1.26EB948B-ON4825701F.000B0677-4825701F.000B88A5@EU.novartis.net>

Hello -

I suspect the problem is that you are using the BlastLikeSAXParser. This 
was written in the days before blast xml was available (and stable) and is 
an adapter that parses a non-xml blast report and produces SAX events.

The parser you want is org.biojava.bio.program.sax.blastxml.BlastXMLParser

Hope this helps,

- Mark



From great_fred at yahoo.com  Mon Jun 13 08:47:36 2005
From: great_fred at yahoo.com (Sébastien PETIT)
Date: Mon Jun 13 08:44:10 2005
Subject: [Biojava-l] Parse a Blast
In-Reply-To: <OF0E9ED6E1.26EB948B-ON4825701F.000B0677-4825701F.000B88A5@EU.novartis.net>
Message-ID: <20050613124736.57341.qmail@web32211.mail.mud.yahoo.com>


Hello, Mark

Thanks for the answer...
The parser you told me about may be the right one...
But maybe I'm not good enough at Java, and I can't work out how to use it.
I spent the whole morning trying to understand this class and how it
works, but got nowhere... I didn't understand how to use it.

Thank you for any additional help

Great-Fred
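
If only the alignment lines are needed, one alternative that bypasses
BioJava altogether is a small SAX handler from the JDK. The following is a
rough sketch, not the BlastXMLParser approach suggested above: it assumes
the NCBI BLAST XML element names Hsp, Hsp_qseq, Hsp_midline and Hsp_hseq,
and the class name BlastXmlAlignments is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects query/midline/hit lines of every <Hsp>.
final class BlastXmlAlignments extends DefaultHandler {
    final List<String> alignments = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String qseq = "", midline = "", hseq = "";

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        switch (qName) {
            case "Hsp_qseq":    qseq = text.toString(); break;
            case "Hsp_midline": midline = text.toString(); break;
            case "Hsp_hseq":    hseq = text.toString(); break;
            case "Hsp":
                alignments.add(qseq + "\n" + midline + "\n" + hseq);
                qseq = midline = hseq = "";
                break;
            default: break;
        }
    }

    static List<String> parse(InputSource in)
            throws SAXException, IOException, ParserConfigurationException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        BlastXmlAlignments handler = new BlastXmlAlignments();
        reader.setContentHandler(handler);
        // Resolve the external DTD to an empty document instead of fetching it.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        reader.parse(in);
        return handler.alignments;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<BlastOutput><Hsp><Hsp_qseq>ACGT</Hsp_qseq>"
                   + "<Hsp_hseq>ACTT</Hsp_hseq><Hsp_midline>|| |</Hsp_midline></Hsp></BlastOutput>";
        for (String a : parse(new InputSource(new StringReader(xml)))) {
            System.out.println(a);
        }
    }
}
```

For a real report, pass new InputSource(new FileInputStream(args[0]))
instead of the inline test string. Everything else in the report (hit
names, scores, statistics) is ignored by this sketch.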


--- mark.schreiber@novartis.com a ?crit :

> Hello -
> 
> I suspect the problem is that you are using the BlastLikeSAXParser.
> This 
> was written in the days before blast xml was available (and stable)
> and is 
> an adapter that parses a non-xml blast report and produces SAX
> events.
> 
> The parser you want is
> org.biojava.bio.program.sax.blastxml.BlastXMLParser
> 
> Hope this helps,
> 
> - Mark
> 
> 
> 
> 
> S?bastien PETIT <great_fred@yahoo.com>
> Sent by: biojava-l-bounces@portal.open-bio.org
> 06/10/2005 10:18 PM
> 
>  
>         To:     biojava-l@biojava.org
>         cc:     (bcc: Mark Schreiber/GP/Novartis)
>         Subject:        [Biojava-l] Parse a Blast
> 
> 
> Hello
> 
> First of all, sorry for my English (I'm French..)
> Now, my problem...
> I have Blast in XML format. I want to parse it because I want just
> the
> alignment.
> But, when I use a script found on the Net, the answer is :
> --> org.xml.sax.SAXException: Could not recognise the format of this
> file as one supported by the framework.
> 
> Does anybody have the same problem?
> Or Does anybody have an idea to resolve my problem?
> 
> Here is the script (Sorry, the comment are in French..)
> 
>    import java.io.*;
>    import java.util.*;
> 
>    import org.biojava.bio.program.sax.*;
>    import org.biojava.bio.program.ssbind.*;
>    import org.biojava.bio.search.*;
>    import org.biojava.bio.seq.db.*;
>    import org.xml.sax.*;
>    import org.biojava.bio.*;
> 
>     public class BlastParser {
>    /**
>    * args[0] est assum? ?tre le nom du fichier de sortie BLAST */
>        public static void main(String[] args) {
>          try {
>          //obtenir les entr?es Blast sous la forme de Stream
>             InputStream is = new FileInputStream(args[0]);
>  
>          //construire un BlastLikeSAXParser
>             BlastLikeSAXParser parser = new BlastLikeSAXParser();
>  
>          //construire un adaptateur pour SAX event qui les passera a
> un
> Handler.
>             SeqSimilarityAdapter adapter = new
> SeqSimilarityAdapter();
>  
>          //initialiser l'adaptateur des SAX events  de l'objet parser
>             parser.setContentHandler(adapter);
>  
>          //la liste qui contiendra les SeqSimilaritySearchResults
>             List results = new ArrayList();
>  
>          //cr?er le SearchContentHandler qui construira les
> SeqSimilaritySearchResults
>          //dans la liste results
>             SearchContentHandler builder = new
> BlastLikeSearchBuilder(results,
>                new DummySequenceDB("queries"), new
> DummySequenceDBInstallation());
>  
>          //enregistrer builder aupres de adapter
>             adapter.setSearchContentHandler(builder);
>  
>          //parcourir le fichier; apr?s, la liste result contiendra
>          //les SeqSimilaritySearchResults
>  
>             parser.parse(new InputSource(is));
>             //formatResults(results);
>          }
>              catch (SAXException ex) {
>             //probleme de XML
>                ex.printStackTrace();
>             }
>              catch (IOException ex) {
>             //probleme de IO, comme un fichier introuvable
>                ex.printStackTrace();
>             }
>       }
>    }
> 
> Thank you for any answer...
> 
> Great-Fred
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
From Russell.Smithies at agresearch.co.nz  Mon Jun 13 21:51:26 2005
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon Jun 13 21:43:12 2005
Subject: [Biojava-l] OT: JTest
Message-ID: <D5DBA313349A4B458528BE63B387F36C936A5B@imail.agresearch.co.nz>


Hi all,

Sorry about the off-topic question but has anyone tried JTest from
ParaSoft?
We're thinking of buying it and all the reviews I've read seem OK but
I'd like to hear comments from someone who actually uses it.

Thanx,


Russell Smithies
Bioinformatics Software Developer
Invermay  Research Centre
Puddle Alley, Mosgiel, New Zealand
www.agresearch.co.nz


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

From franckv at ebi.ac.uk  Tue Jun 14 10:47:26 2005
From: franckv at ebi.ac.uk (Franck Valentin)
Date: Tue Jun 14 10:39:38 2005
Subject: [Biojava-l] Applet and bytecore.jar
Message-ID: <1118760446.12636.113.camel@pongo.ebi.ac.uk>

Hi all,

I want to create an applet which displays feature tables graphically. As
a test and to learn BioJava I adapted the FastBeadDemo.java example.
The problem is that it works fine as a standalone application, but when
I use it as an applet I get the following error:

java.lang.ExceptionInInitializerError
       at org.biojava.bio.gui.sequence.FilteringRenderer.getContext(FilteringRenderer.java:171)
[...]

Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission createClassLoader)
        at java.security.AccessControlContext.checkPermission(AccessControlContext.java:269)
[...]


I guess it comes from the bytecore.jar library I need to use, which
tries to create a ClassLoader in an applet context.

Does that mean I need to create something like a signed applet (I know
very little about that!) to use the BioJava library, or does a
workaround exist?
By the way, what is bytecore.jar used for?


Many thanks


Franck
From mark.schreiber at novartis.com  Tue Jun 14 21:09:18 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun 14 21:02:21 2005
Subject: [Biojava-l] Re: [Biojava-dev] Local binary execution
Message-ID: <OF57EFE134.61810F49-ON48257021.000627CF-48257021.0006585D@EU.novartis.net>

We would normally not like to use a new JDK in BioJava unless it is well 
supported on all the OSes people are using. Having said that, there are 
several attractive features which would make it nice to use.

Is anyone's current OS not supporting Java 1.5?

- Mark 


Michael Barton <michael.barton1@ncl.ac.uk>
Sent by: biojava-dev-bounces@portal.open-bio.org
06/15/2005 02:26 AM

 
        To:     BioJava-dev <biojava-dev@biojava.org>
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        Re: [Biojava-dev] Local binary execution


I had a look at the post you were referring to. In terms of the Ant
support for local binary execution, I think it is very similar to the
newly implemented ProcessBuilder in Java 1.5.
This class has a similar way of adding command line arguments to
that of Ant's <exec>.

The classes I'm suggesting have an enum of arguments specific to the
application, which may be convenient for supplying different switch/argument
pairs, as it means that only arguments which the binary allows
can be supplied.
Any errors should therefore come from incorrect argument values rather
than incorrect arguments, if that makes sense.
In addition, the class throws an exception if the essential arguments
required to run the search are not supplied.
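
As a rough illustration of the enum-of-arguments idea described above, here
is a minimal sketch in plain Java. The class name BlastCommandSketch, the
Parameter constants and the choice of the legacy blastall switches (-p, -d,
-e) are assumptions made for this example; they are not the classes from the
post. The sketch only builds and checks the argument list that would be
handed to ProcessBuilder and does not start a process.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the posted classes.
class BlastCommandSketch {

    // One constant per switch the binary accepts; "required" marks essentials.
    enum Parameter {
        PROGRAM("-p", true),
        DATABASE("-d", true),
        EXPECT("-e", false);

        final String flag;
        final boolean required;

        Parameter(String flag, boolean required) {
            this.flag = flag;
            this.required = required;
        }
    }

    private final String binary;
    private final Map<Parameter, String> values =
        new EnumMap<Parameter, String>(Parameter.class);

    BlastCommandSketch(String binary) {
        this.binary = binary;
    }

    // Only switches named in the enum can be set, so typos fail at compile time.
    BlastCommandSketch set(Parameter p, String value) {
        values.put(p, value);
        return this;
    }

    // Builds the argument list, failing early if an essential switch is missing.
    List<String> toCommand() {
        List<String> command = new ArrayList<String>();
        command.add(binary);
        for (Parameter p : Parameter.values()) {
            String value = values.get(p);
            if (value == null) {
                if (p.required) {
                    throw new IllegalStateException("missing required parameter " + p);
                }
                continue;
            }
            command.add(p.flag);
            command.add(value);
        }
        return command;
    }

    // Hands the checked argument list to ProcessBuilder; nothing is started here.
    ProcessBuilder toProcessBuilder() {
        return new ProcessBuilder(toCommand());
    }

    public static void main(String[] args) {
        BlastCommandSketch sketch = new BlastCommandSketch("blastall")
            .set(Parameter.PROGRAM, "blastn")
            .set(Parameter.DATABASE, "sargasso");
        System.out.println(sketch.toProcessBuilder().command());
    }
}
```

With this shape, a misspelled switch cannot be expressed at all, and a
forgotten essential one is reported before any process is launched.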

This means however that the classes are written in Java 1.5. Would this
be a problem?


On Thu, 2005-06-09 at 11:54 -0400, Michael Heuer wrote:
> Hello Michael,
> 
> Personally I think this kind of code might be better suited in a more
> general library, say in an Apache Jakarta Commons project for example.
> 
> In fact, there was just a proposal on the commons-dev mailing list a
> couple of days ago to pull the exec code out of Ant into a separate
> self-contained library:
> 
> > http://tinyurl.com/9culs
> 
> That said, this comes up quite frequently here, so perhaps we should
> just bite the bullet and do it up right.
> 
>    michael
> 
> 
> On Thu, 9 Jun 2005, Michael Barton wrote:
> 
> >
> > Hi,
> >
> > I'm a Bioinformatics MRes student at Newcastle. I've been messing around
> > with some Java code to execute bioinformatics binaries. It was
> > originally intended for BLAST but has also been extended for genewise.
> > It takes the hassle out of using Process / ProcessBuilder a little bit.
> >
> > Use goes along the lines of something like this:
> >
> > //Search factory for creating searches
> > SearchFactory<BlastSearch, BlastSearchFactory.Parameter> bsf;
> > bsf = new BlastSearchFactory();
> >
> > //Parameterise with search-specific variables
> > bsf.setSearchBinaryLocation(test_data + "/blast/binary");
> > bsf.setSearchParameter(BlastSearchFactory.Parameter.blastType, "blastn");
> > bsf.setSearchParameter(BlastSearchFactory.Parameter.database,
> >     test_data + "/blast/db/sargasso");
> >
> > //Create immutable search object which can be used to run multiple
> > //searches on the same database
> > Search<BlastSearchResult> blastSearch = bsf.getSearch();
> >
> > //Simple search result object which returns an InputStream
> > SearchResult sr = blastSearch.execute(new File(test_data +
> >     "/blast/query/query"));
> >
> > InputStream is = sr.getResultStream();
> >
> > It seems to work okay on Linux; I haven't tested it on Windows.
> >
> > There's a little bit of JavaDoc I started work on but it's a little bit
> > messed up from where I've been changing things around.
> >
> > The source/jar/doc are all here. There are test cases too.
> >
> > http://www.students.ncl.ac.uk/michael.barton1/
> >
> > Mike
> >
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-dev
> >
> 



From mark.schreiber at novartis.com  Tue Jun 14 21:11:50 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun 14 21:03:49 2005
Subject: [Biojava-l] LSID
Message-ID: <OF4B0E6C93.28CEE738-ON48257021.00065D74-48257021.000693CA@EU.novartis.net>

Hello -

Does anyone know what happened to the Life Science Identifier proposal? I 
notice that there are some classes in biojava to handle it but I'm not 
sure it was ever widely accepted by the community. Come to think of it, 
does anyone know what happened to the I3C who proposed it?

If it's all dead or dying maybe it should be deprecated or removed at a 
later date?

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax  +65 6722 2910

From hollandr at gis.a-star.edu.sg  Tue Jun 14 21:36:09 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Tue Jun 14 21:30:46 2005
Subject: [Biojava-l] Re: [Biojava-dev] Local binary execution
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601DCA935@BIONIC.biopolis.one-north.com>

Linux supports Java 1.5 but only using the Sun JDK on ia32 and AMD
Opterons. Support for other architectures on Linux (such as ia64, PPC,
or Alpha) is restricted to specialist provisions from vendors such as HP
and the open source efforts such as Blackdown JDK. At a quick check, the
Alpha is only at 1.4.2 (from HP), likewise PPC (from IBM), whereas ia64
can run 1.5 apps using HP's JRE but no compiler yet exists for them.
There may also be some open source purists out there who object when
they can't use their favourite open source JDK any more...


Richard Holland
Bioinformatics Specialist
GIS extension 8199
---------------------------------------------
This email is confidential and may be privileged. If you are not the
intended recipient, please delete it and notify us immediately. Please
do not copy or use it for any purpose, or disclose its content to any
other person. Thank you.
---------------------------------------------



From bader at cbio.mskcc.org  Tue Jun 14 22:20:49 2005
From: bader at cbio.mskcc.org (Gary Bader)
Date: Tue Jun 14 22:12:03 2005
Subject: [Biojava-l] LSID
In-Reply-To: <OF4B0E6C93.28CEE738-ON48257021.00065D74-48257021.000693CA@EU.novartis.net>
References: <OF4B0E6C93.28CEE738-ON48257021.00065D74-48257021.000693CA@EU.novartis.net>
Message-ID: <42AF9081.5000505@cbio.mskcc.org>

LSID is still cooking. It is not widely accepted, but a number of 
people are still pushing for it.  IBM is the current caretaker. 
It remains to be seen whether it will be widely adopted by e.g. the 
sequence databases.

http://lsid.sourceforge.net/

Gary

mark.schreiber@novartis.com wrote:
> Hello -
> 
> Does anyone know what happened to the Life Science Identifier proposal? I 
> notice that there are some classes in biojava to handle it but I'm not 
> sure it was ever widely accepted by the community. Come to think of it, 
> does anyone know what happened to the I3C who proposed it?
> 
> If it's all dead or dying maybe it should be deprecated or removed at a 
> later date?
> 
> - Mark
> 
> Mark Schreiber
> Principal Scientist (Bioinformatics)
> 
> Novartis Institute for Tropical Diseases (NITD)
> 10 Biopolis Road
> #05-01 Chromos
> Singapore 138670
> www.nitd.novartis.com
> 
> phone +65 6722 2973
> fax  +65 6722 2910
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
From terry at triplett.org  Tue Jun 14 22:27:19 2005
From: terry at triplett.org (Terry L. Triplett)
Date: Tue Jun 14 22:19:28 2005
Subject: [Biojava-l] Re: [Biojava-dev] Local binary execution
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D5601DCA935@BIONIC.biopolis.one-north.com>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCA935@BIONIC.biopolis.one-north.com>
Message-ID: <42AF9207.7000608@triplett.org>

Not to be pedantic, and only peripherally on topic, but Blackdown is 
not, and has never been, open source.  Back before Sun became interested 
in supporting a JDK on Linux, the Blackdown folks made it possible by 
signing whatever NDA was required and getting access to the JDK source.  
When Sun did become interested in Linux Java, the Sun JDK for Linux was 
the Blackdown codebase, plus some stuff from Borland/Inprise/whatever.  
These days the Sun JDK and Blackdown's version are more or less 
equivalent, as I understand it. 

Richard HOLLAND wrote:

>and the open source efforts such as Blackdown JDK.
>  
>
From mark.schreiber at novartis.com  Tue Jun 14 23:44:11 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun 14 23:36:06 2005
Subject: [Biojava-l] Announce: BioJava 1.4pre2
Message-ID: <OF05DB316D.8172EA64-ON48257021.0013CA03-48257021.00148695@EU.novartis.net>

Hello All -

A second release candidate for BioJava 1.4 is now out. Apart from a year's 
worth of bug fixes and javadoc clean-ups, the major change over 1.4pre1 is 
a major overhaul of the BioSQL bindings so that BioJava now operates with 
the upcoming BioSQL 1.0.

Please take this code out for a spin and give your feedback to the list. I 
hope to make an official release in about a week so we can start working 
on 1.5. It's certainly been a long time between releases and I would like 
to reduce this in the near future.

Check it out from www.biojava.org or go directly to 
http://www.biojava.org/download14.html (do not pass Go, do not collect 
$200).

Thanks to Michael Heuer and Richard Holland for helping to squeeze this 
one out.

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax  +65 6722 2910

From heuermh at acm.org  Wed Jun 15 15:27:38 2005
From: heuermh at acm.org (Michael Heuer)
Date: Wed Jun 15 15:22:48 2005
Subject: [Biojava-l] LSID
In-Reply-To: <42AF9081.5000505@cbio.mskcc.org>
Message-ID: <Pine.GSO.4.44.0506151513560.5673-100000@shell3.shore.net>


The BioJava LSID and the IBM LSID are slightly different APIs, the IBM one
being the more complete of the two.  There also are/were LSID client
implementations that I'm not very familiar with in Taverna [0] and for
whatever reason in an email client called Haystack [1].

I would nominate the BioJava LSID implementation for deprecation after the
release of version 1.4.x, but note that it is used internally, see e.g.

org/biojava/utils/lsid/class-use/LifeScienceIdentifier.html

in the 1.4pre2 javadocs.

   michael


[0] http://taverna.sf.net
[1] http://haystack.lcs.mit.edu



From avinash at lanl.gov  Wed Jun 15 15:34:50 2005
From: avinash at lanl.gov (Avinash Kewalramani)
Date: Wed Jun 15 15:28:13 2005
Subject: [Biojava-l] Phrap output
Message-ID: <42B082DA.4030903@lanl.gov>

Hi

I need to store some information from an ACE assembly file (which is Phrap 
plain text output). To do this I will have to write my own parser to 
parse this complicated text file.

Is there any class in BioJava or anywhere else which does this? The best 
scenario would be if some code converted this file to XML output which 
can be easily parsed.

I have looked around a bit in BioJava and elsewhere and couldn't find 
anything for this. I don't want to use Perl (BioPerl probably has this).

Thanks

-- 
----------------------------------------------------------------------
Avinash Kewalramani
Technical Lead-Genome Informatics Group
Bioscience Division
Los Alamos National Laboratory 
Los Alamos, NM 87545

Phone: 505-664-0527
Cell:  816-213-1908
E-mail:  avinash@lanl.gov
----------------------------------------------------------------------

From mark.schreiber at novartis.com  Wed Jun 15 21:34:10 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Jun 15 21:26:12 2005
Subject: [Biojava-l] LSID
Message-ID: <OFB7B28C92.D0AA7E59-ON48257022.000887BE-48257022.00089F08@EU.novartis.net>

The internal use was mine (I was just using it as a substitute for a 
namespace). Maybe we should upgrade it to be compatible with IBM or 
Taverna?

- Mark




From mark.schreiber at novartis.com  Wed Jun 15 21:36:49 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Jun 15 21:28:40 2005
Subject: [Biojava-l] Phrap output
Message-ID: <OF49BB7E34.2C5AB250-ON48257022.0008C0A0-48257022.0008DD31@EU.novartis.net>

Hi -

The classes in org.biojava.bio.program.phred might do what you need 
although they are more for reading phd files. They may give you a starting 
point though.

- Mark




From hollandr at gis.a-star.edu.sg  Wed Jun 15 21:41:43 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Wed Jun 15 21:34:31 2005
Subject: [Biojava-l] Phrap output
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601DCA9E9@BIONIC.biopolis.one-north.com>

Nope, nothing exists yet for reading Phrap/ACE. If you do end up writing
your own parser, it'd be really great if you could contribute it to the
project too.

The way the BioJava file parsers work removes the need for an
XML-translation step. File parsers read the file, then fire events to
listeners, e.g. you could fire an event that says 'add another sequence',
or one that says 'assembly finished'. The listener uses the events to
construct the appropriate objects. When writing the file back out again,
the same events are generated, and another listener receives them and
writes out the corresponding bits of the file.
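
The event flow described above can be sketched roughly as follows. This is a
hypothetical, self-contained illustration: AceEventSketch, AssemblyListener
and RecordingListener are made-up names, not BioJava classes, and the parser
only looks at the CO (contig) and RD (read) header lines of an ACE file,
assuming the name and the padded length are the first two fields after the
tag.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of event-based parsing; these are not BioJava types.
class AceEventSketch {

    // Callbacks the parser fires; a listener builds whatever objects it wants.
    interface AssemblyListener {
        void startContig(String name, int paddedLength);
        void addRead(String name, int paddedLength);
        void endAssembly();
    }

    // Reads ACE-style header lines and turns them into events.
    static void parse(BufferedReader in, AssemblyListener listener) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            String[] f = line.trim().split("\\s+");
            if (f.length >= 3 && f[0].equals("CO")) {
                listener.startContig(f[1], Integer.parseInt(f[2]));
            } else if (f.length >= 3 && f[0].equals("RD")) {
                listener.addRead(f[1], Integer.parseInt(f[2]));
            }
            // Sequence lines and all other tags are ignored in this sketch.
        }
        listener.endAssembly();
    }

    // A listener that simply records the events it receives, in order.
    static class RecordingListener implements AssemblyListener {
        final List<String> events = new ArrayList<String>();

        public void startContig(String name, int paddedLength) {
            events.add("contig " + name + " " + paddedLength);
        }

        public void addRead(String name, int paddedLength) {
            events.add("read " + name + " " + paddedLength);
        }

        public void endAssembly() {
            events.add("end");
        }
    }

    public static void main(String[] args) throws IOException {
        String ace = "AS 1 1\n\nCO Contig1 8 1 1 U\nACGT*ACG\n\nRD read1 8 0 0\nACGT*ACG\n";
        RecordingListener listener = new RecordingListener();
        parse(new BufferedReader(new StringReader(ace)), listener);
        for (String event : listener.events) {
            System.out.println(event);
        }
    }
}
```

Swapping RecordingListener for a listener that builds sequence objects, or
one that writes a different file format, leaves the parser untouched, which
is the point of the event model.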

You'd also have to decide how to represent the assembly once it is in
memory. The interface org.biojava.seq.Assembly might be a good starting
point.

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199



From mark.schreiber at novartis.com  Wed Jun 15 21:51:42 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Jun 15 21:43:36 2005
Subject: [Biojava-l] Phrap output
Message-ID: <OF63DF93AB.69DE492E-ON48257022.000A1159-48257022.000A3A4A@EU.novartis.net>

If you're going to follow the event-based parsing model (which I strongly 
recommend you do), I would make a Format implementation (possibly extended 
if you need more methods) and fire your events at something like the 
SimpleAssemblyBuilder object (again possibly extended if you need it to do 
more).

- Mark



From tmo at ebi.ac.uk  Thu Jun 16 05:45:21 2005
From: tmo at ebi.ac.uk (Tom Oinn)
Date: Thu Jun 16 05:36:38 2005
Subject: [Biojava-l] LSID
In-Reply-To: <OFB7B28C92.D0AA7E59-ON48257022.000887BE-48257022.00089F08@EU.novartis.net>
References: <OFB7B28C92.D0AA7E59-ON48257022.000887BE-48257022.00089F08@EU.novartis.net>
Message-ID: <42B14A31.4080400@ebi.ac.uk>

mark.schreiber@novartis.com wrote:
> The internal use was mine (I was just using it as a substitute for a 
> namespace). Maybe we should upgrade it to be compatible with IBM or 
> Taverna?
> 
> - Mark
> 
> 
> 
> 
> 
> Michael Heuer <heuermh@acm.org>
> Sent by: biojava-l-bounces@portal.open-bio.org
> 06/16/2005 03:27 AM
> 
>  
>         To:     Gary Bader <bader@cbio.mskcc.org>
>         cc:     biojava-l@open-bio.org, Mark Schreiber/GP/Novartis@PH
>         Subject:        Re: [Biojava-l] LSID
> 
> 
> 
> The biojava LSID and the IBM LSID are slightly different APIs, the IBM one
> the more complete of the two.  There also are/were LSID client
> implementations that I'm not very familiar with in taverna [0] and for
> whatever reason in an email client called Haystack [1].

Taverna and Haystack (not an email client!) both use the reference 
implementation, i.e. the IBM one. In theory though the implementation 
shouldn't be that important, it's a standard after all - I'm not sure 
how actively supported IBM's one is but we've been using it quite 
happily for ages now.

We use LSIDs in a slightly different manner to that originally intended, 
in that we're mostly using them to name transient entities such as 
workflow process instances although we do also name concrete data items.

Cheers,

Tom (Taverna lead)
From fpepin at cs.mcgill.ca  Thu Jun 16 18:55:54 2005
From: fpepin at cs.mcgill.ca (Francois Pepin)
Date: Thu Jun 16 18:48:04 2005
Subject: [Biojava-l] Local binary execution
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D5601DCA935@BIONIC.biopolis.one-north.com>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCA935@BIONIC.biopolis.one-north.com>
Message-ID: <1118962554.5239.28.camel@elm.mcb.mcgill.ca>

Not quite true. I've been using the Linux x86_64 version of the Sun JVM
1.5 for a while now.

I do agree that it's limited to Windows, Linux and Solaris (32 and 64
bits for all). I don't know about other JVMs.

I personally like 1.5 a lot, but I'm not sure if I'd force it on all
BioJava users. Are you talking about core features, or just nifty
add-ons that can be selectively compiled using ant?

Francois

On Wed, 2005-15-06 at 09:36 +0800, Richard HOLLAND wrote:
> Linux supports Java 1.5 but only using the Sun JDK on ia32 and AMD
> Opterons. Support for other architectures on Linux (such as ia64, PPC,
> or Alpha) is restricted to specialist provisions from vendors such as HP
> and the open source efforts such as Blackdown JDK. At a quick check, the
> Alpha is only at 1.4.2 (from HP), likewise PPC (from IBM), whereas ia64
> can run 1.5 apps using HP's JRE but no compiler yet exists for them.
> There may also be some open source purists out there who object when
> they can't use their favourite open source JDK any more...
> 
> 
> Richard Holland
> Bioinformatics Specialist
> GIS extension 8199
> ---------------------------------------------
> This email is confidential and may be privileged. If you are not the
> intended recipient, please delete it and notify us immediately. Please
> do not copy or use it for any purpose, or disclose its content to any
> other person. Thank you.
> ---------------------------------------------
> 
> 
> > -----Original Message-----
> > From: biojava-l-bounces@portal.open-bio.org 
> > [mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of 
> > mark.schreiber@novartis.com
> > Sent: Wednesday, June 15, 2005 9:09 AM
> > To: Michael Barton
> > Cc: biojava-l@open-bio.org; BioJava-dev
> > Subject: [Biojava-l] Re: [Biojava-dev] Local binary execution
> > 
> > 
> > We would normally not like to use a new JDK in biojava unless 
> > it is well 
> > supported on all the OS's people are using. Having said that 
> > there are 
> > several attractive features which would make it nice to use.
> > 
> > Is anyone's current OS not supporting Java 1.5?
> > 
> > - Mark 
> > 
> > 
> > 
> > 
> > 
> > Michael Barton <michael.barton1@ncl.ac.uk>
> > Sent by: biojava-dev-bounces@portal.open-bio.org
> > 06/15/2005 02:26 AM
> > 
> >  
> >         To:     BioJava-dev <biojava-dev@biojava.org>
> >         cc:     (bcc: Mark Schreiber/GP/Novartis)
> >         Subject:        Re: [Biojava-dev] Local binary execution
> > 
> > 
> > I had a look at the post you were referring to. In terms of the ant
> > support for local binary execution I think it is very similar to the
> > newly implemented ProcessBuilder in Java 1.5.
> > This class has a similar way of adding command line arguments to
> > that of ant <exec>.
> > 
> > The classes I'm suggesting have an enum of arguments specific to the
> > application, which may be convenient for supplying different
> > switch/argument pairs, as it means that only arguments which the
> > binary allows can be supplied.
> > Any errors should therefore come from incorrect argument values rather
> > than incorrect arguments, if that makes sense.
> > In addition the class throws an exception if the essential arguments
> > required to run the search are not supplied.
> > 
> > This means however that the classes are written in Java 1.5.
> > Would this be a problem?
> > 
> > 
> > On Thu, 2005-06-09 at 11:54 -0400, Michael Heuer wrote:
> > > Hello Michael,
> > > 
> > > Personally I think this kind of code might be better suited in a more
> > > general library, say in an Apache Jakarta Commons project for example.
> > > 
> > > In fact, there was just a proposal to pull the exec code out of ant
> > > into a separate self-contained library to the commons-dev mailing
> > > list a couple of days ago:
> > > 
> > > > http://tinyurl.com/9culs
> > > 
> > > That said, this comes up quite frequently here, so perhaps we should
> > > just bite the bullet and do it up right.
> > > 
> > >    michael
> > > 
> > > 
> > > On Thu, 9 Jun 2005, Michael Barton wrote:
> > > 
> > > >
> > > > Hi,
> > > >
> > > > I'm a Bioinformatics MRes student at Newcastle. I've been messing
> > > > around with some Java code to execute bioinformatics binaries. It was
> > > > originally intended for blast but has also been extended for genewise.
> > > > It takes the hassle out of using Process / ProcessBuilder a little
> > > > bit.
> > > >
> > > > Use goes along the lines of something like this
> > > >
> > > > // Search factory for creating searches
> > > > SearchFactory<BlastSearch, BlastSearchFactory.Parameter> bsf;
> > > > bsf = new BlastSearchFactory();
> > > >
> > > > // Parameterise with search-specific variables
> > > > bsf.setSearchBinaryLocation(test_data + "/blast/binary");
> > > > bsf.setSearchParameter(BlastSearchFactory.Parameter.blastType,
> > > >     "blastn");
> > > > bsf.setSearchParameter(BlastSearchFactory.Parameter.database,
> > > >     test_data + "/blast/db/sargasso");
> > > >
> > > > // Create an immutable search object which can be used to run
> > > > // multiple searches on the same database
> > > > Search<BlastSearchResult> blastSearch = bsf.getSearch();
> > > >
> > > > // Simple search result object which returns an InputStream
> > > > SearchResult sr = blastSearch.execute(new File(test_data +
> > > >     "/blast/query/query"));
> > > >
> > > > InputStream is = sr.getResultStream();
> > > >
> > > > It seems to work okay on Linux; I haven't tested it on Windows.
> > > >
> > > > There's a little bit of JavaDoc I started work on but it's a little
> > > > bit messed up from where I've been changing things around.
> > > >
> > > > The source/jar/doc are all here. There are test cases too.
> > > >
> > > > http://www.students.ncl.ac.uk/michael.barton1/
> > > >
> > > > Mike
> > > >
> > > > _______________________________________________
> > > > biojava-dev mailing list
> > > > biojava-dev@biojava.org
> > > > http://biojava.org/mailman/listinfo/biojava-dev
> > > >
> > > 
> > 
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-dev
> > 
> > 
> > 
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> > 
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 
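
The enum-keyed argument idea discussed in this thread can be sketched with
ProcessBuilder (Java 1.5 or later). This is only an illustration, not Michael
Barton's actual classes: BlastCommandSketch, its Parameter enum, buildCommand
and start are hypothetical names, and the -p/-d/-i switches assume the legacy
NCBI blastall binary.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class BlastCommandSketch {

    /** Hypothetical switch set: only arguments listed here can ever be supplied. */
    public enum Parameter {
        PROGRAM("-p", true),
        DATABASE("-d", true),
        QUERY("-i", false);

        final String flag;
        final boolean required;

        Parameter(String flag, boolean required) {
            this.flag = flag;
            this.required = required;
        }
    }

    /** Builds the argument vector, failing early if a required switch is missing. */
    public static List<String> buildCommand(String binary, EnumMap<Parameter, String> args) {
        for (Parameter p : Parameter.values()) {
            if (p.required && !args.containsKey(p)) {
                throw new IllegalStateException("Missing required parameter: " + p);
            }
        }
        List<String> command = new ArrayList<String>();
        command.add(binary);
        // EnumMap iterates in enum declaration order, so the result is deterministic.
        for (Map.Entry<Parameter, String> e : args.entrySet()) {
            command.add(e.getKey().flag);
            command.add(e.getValue());
        }
        return command;
    }

    /** Starts the external process; the caller reads results from getInputStream(). */
    public static Process start(List<String> command) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        return pb.start();
    }

    public static void main(String[] argv) {
        EnumMap<Parameter, String> args = new EnumMap<Parameter, String>(Parameter.class);
        args.put(Parameter.DATABASE, "db/sargasso");
        args.put(Parameter.PROGRAM, "blastn");
        // Only prints the command; call start(...) to actually run the binary.
        System.out.println(buildCommand("blastall", args));
    }
}
```

Keying the arguments on an enum means a misspelt switch is a compile error
rather than a runtime failure of the binary, which is the benefit described
in the thread; bad argument values are still only caught when the binary runs.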

From great_fred at yahoo.com  Fri Jun 17 08:01:10 2005
From: great_fred at yahoo.com (Sébastien PETIT)
Date: Fri Jun 17 07:53:03 2005
Subject: [Biojava-l] Parse with HSPHandler ??
Message-ID: <20050617120110.6119.qmail@web32208.mail.mud.yahoo.com>

Hi everybody,

I am trying to understand how BioJava works and I am having a lot of
problems, maybe because I'm new to both Java and BioJava.

I have output files from the NCBI BLAST programs, which I can get in
text or XML format. I only want to keep the aligned sequences and the
name of the protein for each sequence.

I tried to use the HspHandler class and the "BlastParser" example given
by Mark Schreiber, but I don't get what I want, and I no longer know
how to proceed.

If this is not clear, I can try to explain better; just ask me.
(I'm French and not very good at English ;) )

Thank you for any answer.

Sebastien


From mark.schreiber at novartis.com  Sun Jun 19 21:01:29 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Sun Jun 19 20:53:03 2005
Subject: [Biojava-l] Parse with HSPHandler ??
Message-ID: <OFC36B08B3.4544A3E3-ON48257026.00059742-48257026.0005A183@EU.novartis.net>

What is it that you want from the BLAST record that you are not getting?

- Mark




From boehme at mpiib-berlin.mpg.de  Mon Jun 20 05:43:35 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Mon Jun 20 05:39:35 2005
Subject: [Biojava-l] _removeSequence
Message-ID: <42B68FC7.3060102@mpiib-berlin.mpg.de>

Hi,

I'm trying to delete a sequence and, recursively, all its features.

So:

for (SequenceIterator si = db.sequenceIterator(); si.hasNext();) {
	Sequence s = si.nextSequence();
	String name = s.getName();
	s = null;
	db.removeSequence(name);
}

But if I look in the database (MySQL 4.1.12) I can still see plenty 
of entries, and I have problems entering the same features again 
because of a duplicate key error. I would like to know whether 
_removeSequence(String) in BioSQLSequenceDB is supposed to remove 
features recursively, or just the features of the removed sequence. 
If the latter, what is the best way to delete the features of the 
features (and so on)? And how do I empty the db completely?

Martina

From mark.schreiber at novartis.com  Mon Jun 20 05:56:40 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Jun 20 05:48:18 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
Message-ID: <OF99C2BEB9.F7056E19-ON48257026.0036755E-48257026.0036A100@EU.novartis.net>

BioJava doesn't attempt to recursively remove features by itself. It relies 
on cascading deletes in the database. I know Oracle can be set to do this 
(and it works very well). If MySQL has equivalent functionality you may 
need to turn it on. I'm pretty sure it does, but you need to set it up.

- Mark




From mark.schreiber at novartis.com  Mon Jun 20 06:06:32 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Jun 20 05:58:14 2005
Subject: [Biojava-l] _removeSequence
Message-ID: <OF292A23AB.EA498551-ON48257026.00373EE8-48257026.00378820@EU.novartis.net>

To remove the database completely (while still keeping the tables etc) you 
would again need to turn on cascading deletes and delete the appropriate 
biodatabase row from the biodatabase table (or all of them if you have 
more than one).

You cannot currently do this using the biojava interface. You would need 
to code a JDBC statement to do it for you, or connect to the DB and issue 
the SQL statement yourself.

- Mark
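
A minimal JDBC sketch of the approach described above, assuming the standard
BioSQL biodatabase table with a name column and a schema created with ON
DELETE CASCADE foreign keys. BiodatabaseCleaner and deleteNamespace are
hypothetical names, not part of BioJava, and the caller supplies the
Connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BiodatabaseCleaner {

    /** Removes one namespace row; child rows vanish only if cascades are in place. */
    public static final String DELETE_BY_NAME = "DELETE FROM biodatabase WHERE name = ?";

    /** Removes every namespace row, leaving the tables themselves intact. */
    public static final String DELETE_ALL = "DELETE FROM biodatabase";

    /** Deletes the named biodatabase row and returns the number of rows removed. */
    public static int deleteNamespace(Connection conn, String name) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(DELETE_BY_NAME);
        try {
            ps.setString(1, name);
            return ps.executeUpdate();
        } finally {
            ps.close();
        }
    }
}
```

If auto-commit is off on the connection, the delete takes effect only after
conn.commit(). Without cascading foreign keys this removes just the
biodatabase row (or fails, depending on the constraints present).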




From hollandr at gis.a-star.edu.sg  Mon Jun 20 06:10:29 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Mon Jun 20 06:03:38 2005
Subject: [Biojava-l] RE: [BioSQL-l] _removeSequence
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>

Cascading deletes in MySQL require the tables to have been set up
using the InnoDB table type (as opposed to the default MyISAM tables).
In InnoDB, foreign keys are actually enforced and deletes will cascade,
whereas MyISAM has no concept of foreign keys and so is unable to
enforce data integrity. The people on the BioSQL-L mailing list will be
able to help you there.
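
One way to check whether the foreign keys and their delete rules are really
present is to ask the JDBC driver for its metadata. The sketch below uses the
standard java.sql.DatabaseMetaData.getImportedKeys call; CascadeCheck and its
methods are hypothetical names, and the table to inspect (for example
seqfeature) is an assumption about the BioSQL schema in use.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;

public class CascadeCheck {

    /** Translates a JDBC DELETE_RULE code into a readable label. */
    public static String describeDeleteRule(int rule) {
        switch (rule) {
            case DatabaseMetaData.importedKeyCascade:    return "CASCADE";
            case DatabaseMetaData.importedKeySetNull:    return "SET NULL";
            case DatabaseMetaData.importedKeyRestrict:   return "RESTRICT";
            case DatabaseMetaData.importedKeyNoAction:   return "NO ACTION";
            case DatabaseMetaData.importedKeySetDefault: return "SET DEFAULT";
            default:                                     return "UNKNOWN(" + rule + ")";
        }
    }

    /** Prints each foreign key declared on the table and returns how many were found. */
    public static int printForeignKeys(Connection conn, String table) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        ResultSet rs = meta.getImportedKeys(conn.getCatalog(), null, table);
        int count = 0;
        try {
            while (rs.next()) {
                System.out.println(table + "." + rs.getString("FKCOLUMN_NAME")
                        + " -> " + rs.getString("PKTABLE_NAME")
                        + " ON DELETE " + describeDeleteRule(rs.getShort("DELETE_RULE")));
                count++;
            }
        } finally {
            rs.close();
        }
        return count;
    }
}
```

A count of zero suggests that the table has no foreign keys at all (for
instance a MyISAM table, or a schema script that did not run to completion),
in which case no delete will cascade.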

The next version of BioJava's database interfaces after the 1.4 release
will assume that the underlying database does have cascading deletes
turned on. The existing version half-attempts to make up for the lack of
cascading deletes in databases that don't support it, but it doesn't do
it well at all, hence the problems you are seeing. After consulting with
Hilmar last week we decided it was a fair assumption to make that all
BioSQL instances are installed with cascading deletes enabled.
BioPerl-db already makes this assumption.

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199

From hollandr at gis.a-star.edu.sg  Mon Jun 20 06:11:57 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Mon Jun 20 06:05:31 2005
Subject: [BioSQL-l] Re: [Biojava-l] _removeSequence
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB7A@BIONIC.biopolis.one-north.com>

There is also the BS-zap-all script in the BioSQL distribution which
will wipe the whole lot for you in one go. :)

Richard Holland
Bioinformatics Specialist
GIS extension 8199

From boehme at mpiib-berlin.mpg.de  Mon Jun 20 06:20:37 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Mon Jun 20 06:24:35 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>
Message-ID: <42B69875.3050306@mpiib-berlin.mpg.de>

My tables are all InnoDB tables and in the biosqldb-mysql.sql (v 1.40 
2004/11/04 01:49:41) which created them, it says ON DELETE CASCADE.
Do I need to do anything else?

Thanks,
Martina

From hollandr at gis.a-star.edu.sg  Mon Jun 20 06:33:02 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Mon Jun 20 06:26:17 2005
Subject: [Biojava-l] RE: [BioSQL-l] _removeSequence
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB80@BIONIC.biopolis.one-north.com>

Well, technically that should work because BioJava simply issues a
delete against the seqfeature table, and therefore all features related
through foreign keys should automatically delete themselves as a result
without any further intervention by BioJava... beats me why it doesn't!
Unfortunately I don't currently use the MySQL implementation myself so I
can't help much. I hope someone on BioSQL-L knows a little more?

Richard Holland
Bioinformatics Specialist
GIS extension 8199

From boehme at mpiib-berlin.mpg.de  Mon Jun 20 09:11:25 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Mon Jun 20 09:05:22 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB80@BIONIC.biopolis.one-north.com>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB80@BIONIC.biopolis.one-north.com>
Message-ID: <42B6C07D.7000106@mpiib-berlin.mpg.de>

I dropped the db and ran the BioSQL script again - looks like it's working now!
It must have stopped before the ALTER TABLE statements, so the foreign keys
were missing - I didn't know that they had to be there.
Thanks!
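
One way to confirm that a schema load got as far as the foreign keys is to
ask the JDBC driver which keys a table imports and what their delete rule
is. The following is only a sketch under assumptions: the class name
CascadeCheck is made up, the connection settings come from the command line,
and seqfeature is used as the example table. It is not part of BioJava.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

class CascadeCheck {

    /** Translates the DELETE_RULE value reported by JDBC into readable text. */
    static String describeDeleteRule(int rule) {
        switch (rule) {
            case DatabaseMetaData.importedKeyCascade:    return "CASCADE";
            case DatabaseMetaData.importedKeySetNull:    return "SET NULL";
            case DatabaseMetaData.importedKeySetDefault: return "SET DEFAULT";
            case DatabaseMetaData.importedKeyRestrict:   return "RESTRICT";
            case DatabaseMetaData.importedKeyNoAction:   return "NO ACTION";
            default:                                     return "unknown (" + rule + ")";
        }
    }

    /** Prints every foreign key the table imports and returns how many were found. */
    static int printForeignKeys(Connection conn, String table) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        int count = 0;
        try (ResultSet rs = meta.getImportedKeys(conn.getCatalog(), null, table)) {
            while (rs.next()) {
                System.out.println(table + "." + rs.getString("FKCOLUMN_NAME")
                        + " -> " + rs.getString("PKTABLE_NAME")
                        + " ON DELETE " + describeDeleteRule(rs.getShort("DELETE_RULE")));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: java CascadeCheck <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            // seqfeature is just the example table; any BioSQL table name works here
            if (printForeignKeys(conn, "seqfeature") == 0) {
                System.out.println("No foreign keys found: deletes will not cascade.");
            }
        }
    }
}
```

If the list comes back empty, the foreign key part of the schema script
probably never ran, which would match the symptom described here.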

Richard HOLLAND wrote:

> Well, technically that should work because BioJava simply issues a
> delete against the seqfeature table, and therefore all features related
> through foreign keys should automatically delete themselves as a result
> without any further intervention by BioJava... beats me why it doesn't!
> Unfortunately I don't currently use the MySQL implementation myself so I
> can't help much. I hope someone on BioSQL-L knows a little more?
> 
> Richard Holland
> Bioinformatics Specialist
> GIS extension 8199
> 
> 
> 
>>-----Original Message-----
>>From: Martina [mailto:boehme@mpiib-berlin.mpg.de] 
>>Sent: Monday, June 20, 2005 6:21 PM
>>To: Richard HOLLAND
>>Cc: biosql-l-bounces@portal.open-bio.org; BioJava; 
>>biosql-l@open-bio.org
>>Subject: Re: [BioSQL-l] _removeSequence
>>
>>
>>My tables are all InnoDB tables and in the biosqldb-mysql.sql (v 1.40 
>>2004/11/04 01:49:41) which created them, it says ON DELETE CASCADE.
>>Do I need to do anything else?
>>
>>Thanks,
>>Martina
>>
>>Richard HOLLAND wrote:
>>
>>
>>>To do cascading deletes in MySQL requires the tables to have been set up
>>>using the InnoDB table style (as opposed to the default MyISAM tables).
>>>In InnoDB, foreign keys are actually enforced and deletes will cascade,
>>>whereas in MyISAM it has no concept of foreign keys and so is unable to
>>>enforce data integrity. The people on the BioSQL-L mailing list will be
>>>able to help you there.
>>>
>>>The next version of BioJava's database interfaces after the 1.4 release
>>>will assume that the underlying database does have cascading deletes
>>>turned on. The existing version half-attempts to make up for the lack of
>>>cascading deletes in databases that don't support it, but it doesn't do
>>>it well at all, hence the problems you are seeing. After consulting with
>>>Hilmar last week we decided it was a fair assumption to make that all
>>>BioSQL instances are installed with cascading deletes enabled.
>>>BioPerl-db already makes this assumption.
>>>
>>>cheers,
>>>Richard
>>>
>>>Richard Holland
>>>Bioinformatics Specialist
>>>GIS extension 8199
>>>
>>>
>>>
>>>
>>>>-----Original Message-----
>>>>From: biosql-l-bounces@portal.open-bio.org 
>>>>[mailto:biosql-l-bounces@portal.open-bio.org] On Behalf Of 
>>>>mark.schreiber@novartis.com
>>>>Sent: Monday, June 20, 2005 5:57 PM
>>>>To: Martina
>>>>Cc: biosql-l-bounces@portal.open-bio.org; BioJava; 
>>>>biosql-l@open-bio.org
>>>>Subject: Re: [BioSQL-l] _removeSequence
>>>>
>>>>
>>>>Biojava doesn't attempt to recursively remove features by itself. It relies
>>>>on cascading deletes in the database. I know Oracle can be set to do this
>>>>(and it works very well). If MySQL has equivalent functionality you may
>>>>need to turn it on. I'm pretty sure it does but you need to set it up.
>>>>
>>>>- Mark
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>Martina <boehme@mpiib-berlin.mpg.de>
>>>>Sent by: biosql-l-bounces@portal.open-bio.org
>>>>06/20/2005 05:43 PM
>>>>
>>>>
>>>>       To:     biosql-l@open-bio.org, BioJava <biojava-l@biojava.org>
>>>>       cc:     (bcc: Mark Schreiber/GP/Novartis)
>>>>       Subject:        [BioSQL-l] _removeSequence
>>>>
>>>>
>>>>Hi,
>>>>
>>>>I'm trying to delete a sequence and, recursively, all its features.
>>>>
>>>>So:
>>>>
>>>>for (SequenceIterator si = db.sequenceIterator(); si.hasNext();) {
>>>>                Sequence s = si.nextSequence();
>>>>                String name = s.getName();
>>>>                s = null;
>>>>                db.removeSequence(name);
>>>>}
>>>>
>>>>But if I look in the database (MySQL 4.1.12) I can still see plenty
>>>>of entries and I have problems entering the same features again,
>>>>because of a duplicate key error. I would like to know if
>>>>_removeSequence(String) in BioSQLSequenceDB is supposed to remove
>>>>features recursively or just the features of the removed sequence?
>>>>If so - what is the best way to delete the features of the features
>>>>(and so on)? And how to empty the db completely?
>>>>
>>>>Martina
>>>>
From boehme at mpiib-berlin.mpg.de  Mon Jun 20 11:20:35 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Mon Jun 20 11:20:27 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>
Message-ID: <42B6DEC3.9090807@mpiib-berlin.mpg.de>

Hi,

So I have this new database (still biosqldb-mysql.sql v 1.40 2004/11/04 
01:49:41) and after removing all sequences, I still have entries in 
term, term_relationship, term_relationship_term and ontology. And of 
course, in biodatabase. If I delete the entry in biodatabase too, 
nothing changes. Is that to be expected?
I ask because I still have trouble with the duplicate key entry, but that 
must be my code then.

Thanks
Martina
From jesse-t at chello.nl  Mon Jun 20 19:36:25 2005
From: jesse-t at chello.nl (Jesse)
Date: Mon Jun 20 19:28:12 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
Message-ID: <20050620233623.ZBAZ1226.amsfep20-int.chello.nl@anonymous>

I found some strange things when using RestrictionEnzymeManager to get
Restriction Enzymes (RE's) from REBASE.

When I change a specific name of a RE in the REBASE file, it will crash.

Steps to prepare:
-Download the latest REBASE file from
http://rebase.neb.com/rebase/link_withrefm
-Rename it to rebase_common.dat
-Overwrite it on the default smaller rebase_common.dat which is in the
BioJava classpath org/biojava/bio/molbio/

For example:
When I change
<1>XmnI
to
<1>XmbbnI
it crashes. When I change it back again (using the same text editor), it
works again.

Is the RE name in the "<1>" field linked to other fields like "<2>", so
that the reader then finds that a RE name is missing?

Another strange thing is that when I remove some RE entries (the lines from
<1> to <8>, including the empty separator line after them), it crashes, even
though a hex editor shows that only the entry is removed and no other
newline characters etc. So the format is still the same.

Does anybody know what causes these problems? Or did I do something
wrong?

Thanks,

Jesse


-------- Error ---------
Exception in thread "main" org.biojava.bio.BioError: Failed to read REBASE data file
	at org.biojava.bio.molbio.RestrictionEnzymeManager.loadData(RestrictionEnzymeManager.java:415)
	at org.biojava.bio.molbio.RestrictionEnzymeManager.<clinit>(RestrictionEnzymeManager.java:136)
	at RETools.printAllRE(RETools.java:32)
	at RETools.main(RETools.java:15)
Caused by: java.lang.NullPointerException
	at org.biojava.utils.SmallSet.contains(SmallSet.java:68)
	at org.biojava.utils.SmallSet.add(SmallSet.java:81)
	at org.biojava.bio.molbio.RestrictionEnzymeManager.loadData(RestrictionEnzymeManager.java:407)
	... 3 more
-------------------------

From mark.schreiber at novartis.com  Mon Jun 20 21:35:09 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Jun 20 21:27:27 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
Message-ID: <OF925ECD5E.D0806B4E-ON48257027.000867BB-48257027.0008B6CB@EU.novartis.net>

Hi -

When you say crash do you mean blue-screen-of-death type crash, 
chernobyl-type-meltdown or just a throws-an-exception and exits? If the 
latter please paste in your stack trace so we can figure out what 
happened. 

Also your JVM, OS, BioJava version would be good. Please also make sure 
you are using the latest biojava version (1.4pre2).

Thanks,

- Mark



From jesse-t at chello.nl  Mon Jun 20 23:39:56 2005
From: jesse-t at chello.nl (Jesse)
Date: Mon Jun 20 23:31:33 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
Message-ID: <20050621033955.JXSL1231.amsfep18-int.chello.nl@anonymous>

Hi Mark,

With "crash" I mean an exception.

This one:
-------- Exception  ---------
Exception in thread "main" org.biojava.bio.BioError: Failed to read REBASE data file
	at org.biojava.bio.molbio.RestrictionEnzymeManager.loadData(RestrictionEnzymeManager.java:415)
	at org.biojava.bio.molbio.RestrictionEnzymeManager.<clinit>(RestrictionEnzymeManager.java:136)
	at RETools.printAllRE(RETools.java:32)
	at RETools.main(RETools.java:15)
Caused by: java.lang.NullPointerException
	at org.biojava.utils.SmallSet.contains(SmallSet.java:68)
	at org.biojava.utils.SmallSet.add(SmallSet.java:81)
	at org.biojava.bio.molbio.RestrictionEnzymeManager.loadData(RestrictionEnzymeManager.java:407)
	... 3 more
-------------------------

OS: Microsoft Windows XP Professional SP 2 [Version 5.1.2600]
Java:
java version "1.5.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_02-b09)
Java HotSpot(TM) Client VM (build 1.5.0_02-b09, mixed mode, sharing)
BioJava: BioJava 1.4pre2

The problem I described happens when calling
RestrictionEnzymeManager.getAllEnzymes() on a modified REBASE file.
What I modified (as test) was only the name "<1>XmnI" to "<1>XmbnI" (line
35343 of REBASE format 31, version 506).

Sometimes the exception also occurs when removing some specific restriction
enzyme entries (from <1> to <8> including the trailing empty line).


From mark.schreiber at novartis.com  Tue Jun 21 01:45:49 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun 21 01:37:19 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
Message-ID: <OFD454BB21.D398DC03-ON48257027.001F54EF-48257027.001FA9CE@EU.novartis.net>

I guess the other question I would ask is, should it crash?

By modifying the file are you fundamentally changing the format? The 
NullPointerException seems to suggest that you inserted something that 
doesn't have a matching record (or some similar problem, I'm not familiar 
with REBASE).

Try taking a look into the RestrictionEnzymeManager code at the root of 
the exception to give you some clues what might be going wrong. It's hard 
to tell if this is actually a bug or if you have incorrectly modified the 
file.

- Mark



From franckv at ebi.ac.uk  Tue Jun 21 03:50:26 2005
From: franckv at ebi.ac.uk (Franck Valentin)
Date: Tue Jun 21 03:41:54 2005
Subject: [Biojava-l] Use of 'LabelledSequenceRenderer' and
	'FeatureLabelRenderer'
Message-ID: <1119340226.12636.2185.camel@pongo.ebi.ac.uk>

Hi,

I would like to display graphically a feature table in the usual form
like this one :

              label1            label2
feature1     <--------->       <--------->

                 label3         label4
feature2        <---------->   <-------------->

.....

I've adapted FastBeadDemo.java and tried to use the class
'LabelledSequenceRenderer' to display the names of the features and
'FeatureLabelRenderer' to display the labels, but after several different
attempts neither class displays anything.

I haven't seen any use of these classes in the demos. Has anyone
already used them, and do you know where I can find usage examples?


Thanks

Franck
From boehme at mpiib-berlin.mpg.de  Tue Jun 21 05:46:22 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Tue Jun 21 05:37:59 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
In-Reply-To: <78e39420822012ffbf691b5edc233b4a@gnf.org>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB80@BIONIC.biopolis.one-north.com>
	<78e39420822012ffbf691b5edc233b4a@gnf.org>
Message-ID: <42B7E1EE.5090505@mpiib-berlin.mpg.de>

Hi Hilmar,

I wasn't aware of 2 different types of features.
I'm making features as described in 
http://www.biojava.org/docs/bj_in_anger/feature.htm, and as far as I 
can tell from the results, it's the first type you describe.
The second type of feature is confusing me: as I understood the 
feature relationships, the graph is a tree, with only one parent for a 
given feature, so if that feature is deleted, shouldn't all its children 
get deleted too?

Martina


Hilmar Lapp wrote:

> There's one thing that I'm unsure about in Martina's original email, 
> namely whether she was referring to features related to a sequence 
> (bioentry), or to features hierarchically related to each other through 
> the seqfeature_relationship table.
> 
> If the former, then the cascading delete should have taken care of 
> removing the features when you remove the sequence (bioentry) to which 
> they point through their foreign key (and recursively the locations etc).
> 
> However, if the question was about hierarchical features, then deleting 
> one feature in the hierarchy will never (and shouldn't ever) delete any 
> other feature in the hierarchy (except if all of them reference the same 
> bioentry and you deleted the bioentry). If you delete a seqfeature in a 
> hierarchy of seqfeatures then by cascading delete this will also delete 
> all rows in seqfeature_relationship that reference that seqfeature as 
> either a subject or an object in a nesting relationship between 
> features. I.e., looking at the hierarchy as a graph, removing a node 
> will cascade to deleting all incoming and outgoing arcs for that node, 
> but not other nodes.
> 
> If your application wants to take down all nodes in the hierarchy when 
> one node is deleted, you need to write code to do this. (Except if, as 
> mentioned before, all features reference the same bioentry, in which 
> case deleting the bioentry will delete the entire feature hierarchy.)
> 
>     -hilmar
> 
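
The distinction Hilmar draws above can be sketched in a few lines of plain
Java. This is an in-memory model only, not BioJava or BioSQL code: the class
FeatureHierarchy and its methods are made up for illustration, integer ids
stand in for seqfeature rows, and Arc stands in for a row in
seqfeature_relationship. deleteFeature() mimics what the cascade does (the
node and its arcs go), while deleteSubtree() is the extra step an
application would have to write itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FeatureHierarchy {
    /** One parent -> child arc, standing in for a seqfeature_relationship row. */
    static final class Arc {
        final int parent;
        final int child;
        Arc(int parent, int child) { this.parent = parent; this.child = child; }
    }

    final Set<Integer> features = new LinkedHashSet<>();
    final List<Arc> arcs = new ArrayList<>();

    void addFeature(int id) { features.add(id); }
    void relate(int parent, int child) { arcs.add(new Arc(parent, child)); }

    /** Cascade behaviour: the node goes, and so does every arc touching it. */
    void deleteFeature(int id) {
        features.remove(id);
        arcs.removeIf(a -> a.parent == id || a.child == id);
    }

    /** Collects the root and everything reachable from it via parent -> child arcs. */
    Set<Integer> subtree(int root) {
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            int current = todo.pop();
            if (!seen.add(current)) continue; // already visited
            for (Arc a : arcs) {
                if (a.parent == current) todo.push(a.child);
            }
        }
        return seen;
    }

    /** Application-level delete: find the whole subtree first, then remove each node. */
    void deleteSubtree(int root) {
        for (int id : subtree(root)) deleteFeature(id);
    }

    public static void main(String[] args) {
        FeatureHierarchy h = new FeatureHierarchy();
        for (int id = 1; id <= 4; id++) h.addFeature(id);
        h.relate(1, 2);
        h.relate(2, 3);
        h.relate(1, 4);

        h.deleteFeature(2);
        System.out.println(h.features); // [1, 3, 4]: feature 3 survives, now unattached

        h.deleteSubtree(1);
        System.out.println(h.features); // [3]: 1 and 4 are gone, 3 was no longer reachable
    }
}
```

Note that the subtree has to be collected before anything is deleted, because
deleting a node also removes the arcs needed to find its children.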
From boehme at mpiib-berlin.mpg.de  Tue Jun 21 06:10:16 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Tue Jun 21 06:02:40 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
In-Reply-To: <f5bb76b54331dc88107ebde4bee3dc46@gnf.org>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>
	<42B6DEC3.9090807@mpiib-berlin.mpg.de>
	<f5bb76b54331dc88107ebde4bee3dc46@gnf.org>
Message-ID: <42B7E788.3040205@mpiib-berlin.mpg.de>


> Yes. When you insert a sequence you must be prepared that when inserting 
> its ontology term or tag/value annotation the term may already be 
> present because another bioentry uses it too.

OK, so the proper way is to catch the SQLException in BIOSQLFeature, test 
whether it is a duplicate key entry, get the identifier of the term (would 
that be the BioSQLfeatureId?) and insert it in the term_relationship 
table? And there is no nice BioJava method for this, so I have to do it 
"manually", with conn.prepareStatement(..) and so on? BioJava spoiled 
me so!

Martina
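
For the "test whether it is a duplicate key entry" step, a small helper along
these lines might do. It is a sketch, not a BioJava method: the class name
DuplicateKeyCheck is made up, and the check assumes MySQL, which reports a
duplicate entry with vendor error code 1062 and an SQLSTATE in class 23
(integrity constraint violation). Other databases use other vendor codes.

```java
import java.sql.SQLException;

class DuplicateKeyCheck {
    /** MySQL's vendor error code for "Duplicate entry ... for key ...". */
    static final int MYSQL_DUPLICATE_ENTRY = 1062;

    /** True if the exception looks like a MySQL duplicate key violation. */
    static boolean isDuplicateKey(SQLException e) {
        String state = e.getSQLState();
        // SQLSTATE class 23 means "integrity constraint violation"
        boolean integrityViolation = state != null && state.startsWith("23");
        return integrityViolation && e.getErrorCode() == MYSQL_DUPLICATE_ENTRY;
    }

    public static void main(String[] args) {
        // Hand-built exceptions so the check can be tried without a database
        SQLException dup = new SQLException("Duplicate entry 'x' for key 2", "23000", 1062);
        SQLException other = new SQLException("Table doesn't exist", "42S02", 1146);
        System.out.println(isDuplicateKey(dup));   // true
        System.out.println(isDuplicateKey(other)); // false
    }
}
```

An alternative that avoids relying on the exception at all is to SELECT the
term first and only INSERT when nothing comes back.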
From jesse-t at chello.nl  Tue Jun 21 09:06:12 2005
From: jesse-t at chello.nl (Jesse)
Date: Tue Jun 21 08:57:50 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
In-Reply-To: <OFD454BB21.D398DC03-ON48257027.001F54EF-48257027.001FA9CE@EU.novartis.net>
Message-ID: <20050621130611.TSND24432.amsfep19-int.chello.nl@anonymous>

I think I found the problem.

The Restriction Enzyme name (<1>) of an entry can be linked to the
isoschizomers field (<2>) of other entries. So when I remove an entry, I
also have to remove those names in the isoschizomers field of other entries.

So it's not a bug.

- Jesse
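
A hand-edited REBASE file can be checked for this kind of dangling reference
before it is handed to BioJava. The sketch below is an illustration under
assumptions: it takes "<1>" lines as enzyme names and "<2>" lines as
comma-separated isoschizomer names, as described above, and the class name
IsoschizomerCheck and the two-entry sample are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IsoschizomerCheck {
    /** Returns names used in <2> fields that have no <1> entry of their own. */
    static Set<String> danglingNames(List<String> lines) {
        Set<String> defined = new LinkedHashSet<>();
        List<String> referenced = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("<1>")) {
                defined.add(line.substring(3).trim());
            } else if (line.startsWith("<2>")) {
                for (String name : line.substring(3).split(",")) {
                    if (!name.trim().isEmpty()) referenced.add(name.trim());
                }
            }
        }
        Set<String> dangling = new LinkedHashSet<>(referenced);
        dangling.removeAll(defined);
        return dangling;
    }

    public static void main(String[] args) {
        // Made-up excerpt: the first entry was renamed, so the reference to XmnI dangles
        List<String> lines = Arrays.asList(
                "<1>XmbnI", "<2>Asp700I", "",
                "<1>Asp700I", "<2>XmnI", "");
        System.out.println(danglingNames(lines)); // [XmnI]
    }
}
```

For a real file, the same method can be fed the lines read from
rebase_common.dat; an empty result means every isoschizomer name resolves.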


-----Original Message-----
From: mark.schreiber@novartis.com [mailto:mark.schreiber@novartis.com] 
Sent: Tuesday, June 21, 2005 7:46
To: Jesse
CC: biojava-l@biojava.org; biojava-l-bounces@portal.open-bio.org
Subject: Re: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?

I guess the other question I would ask is, should it crash?

By modifying the file are you fundamentally changing the format? The 
NullPointerException seems to suggest that you inserted something that 
doesn't have a matching record (or some similar problem, I'm not familiar 
with REBASE).

Try taking a look into the RestrictionEnzymeManager code at the root of 
the exception to give you some clues what might be going wrong. It's hard 
to tell if this is actually a bug or if you have incorrectly modified the 
file.

- Mark



From boehme at mpiib-berlin.mpg.de  Tue Jun 21 09:55:15 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Tue Jun 21 09:52:33 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
In-Reply-To: <0be3992b92f6a14b6d06d5a06549555b@gnf.org>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>
	<42B6DEC3.9090807@mpiib-berlin.mpg.de>
	<f5bb76b54331dc88107ebde4bee3dc46@gnf.org>
	<42B7E788.3040205@mpiib-berlin.mpg.de>
	<0be3992b92f6a14b6d06d5a06549555b@gnf.org>
Message-ID: <42B81C43.9010404@mpiib-berlin.mpg.de>

Does that mean that I can't have 2 features referring to the same bioentry 
with the same type (= type_term_id) and source (= source_term_id) but 
different parent features, because of the composite key containing 
bioentry_id in the seqfeature table? And what does "rank" in that table 
mean (it's part of that key), and how can I get different ranks?

Martina

Hilmar Lapp wrote:

> The Biojava people will respond to this. Note though that 
> Term_Relationship is for storing subject-predicate-object triples of 
> terms, so I'm not sure why you want to use it for storing/associating 
> annotation. Maybe you meant bioentry_qualifier_value?
> 
>     -hilmar
> 
> On Jun 21, 2005, at 3:10 AM, Martina wrote:
> 
>>
>>> Yes. When you insert a sequence you must be prepared that when 
>>> inserting its ontology term or tag/value annotation the term may 
>>> already be present because another bioentry uses it too.
>>
>>
>> Ok, the proper way is to catch the SQLException in BIOSQLFeature, test 
>> if it is a Dublicate key entry, get the identifier of the term (would 
>> that be the BioSQLfeatureId ?) and insert it in the term_relationship 
>> table? And there is no nice BioJava method for this, I have to do it 
>> "manually", like conn.prepareStatement(..) and stuff?  BioJava spoiled 
>> me so!
>>
>> Martina
>>
From gwaldon at geneinfinity.org  Tue Jun 21 12:12:53 2005
From: gwaldon at geneinfinity.org (george waldon)
Date: Tue Jun 21 12:05:11 2005
Subject: RE: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
Message-ID: <200506211612.j5LGCrQp078068@mmm1924.dulles19-verio.com>

Of course it's a bug and I reported it a while ago:

Dated from Wed 5/11/2005 11:31 AM
"There is also a bug I found a while ago. In RestrictionEnzymeManager.java, around 2/3 down, put

for (Iterator ii = isoschizomers.iterator(); ii.hasNext();) {
    String isoName = (String) ii.next();
    Object re = nameToEnzyme.get(isoName);
    if(re!=null)
        tempSet.add(re);
}

helps to deal with isoschizomers."
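
The guard in the snippet above can be tried outside BioJava with a small stand-alone sketch. The class name IsoschizomerLookupSketch, the resolve method and the string placeholders for enzymes are hypothetical; the real field types in RestrictionEnzymeManager may differ. The assumption, taken from this thread, is that a name listed in the isoschizomers field may have no matching entry, so the map lookup returns null.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the lookup inside RestrictionEnzymeManager.
class IsoschizomerLookupSketch {

    // Collect the enzymes the given names resolve to, skipping names
    // that have no entry in the map instead of adding null.
    static Set<Object> resolve(List<String> isoschizomers,
                               Map<String, Object> nameToEnzyme) {
        Set<Object> tempSet = new HashSet<>();
        for (String isoName : isoschizomers) {
            Object re = nameToEnzyme.get(isoName);
            if (re != null) {   // the guard from the reported fix
                tempSet.add(re);
            }
        }
        return tempSet;
    }

    public static void main(String[] args) {
        Map<String, Object> nameToEnzyme = new HashMap<>();
        nameToEnzyme.put("BamHI", "enzyme:BamHI");
        // "XmnI" stands for a name that was renamed or removed in the file.
        Set<Object> found = resolve(Arrays.asList("BamHI", "XmnI"), nameToEnzyme);
        System.out.println(found);
    }
}
```

A regression test could feed a REBASE file with one dangling isoschizomer name and check that loading still succeeds.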

A means to track bugs would be nice, but more important, I think, would be a searchable mail archive. I remember there is a way to search the biojava archive somewhere, but I couldn't find it on the biojava web site. It would be nice to have a link on the site.

- George

-----Original Message-----
From: biojava-l-bounces@portal.open-bio.org [mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of Jesse
Sent: Tuesday, June 21, 2005 6:06 AM
To: biojava-l@biojava.org
Subject: RE: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?

I think I found the problem.

The Restriction Enzyme name (<1>) of an entry can be linked to the
isoschizomers field (<2>) of other entries. So when I remove an entry, I
also have to remove those names in the isoschizomers field of other entries.

So it's not a bug.

- Jesse
From simon.foote at nrc-cnrc.gc.ca  Tue Jun 21 12:15:45 2005
From: simon.foote at nrc-cnrc.gc.ca (Simon Foote)
Date: Tue Jun 21 12:06:44 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
In-Reply-To: <42B81C43.9010404@mpiib-berlin.mpg.de>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>	<42B6DEC3.9090807@mpiib-berlin.mpg.de>	<f5bb76b54331dc88107ebde4bee3dc46@gnf.org>	<42B7E788.3040205@mpiib-berlin.mpg.de>	<0be3992b92f6a14b6d06d5a06549555b@gnf.org>
	<42B81C43.9010404@mpiib-berlin.mpg.de>
Message-ID: <42B83D31.2000403@nrc-cnrc.gc.ca>

Hi Martina,

In fact you can, as rank is the field that allows this to happen.  In 
Biojava, currently it's just a linearly incremented number, so that 
you can have the same type and source IDs for a given bioentry.

For example, adding a Genbank entry with 10 CDS features for 1 bioentry 
will give you identical keys for bioentry_id, type_term_id and 
source_term_id, but will have a rank of 1 - 10 for each.
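
To make the role of rank concrete, here is a toy model of the key as described in this thread (bioentry_id, type_term_id, source_term_id, rank). It is only a sketch: SeqFeatureRankSketch and its methods are made-up names, it uses an in-memory set rather than a database, and it is not how Biojava itself assigns ranks.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the unique key on seqfeature; not BioJava or BioSQL code.
class SeqFeatureRankSketch {
    private final Set<List<Integer>> keys = new HashSet<>();

    // Returns false if the combination already exists,
    // which is where a database would raise a duplicate-key error.
    boolean insert(int bioentryId, int typeTermId, int sourceTermId, int rank) {
        return keys.add(Arrays.asList(bioentryId, typeTermId, sourceTermId, rank));
    }

    // Insert with the first unused rank, counting up from 1, and return it.
    int insertWithNextRank(int bioentryId, int typeTermId, int sourceTermId) {
        int rank = 1;
        while (!insert(bioentryId, typeTermId, sourceTermId, rank)) {
            rank++;
        }
        return rank;
    }

    public static void main(String[] args) {
        SeqFeatureRankSketch table = new SeqFeatureRankSketch();
        // Ten CDS features on one bioentry: same type and source, ranks 1 to 10.
        for (int i = 0; i < 10; i++) {
            System.out.println("rank " + table.insertWithNextRank(42, 7, 3));
        }
        // Re-using an existing rank collides on the key.
        System.out.println("duplicate accepted? " + table.insert(42, 7, 3, 1));
    }
}
```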

Simon

Martina wrote:

> That means, that I can't have 2 features refering to the same bioentry 
> with the same type (= type_term_id)and source (=source_term_id) but 
> different parent features because of the composite key bioentry_id in 
> the seqfeature table? Or what does "rank" in that table mean (its part 
> of that key), how can I get different ranks?
>
> Martina
>
> Hilmar Lapp wrote:
>
>> The Biojava people will respond to this. Note though that 
>> Term_Relationship is for storing subject-predicate-object triples of 
>> terms, so I'm not sure why you want to use it for storing/associating 
>> annotation. Maybe you meant bioentry_qualifier_value?
>>
>>     -hilmar
>>
>> On Jun 21, 2005, at 3:10 AM, Martina wrote:
>>
>>>
>>>> Yes. When you insert a sequence you must be prepared that when 
>>>> inserting its ontology term or tag/value annotation the term may 
>>>> already be present because another bioentry uses it too.
>>>
>>>
>>>
>>> Ok, the proper way is to catch the SQLException in BIOSQLFeature, 
>>> test if it is a Dublicate key entry, get the identifier of the term 
>>> (would that be the BioSQLfeatureId ?) and insert it in the 
>>> term_relationship table? And there is no nice BioJava method for 
>>> this, I have to do it "manually", like conn.prepareStatement(..) and 
>>> stuff?  BioJava spoiled me so!
>>>
>>> Martina
>>>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l


-- 
Bioinformatics Programmer
Pathogen Genomics
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561  [F] 613-952-9092
simon.foote@nrc-cnrc.gc.ca

From kturner at idtdna.com  Tue Jun 21 15:17:02 2005
From: kturner at idtdna.com (Keith Turner)
Date: Tue Jun 21 15:08:33 2005
Subject: [Biojava-l] Using SeqIOTools in a JNLP context
Message-ID: <03D1119D99B98D4D9762E01F1D4FB980010FA82F@EXCHANGE.idtdna.com>

Hello-

I am new to the list.  I enjoy working with the Biojava API, but a problem has arisen for me, and I need some help with it.  I am developing an application to be used in the Java Webstart framework, and this brings with it some interesting file permission issues.  Basically, you use the JNLP interface FileOpenService to open a file from within the secure "sandbox" environment, and then you can get an InputStream out of that.

So I want to take this InputStream (which presumably is from a Fasta file), and read a DNA sequence from it.  However, all the methods that worked when I was running my software as a Java application no longer work in the JNLP environment.  In the past, I was doing:
  InputStreamReader fr = new InputStreamReader(in);
  BufferedReader br = new BufferedReader(fr);
  SequenceIterator stream = SeqIOTools.readFastaDNA(br);
  Sequence seq = stream.nextSequence();
But the program freezes on the SeqIOTools.readFastaDNA(br) call.  No exception is thrown back, it just does nothing.  Does anyone have any suggestions as to how I can solve or work around this problem?  Thank you very much
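
One way to narrow this down, offered only as a sketch, is to copy the stream into memory first and then parse from the buffer. If the copy never returns, the stall is in the stream rather than in the FASTA parser. StreamDrainSketch and its methods are hypothetical names, and the in-memory stream in main is a stand-in for the one obtained from FileOpenService.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamDrainSketch {

    // Copy everything from the stream into a byte array.
    static byte[] drain(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Wrap the buffered bytes in a reader that a parser can consume.
    static BufferedReader readerOver(byte[] data) {
        return new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.US_ASCII));
    }

    // Convenience check: the first line of the buffered data.
    static String firstLine(byte[] data) {
        try {
            return readerOver(data).readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the stream returned by the JNLP file service.
        InputStream in = new ByteArrayInputStream(
                ">seq1\nacgt\n".getBytes(StandardCharsets.US_ASCII));
        byte[] data = drain(in);
        System.out.println("read " + data.length + " bytes");
        System.out.println("first line: " + firstLine(data));
        // readerOver(data) could then be passed to SeqIOTools.readFastaDNA(...)
        // in place of the reader built directly on the original stream.
    }
}
```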

-Keith Turner

From ap3 at sanger.ac.uk  Tue Jun 21 18:08:19 2005
From: ap3 at sanger.ac.uk (Andreas Prlic)
Date: Tue Jun 21 17:57:40 2005
Subject: [Biojava-l] Using SeqIOTools in a JNLP context
In-Reply-To: <03D1119D99B98D4D9762E01F1D4FB980010FA82F@EXCHANGE.idtdna.com>
References: <03D1119D99B98D4D9762E01F1D4FB980010FA82F@EXCHANGE.idtdna.com>
Message-ID: <8a9eaf07220858b58692215950175e85@sanger.ac.uk>

Hi Keith,

You should get a java.security.AccessControlException: access denied 
from webstart.
Accessing the filesystem from an application started with webstart 
requires special permission. This means you have to sign your 
application and the user has to permit the execution.

see e.g.
http://java.sun.com/docs/books/tutorial/security1.2/toolsign/signer.html
Cheers,
Andreas

On 21 Jun 2005, at 20:17, Keith Turner wrote:

> Hello-
>
> I am new to the list.  I enjoy working with the Biojava API, but a 
> problem has arisen for me, and I need some help with it.  I am 
> developing an application to be used in the Java Webstart framework, 
> and this brings with it some interesting file permission issues.  
> Basically, you use the JNLP interface FileOpenService to open a file 
> from within the secure "sandbox" environment, and then you can get an 
> InputStream out of that.
>
> So I want to take this InputStream (which presumably is from a Fasta 
> file), and read a DNA sequence from it.  However, all the methods that 
> worked when I was running my software as a Java application no longer 
> work in the JNLP environment.  In the past, I was doing:
>   InputStreamReader fr = new InputStreamReader(in);
>   BufferedReader br = new BufferedReader(fr);
>   SequenceIterator stream = SeqIOTools.readFastaDNA(br);
>   Sequence seq = stream.nextSequence();
> But the program freezes on the SeqIOTools.readFastaDNA(br) call.  No 
> exception is thrown back, it just does nothing.  Does anyone have any 
> suggestions as to how I can solve or work around this problem?  Thank 
> you very much
>
> -Keith Turner
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
>
>
-----------------------------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                               Hinxton, Cambridge CB10 1SA, UK
			 +44 (0) 1223 49 6891

From mark.schreiber at novartis.com  Tue Jun 21 20:48:57 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun 21 20:40:38 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
Message-ID: <OF3DA5F2C6.96D9383E-ON48257028.00042D20-48257028.00047BE8@EU.novartis.net>

Oops. I was supposed to check that in.

A bug tracking feature would be nice, although I fear that the number of 
hands available to fix those tracked bugs might be severely limiting. If 
people know of good and free systems, I could recommend them to the 
open-bio admins.

There was some talk of a searchable mail archive a while ago (although 
google seems to do a pretty good job of indexing our mail). I'll try and 
follow it up.

- Mark

From mark.schreiber at novartis.com  Tue Jun 21 22:22:52 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun 21 22:14:29 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
Message-ID: <OFC26005B6.CC7F1A08-ON48257028.000CD334-48257028.000D1536@EU.novartis.net>

Hello -

This is now checked in. All tests pass (no surprise, as checking for null 
never hurt anyone). This will make it into biojava1.4. If you want to add 
a test to the JUnit tests to ensure this stays fixed, it would be most 
appreciated.

I also remember some discussion a while back about the behaviour of 
certain enzymes with respect to their cleavage points, which may or may not 
have been a bug. Was this ever resolved? If so, does anything need fixing?

Thanks.

- Mark

From mark.schreiber at novartis.com  Wed Jun 22 01:55:56 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Jun 22 01:47:26 2005
Subject: [Biojava-l] searching mailing lists
Message-ID: <OF73EA2ED7.1145C69D-ON48257028.00207711-48257028.002096AC@EU.novartis.net>

I have found the open-bio search page (http://search.open-bio.org/). You 
can use it to search mailing lists and webpages for any open-bio 
project.

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax  +65 6722 2910

From jesse-t at chello.nl  Wed Jun 22 04:15:32 2005
From: jesse-t at chello.nl (Jesse)
Date: Wed Jun 22 04:07:03 2005
Subject: [Biojava-l] RestrictionEnzymeManager can't correctly handle
	incomplete enzymes
Message-ID: <20050622081531.CHWO1610.amsfep12-int.chello.nl@anonymous>


RestrictionEnzymeManager can't correctly handle incomplete restriction
enzymes and gives wrong data for them.

(Correct me if I'm wrong; I'm not sure whether this has already been
discussed.)

BioJava 1.4Pre2 knows two types of RestrictionEnzymes:
-RestrictionEnzyme.CUT_SIMPLE
-RestrictionEnzyme.CUT_COMPOUND

But in REBASE, there are also other restriction enzyme entries:
-Unknown recognition sites. For example "<3>?". RestrictionEnzymeManager
skips this one (which is ok).
-Unknown cut location. For example AacI "<3>GGATCC".

The problem with RestrictionEnzymeManager lies with those REBASE entries that
have an unknown cut location. RestrictionEnzymeManager will actually report
a cut location, even though it is unknown in the REBASE file.

For example:
http://rebase.neb.com/rebase/link_withrefm
--------- REBASE ENTRY -----------
<1>AacI
<2>BamHI,AaeI,AcaII,AccEBI,AinII,AliI,Ali12257I,Ali12258I,ApaCI,AsiI,AspTII,
Atu1II,BamFI,BamKI,BamNI,Bca1259I,Bce751I,Bco10278I,BnaI,BsaDI,Bsp30I,Bsp46I
,Bsp90II,Bsp98I,Bsp130I,Bsp131I,Bsp144I,Bsp4009I,BspAAIII,BstI,Bst1126I,Bst2
464I,Bst2902I,BstQI,Bsu90I,Bsu8565I,Bsu8646I,BsuB519I,BsuB763I,CelI,DdsI,Gdo
I,GinI,GoxI,GseIII,GstI,MleI,Mlu23I,NasBI,Nsp29132II,NspSAIV,OkrAI,Pac1110I,
Pae177I,Pfl8I,Psp56I,RhsI,Rlu4I,RspLKII,SolI,SpvI,SurI,Uba19I,Uba31I,Uba38I,
Uba51I,Uba88I,Uba1098I,Uba1163I,Uba1167I,Uba1172I,Uba1173I,Uba1205I,Uba1224I
,Uba1242I,Uba1250I,Uba1258I,Uba1297I,Uba1302I,Uba1324I,Uba1325I,Uba1334I,Uba
1339I,Uba1346I,Uba1383I,Uba1398I,Uba1402I,Uba1414I,Uba4009I
<3>GGATCC
<4>
<5>Acetobacter aceti sub. liquefaciens
<6>IFO 12388
<7>
<8>Seurinck, J., van Montagu, M., Unpublished observations.
----------------------------------


--------- RestrictionEnzyme values --------
Name: AacI
RecognitionSite:ggatcc
ForwardRegex: g{2}atc{2}
ReverseRegex: g{2}atc{2}
CutType: 0 (RestrictionEnzyme.CUT_SIMPLE)
DownStreamEndType: 2
IsPalindromic: true
DownstreamCut: 1, 1,
-------------------------------------------

As you can see, AacI is treated as RestrictionEnzyme.CUT_SIMPLE and is given a
cut location, while the REBASE entry says that the cut location is unknown;
only the recognition site is known. So RestrictionEnzymeManager should also
filter out entries with an unknown cut location; otherwise it gives wrong data.
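
The filter suggested above could look roughly like the following sketch, which classifies the text of the <3> field. RebaseSiteSketch, CutInfo and classify are hypothetical names, not part of BioJava. It assumes that a known cut is marked either by '^' inside the site or by an offset in parentheses after it (my understanding of the REBASE format, to be checked against the REBASE documentation).

```java
// Hypothetical classifier for the <3> field of a REBASE entry.
class RebaseSiteSketch {

    enum CutInfo { UNKNOWN_SITE, SITE_ONLY, CUT_KNOWN }

    static CutInfo classify(String field3) {
        String s = field3.trim();
        if (s.isEmpty() || s.equals("?")) {
            return CutInfo.UNKNOWN_SITE;      // e.g. "<3>?"
        }
        // Assumption: '^' or a parenthesised offset marks a known cut.
        if (s.indexOf('^') >= 0 || s.indexOf('(') >= 0) {
            return CutInfo.CUT_KNOWN;         // e.g. "^GATC"
        }
        return CutInfo.SITE_ONLY;             // e.g. AacI "GGATCC"
    }

    public static void main(String[] args) {
        String[] samples = { "?", "GGATCC", "^GATC", "GAT^C" };
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Entries classified as SITE_ONLY could then be skipped, or kept with the cut explicitly flagged as unknown, instead of being given a cut position.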

- Jesse


From boehme at mpiib-berlin.mpg.de  Wed Jun 22 05:24:08 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Wed Jun 22 05:16:15 2005
Subject: [Biojava-l] update seqfeature 
In-Reply-To: <42B83D31.2000403@nrc-cnrc.gc.ca>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>	<42B6DEC3.9090807@mpiib-berlin.mpg.de>	<f5bb76b54331dc88107ebde4bee3dc46@gnf.org>	<42B7E788.3040205@mpiib-berlin.mpg.de>	<0be3992b92f6a14b6d06d5a06549555b@gnf.org>
	<42B81C43.9010404@mpiib-berlin.mpg.de>
	<42B83D31.2000403@nrc-cnrc.gc.ca>
Message-ID: <42B92E38.2020008@mpiib-berlin.mpg.de>

Hi Simon,

I'm changing the FeatureSource, and setFeatureSource performs an update on 
the source_term_id. If the combination is already there, I get an 
exception. Would the proper way to deal with that be to get the 
seqfeature_id of the entry already there and use that, or to keep 
updating the rank until the combination is unique? Or should I rather not 
mess with BioJava, and instead delete that entry and insert it as new to 
let BioJava handle the rank increase?

Thanks for any advice

Martina

Simon Foote wrote:

> Hi Martina,
> 
> In fact you can, as rank is the field that allows this to happen.  In 
> Biojava, currently it's just a linearily incremented number such that 
> you can have the same type and source IDs for a given bioentry.
> 
> For example, adding a Genbank entry with 10 CDS features for 1 bioentry 
> will give you identical keys for bioentry_id, type_term_id and 
> source_term_id, but will have a rank of 1 - 10 for each.
> 
> Simon
> 

From mark.schreiber at novartis.com  Wed Jun 22 05:24:52 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Jun 22 05:17:08 2005
Subject: [Biojava-l] RestrictionEnzymeManager can't correctly
	handle	incomplete enzymes
Message-ID: <OF2770DA02.A1F47A64-ON48257028.003378A3-48257028.0033B785@EU.novartis.net>

I take your point, but I notice that BamHI is an isoschizomer. Is the 
cleavage site of BamHI really unknown?

- Mark

From jesse-t at chello.nl  Wed Jun 22 06:09:01 2005
From: jesse-t at chello.nl (Jesse)
Date: Wed Jun 22 06:01:08 2005
Subject: [Biojava-l] RestrictionEnzymeManager can't correctly
	handle	incomplete enzymes
Message-ID: <20050622100859.JIQP1610.amsfep12-int.chello.nl@anonymous>

(I'm not an expert on restriction enzymes.)

I was talking about AacI, of which BamHI is an isoschizomer. The cut
location of AacI is unknown, but the one from BamHI is known. Maybe
RestrictionEnzymeManager uses the cut location of BamHI when asked for the
unknown cut location of AacI.

http://rebase.neb.com/rebase/enz/AacI.html

That might also be the reason why RestrictionEnzymeManager requires links
between restriction enzymes: if a restriction enzyme entry is removed from
the REBASE file, RestrictionEnzymeManager fails to read it in some cases.

But I think using the cut location of an isoschizomer is wrong, because
REBASE describes isoschizomers as restriction enzymes that recognize the
same DNA sequence, whose cut sites may or may not be identical.

So the cut site might differ between isoschizomers.

I searched for examples in the REBASE file, and found them:

 
<1>BspKT6I

<2>MboI,AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscF
I,BsmXII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,
Bsp60I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105
I,Bsp122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI
,BspJ64I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI,BstMBI,
BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I,Bth1786
I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,CacI,CcoP
31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,CjeP338I,C
paI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHCI,FnuAII
,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,LlaKR2I,Ls
p1109II,Mel3JI,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII,Mk
rAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciAI,N
deII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,PfaI
,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,SauEI,
SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu247
9I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074I,R
2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth368I,
TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,Uba
1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I

<3>GAT^C

<4>2(6)

<5>Bacillus species KT6

<6>N.I. Matvienko

<7>

<8>Shapovalova, N.I., Zheleznaja, L.A., Matvienko, N.I., (1993) Nucleic
Acids Res., vol. 21, pp. 5794.

Shapovalova, N.I., Zheleznaya, L.A., Matvienko, N.I., (1994) Biokhimiia,
vol. 59, pp. 1730-1738.

 
<1>MboI

<2>AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscFI,Bsm
XII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,Bsp60
I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105I,Bsp
122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI,BspJ
64I,BspKT6I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI,BstM
BI,BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I,Bth1
786I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,CacI,C
coP31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,CjeP338
I,CpaI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHCI,Fnu
AII,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,LlaKR2I
,Lsp1109II,Mel3JI,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII
,MkrAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciA
I,NdeII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,P
faI,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,Sau
EI,SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu
2479I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074
I,R2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth36
8I,TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,
Uba1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I

<3>^GATC

<4>2(6)

<5>Moraxella bovis

<6>ATCC 10900

<7>ACFGKNQRUVX

<8>Anton, B.P., Brooks, J.E., Unpublished observations.

Gelinas, R.E., Myers, P.A., Roberts, R.J., (1977) J. Mol. Biol., vol. 114,
pp. 169-179.

Huang, L.-H., Farnet, C.M., Ehrlich, K.C., Ehrlich, M., (1982) Nucleic Acids
Res., vol. 10, pp. 1579-1591.

Ueno, T., Ito, H., Kimizuka, F., Kotani, H., Nakajima, K., (1993) Nucleic
Acids Res., vol. 21, pp. 2309-2313.

Ueno, T., Ito, H., Kotani, H., Nakajima, K., Japanese Patent Office, 1993.

 
<1>Mel3JI

<2>MboI,AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscF
I,BsmXII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,
Bsp60I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105
I,Bsp122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI
,BspJ64I,BspKT6I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI
,BstMBI,BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I
,Bth1786I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,C
acI,CcoP31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,Cj
eP338I,CpaI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHC
I,FnuAII,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,Ll
aKR2I,Lsp1109II,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII,M
krAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciAI,
NdeII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,Pfa
I,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,SauEI
,SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu24
79I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074I,
R2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth368I
,TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,Ub
a1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I

<3>GATC

<4>

<5>Megasphaera elsedenii 3J

<6>P. Pristas

<7>

<8>Piknova, M., Filova, M., Javorsky, P., Pristas, P., (2004) FEMS
Microbiol. Lett., vol. 236, pp. 91-95.

Piknova, M., Pristas, P., Javorsky, P., (2004) Folia Microbiol. (Praha),
vol. 49, pp. 191-193.
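
The three entries above can be compared with a small sketch that reads the cut position from the '^' marker in the <3> field. IsoschizomerCutSketch, cutOffset and site are hypothetical names, and the sketch only handles the simple single-'^' notation shown in these entries.

```java
import java.util.OptionalInt;

// Hypothetical helper comparing cut positions of isoschizomers.
class IsoschizomerCutSketch {

    // Number of bases before the cut, or empty if no '^' is present.
    static OptionalInt cutOffset(String field3) {
        int i = field3.indexOf('^');
        return i < 0 ? OptionalInt.empty() : OptionalInt.of(i);
    }

    // Recognition sequence with the cut marker removed.
    static String site(String field3) {
        return field3.replace("^", "");
    }

    public static void main(String[] args) {
        String[][] entries = {
            { "BspKT6I", "GAT^C" },
            { "MboI", "^GATC" },
            { "Mel3JI", "GATC" },
        };
        for (String[] e : entries) {
            OptionalInt cut = cutOffset(e[1]);
            System.out.println(e[0] + ": site " + site(e[1]) + ", cut "
                    + (cut.isPresent() ? String.valueOf(cut.getAsInt()) : "unknown"));
        }
    }
}
```

All three share the site GATC, yet BspKT6I and MboI cut at different offsets and Mel3JI has no cut recorded, so copying a cut from an isoschizomer would not be safe.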

 
-----Oorspronkelijk bericht-----

Van: mark.schreiber@novartis.com [ <mailto:mark.schreiber@novartis.com>
mailto:mark.schreiber@novartis.com]

Verzonden: woensdag 22 juni 2005 11:25

Aan: Jesse

CC: biojava-l@biojava.org; biojava-l-bounces@portal.open-bio.org

Onderwerp: Re: [Biojava-l] RestrictionEnzymeManager can't correctly handle
incomplete enzymes

 
I take your point but I notice that BamHI is an isoscizomer. Is the cleavage
site of BamHI really unknown??

 
- Mark

 
"Jesse" <jesse-t@chello.nl>

Sent by: biojava-l-bounces@portal.open-bio.org

06/22/2005 04:15 PM

 
        To:     <biojava-l@biojava.org>

        cc:     (bcc: Mark Schreiber/GP/Novartis)

        Subject:        [Biojava-l] RestrictionEnzymeManager can't correctly
handle     incomplete 

enzymes

 
RestrictionEnzymeManager can't correctly handle incomplete enzymes and gives
wrong data.

 
(Correct me if I'm wrong.)

 
I'm not sure if this is already discussed or not.

 
I think RestrictionEnzymeManager can not handle incomplete restriction
enzymes.

 
BioJava 1.4Pre2 knows two types of RestrictionEnzymes:

-RestrictionEnzyme.CUT_SIMPLE

-RestrictionEnzyme.CUT_COMPOUND

 
But in REBASE, there are also other restriction enzyme entries:

-Unknown recognition sites. For example "<3>?". RestrictionEnzymeManager
skips this one (which is ok).

-Unknown cut location. For example AacI "<3>GGATCC".

 
The problem with RestrictionEnzymeManager is with those REBASE entries which
have an unknown cutlocation. RestrictionEnzymeManager  will actually tell
that there is a cutlocation, even though it's unknown in the REBASE file.

 
For example:

http://rebase.neb.com/rebase/link_withrefm

--------- REBASE ENTRY -----------

<1>AacI

<2>BamHI,AaeI,AcaII,AccEBI,AinII,AliI,Ali12257I,Ali12258I,ApaCI,AsiI,AspTII,

Atu1II,BamFI,BamKI,BamNI,Bca1259I,Bce751I,Bco10278I,BnaI,BsaDI,Bsp30I,Bsp46I

,Bsp90II,Bsp98I,Bsp130I,Bsp131I,Bsp144I,Bsp4009I,BspAAIII,BstI,Bst1126I,Bst2

464I,Bst2902I,BstQI,Bsu90I,Bsu8565I,Bsu8646I,BsuB519I,BsuB763I,CelI,DdsI,Gdo

I,GinI,GoxI,GseIII,GstI,MleI,Mlu23I,NasBI,Nsp29132II,NspSAIV,OkrAI,Pac1110I,

Pae177I,Pfl8I,Psp56I,RhsI,Rlu4I,RspLKII,SolI,SpvI,SurI,Uba19I,Uba31I,Uba38I,

Uba51I,Uba88I,Uba1098I,Uba1163I,Uba1167I,Uba1172I,Uba1173I,Uba1205I,Uba1224I

,Uba1242I,Uba1250I,Uba1258I,Uba1297I,Uba1302I,Uba1324I,Uba1325I,Uba1334I,Uba

1339I,Uba1346I,Uba1383I,Uba1398I,Uba1402I,Uba1414I,Uba4009I

<3>GGATCC

<4>

<5>Acetobacter aceti sub. liquefaciens

<6>IFO 12388

<7>

<8>Seurinck, J., van Montagu, M., Unpublished observations.

----------------------------------

 
--------- RestrictionEnzyme values --------

Name: AacI

RecognitionSite:ggatcc

ForwardRegex: g{2}atc{2}

ReverseRegex: g{2}atc{2}

CutType: 0 (RestrictionEnzyme.CUT_SIMPLE)

DownStreamEndType: 2

IsPalindromic: true

DownstreamCut: 1, 1,

-------------------------------------------

 
As you can see, AacI is treated as RestrictionEnzyme.CUT_SIMPLE and is given
a cut location, while the REBASE entry says that the cut location is unknown;
only the recognition site is known. So RestrictionEnzymeManager should also
filter out entries with an unknown cut location, otherwise it gives wrong data.
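
A minimal sketch of such a filter, assuming (from the REBASE entries quoted in this thread) that a known cut is written either as a '^' inside the <3> field or as a trailing "(n/m)" offset pair. The class and method names are hypothetical; this is not how RestrictionEnzymeManager itself parses the file.

```java
import java.util.regex.Pattern;

/** Sketch: decide whether a REBASE <3> field carries any cut-position information. */
final class RebaseCutCheck {

    // "(11/9)"-style suffix, as seen in the TaqII entry
    private static final Pattern OFFSET_SUFFIX = Pattern.compile("\\(-?\\d+/-?\\d+\\)");

    /** True if the field marks a cut with '^' or with an "(n/m)" suffix. */
    static boolean hasKnownCut(String siteField) {
        if (siteField == null || siteField.isEmpty() || siteField.equals("?")) {
            return false; // unknown recognition site, so no cut position either
        }
        return siteField.indexOf('^') >= 0 || OFFSET_SUFFIX.matcher(siteField).find();
    }

    public static void main(String[] args) {
        System.out.println("AacI  GGATCC       -> " + hasKnownCut("GGATCC"));
        System.out.println("MboI  ^GATC        -> " + hasKnownCut("^GATC"));
        System.out.println("TaqII GACCGA(11/9) -> " + hasKnownCut("GACCGA(11/9)"));
    }
}
```

An entry such as AacI, for which the check returns false, could then be skipped or flagged rather than loaded as CUT_SIMPLE.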

 
- Jesse

 
[Biojava-l] RestrictionEnzymeManager REBASE reader bug?

mark.schreiber at novartis.com mark.schreiber at novartis.com Tue Jun 21
22:22:52 EDT 2005 

 
Hello -

 
This is now checked in. All tests pass (no surprise as checking for null
never hurt anyone). This will make it into biojava1.4. If you want to add a
test to the Junit to ensure this stays fixed it would be most appreciated.

 
I also remember some discussion a while back about the behaviour of certain
enzymes with respect to their cleavage points which may or may not have
been a bug. Was this ever resolved? If so does anything need fixing?

 
Thanks.

 
- Mark

 
_______________________________________________

Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

 
From simon.foote at nrc-cnrc.gc.ca  Wed Jun 22 08:51:11 2005
From: simon.foote at nrc-cnrc.gc.ca (Simon Foote)
Date: Wed Jun 22 08:41:52 2005
Subject: [Biojava-l] Re: update seqfeature
In-Reply-To: <42B92E38.2020008@mpiib-berlin.mpg.de>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>	<42B6DEC3.9090807@mpiib-berlin.mpg.de>	<f5bb76b54331dc88107ebde4bee3dc46@gnf.org>	<42B7E788.3040205@mpiib-berlin.mpg.de>	<0be3992b92f6a14b6d06d5a06549555b@gnf.org>
	<42B81C43.9010404@mpiib-berlin.mpg.de>
	<42B83D31.2000403@nrc-cnrc.gc.ca>
	<42B92E38.2020008@mpiib-berlin.mpg.de>
Message-ID: <42B95EBF.7050403@nrc-cnrc.gc.ca>

Hi Martina,

Biojava should handle that correctly.  I haven't done it by changing a 
feature source, but I have with changing a feature's location and 
strand.  For changing a location:

// Get the Feature you wish to edit
StrandedFeature sf = ...; // e.g. use a feature filter to grab the feature by its ID
// Location is an interface, so construct a concrete RangeLocation
Location loc = new RangeLocation(100, 1100);
sf.setLocation(loc);

Since you have already retrieved the feature to edit, biojava will 
automatically do this as an update and not an insert.  Or it should in 
all cases where you are modifying a pre-existing feature.

Simon

Martina wrote:

> Hi Simon,
>
> I'm changing the FeatureSource and in setFeatureSource an update on 
> the source_term_id happens. In the case the combination is already 
> there, I get an Exception. The proper way to deal with that would be 
> to get the seqfeature_id of the entry already there and use that, or 
> try to update the rank unless its a unique combination? Or should I 
> rather not mess with the BioJava and delete that entry and insert it 
> as new to let BioJava handle the rank increase?
>
> Thanks for any advice
>
> Martina
>
> Simon Foote wrote:
>
>> Hi Martina,
>>
>> In fact you can, as rank is the field that allows this to happen.  In 
>> Biojava, currently it's just a linearly incremented number such that 
>> you can have the same type and source IDs for a given bioentry.
>>
>> For example, adding a Genbank entry with 10 CDS features for 1 
>> bioentry will give you identical keys for bioentry_id, type_term_id 
>> and source_term_id, but will have a rank of 1 - 10 for each.
>>
>> Simon
>>

-- 
Bioinformatics Programmer
Pathogen Genomics
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561  [F] 613-952-9092
simon.foote@nrc-cnrc.gc.ca

From boehme at mpiib-berlin.mpg.de  Wed Jun 22 09:05:44 2005
From: boehme at mpiib-berlin.mpg.de (Martina)
Date: Wed Jun 22 08:57:19 2005
Subject: [Biojava-l] Re: update seqfeature
In-Reply-To: <42B95EBF.7050403@nrc-cnrc.gc.ca>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>	<42B6DEC3.9090807@mpiib-berlin.mpg.de>	<f5bb76b54331dc88107ebde4bee3dc46@gnf.org>	<42B7E788.3040205@mpiib-berlin.mpg.de>	<0be3992b92f6a14b6d06d5a06549555b@gnf.org>
	<42B81C43.9010404@mpiib-berlin.mpg.de>
	<42B83D31.2000403@nrc-cnrc.gc.ca>
	<42B92E38.2020008@mpiib-berlin.mpg.de>
	<42B95EBF.7050403@nrc-cnrc.gc.ca>
Message-ID: <42B96228.4020100@mpiib-berlin.mpg.de>

Hi Simon,

Sorry, I might not have made that clear enough.
The problem only exists when changing a feature's source (or type, but I 
didn't try that) because of the composite unique index on the BioSQL 
seqfeature table. It doesn't check whether the location is the same or 
not, but the combination of type, source, bioentry id and rank has to be 
unique. So if I insert a new feature, the rank gets increased by 
BioJava somehow and all is well, but if I update an existing feature's 
source and by accident hit the same combination as another feature's 
type, source, .. I get the exception and the source doesn't change.
At least that is what I suppose is happening.

My question was how to handle this situation?

Martina


Simon Foote wrote:

> Hi Martina,
> 
> Biojava should handle that correctly.  I haven't done it by changing a 
> feature source, but I have with changing a feature's location and 
> strand.  For changing a location:
> 
> // Get the Feature you wish to edit
> StrandedFeature sf = ...; // e.g. use a feature filter to grab the
> // feature by its ID
> Location loc = new RangeLocation(100, 1100); // Location is an interface
> sf.setLocation(loc);
> 
> Since you have already retrieved the feature to edit, biojava will 
> automatically do this as an update and not an insert.  Or it should in 
> all cases where you are modifying a pre-existing feature.
> 

From simon.foote at nrc-cnrc.gc.ca  Wed Jun 22 09:15:54 2005
From: simon.foote at nrc-cnrc.gc.ca (Simon Foote)
Date: Wed Jun 22 09:10:09 2005
Subject: [Biojava-l] Re: update seqfeature
In-Reply-To: <42B96228.4020100@mpiib-berlin.mpg.de>
References: <6D9E9B9DF347EF4385F6271C64FB8D5601DCAB79@BIONIC.biopolis.one-north.com>	<42B6DEC3.9090807@mpiib-berlin.mpg.de>	<f5bb76b54331dc88107ebde4bee3dc46@gnf.org>	<42B7E788.3040205@mpiib-berlin.mpg.de>	<0be3992b92f6a14b6d06d5a06549555b@gnf.org>
	<42B81C43.9010404@mpiib-berlin.mpg.de>
	<42B83D31.2000403@nrc-cnrc.gc.ca>
	<42B92E38.2020008@mpiib-berlin.mpg.de>
	<42B95EBF.7050403@nrc-cnrc.gc.ca>
	<42B96228.4020100@mpiib-berlin.mpg.de>
Message-ID: <42B9648A.5040001@nrc-cnrc.gc.ca>

I get the problem now, that would then be a bug in biojava.  It should 
do an internal check to see if a source/type term change will cause a 
non-unique exception and if so, then also update the rank to the next 
available one.  One solution would be to catch the exception then do a 
select for the max(rank) for the given bioentry_id, source_term_id, 
type_term_id and then increment it by one.

In fact, it would probably be wise to always update the rank when 
changing either the source or type term, so that the ranks stay 
incrementally consistent, if that really matters.
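
The idea above can be sketched in memory, without a database. This is only an illustration under stated assumptions: the class, the Key type and the method names are hypothetical, the four key fields mirror the columns named in this thread (bioentry_id, type_term_id, source_term_id, rank), and nextRank stands in for a "select max(rank) + 1" query. It is not BioJava's actual persistence code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** In-memory sketch of the seqfeature unique key and of re-ranking on a source change. */
final class SeqFeatureRankSketch {

    /** The four columns that together must be unique. */
    static final class Key {
        final int bioentryId, typeTermId, sourceTermId, rank;

        Key(int bioentryId, int typeTermId, int sourceTermId, int rank) {
            this.bioentryId = bioentryId;
            this.typeTermId = typeTermId;
            this.sourceTermId = sourceTermId;
            this.rank = rank;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return bioentryId == k.bioentryId && typeTermId == k.typeTermId
                    && sourceTermId == k.sourceTermId && rank == k.rank;
        }

        @Override public int hashCode() {
            return Objects.hash(bioentryId, typeTermId, sourceTermId, rank);
        }
    }

    private final Set<Key> keys = new HashSet<>();

    /** Mimics the unique index: returns false on a duplicate instead of throwing. */
    boolean insert(Key k) {
        return keys.add(k);
    }

    /** Stand-in for "select max(rank) + 1" over one (bioentry, type, source) combination. */
    int nextRank(int bioentryId, int typeTermId, int sourceTermId) {
        int max = 0;
        for (Key k : keys) {
            if (k.bioentryId == bioentryId && k.typeTermId == typeTermId
                    && k.sourceTermId == sourceTermId) {
                max = Math.max(max, k.rank);
            }
        }
        return max + 1;
    }

    /** Changes the source term and always moves the row to the next free rank. */
    Key changeSource(Key old, int newSourceTermId) {
        keys.remove(old);
        Key moved = new Key(old.bioentryId, old.typeTermId, newSourceTermId,
                nextRank(old.bioentryId, old.typeTermId, newSourceTermId));
        keys.add(moved);
        return moved;
    }

    public static void main(String[] args) {
        SeqFeatureRankSketch table = new SeqFeatureRankSketch();
        Key a = new Key(1, 10, 20, 1);
        table.insert(a);
        table.insert(new Key(1, 10, 21, 1));
        // Keeping rank 1 while switching source 20 -> 21 would collide with the second row.
        System.out.println("new rank: " + table.changeSource(a, 21).rank); // new rank: 2
    }
}
```

Re-ranking on every source or type change, rather than only after a failed update, avoids having to catch the constraint violation at all, at the cost of ranks no longer reflecting the original insertion order.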

Simon

Martina wrote:

> Hi Simon,
>
> Sorry, I might not have made that clear enough.
> The problem only exists when changing a feature's source (or type, but I 
> didn't try that) because of the composite unique index on the BioSQL 
> seqfeature table. It doesn't check whether the location is the same or 
> not, but the combination of type, source, bioentry id and rank has to be 
> unique. So if I insert a new feature, the rank gets increased by 
> BioJava somehow and all is well, but if I update an existing feature's 
> source and by accident hit the same combination as another feature's 
> type, source, .. I get the exception and the source doesn't change.
> At least that is what I suppose is happening.
>
> My question was how to handle this situation?
>
> Martina
>
>
> Simon Foote wrote:
>
>> Hi Martina,
>>
>> Biojava should handle that correctly.  I haven't done it by changing 
>> a feature source, but I have with changing a feature's location and 
>> strand.  For changing a location:
>>
>> // Get the Feature you wish to edit
>> StrandedFeature sf = ...; // e.g. use a feature filter to grab the
>> // feature by its ID
>> Location loc = new RangeLocation(100, 1100); // Location is an interface
>> sf.setLocation(loc);
>>
>> Since you have already retrieved the feature to edit, biojava will 
>> automatically do this as an update and not an insert.  Or it should 
>> in all cases where you are modifying a pre-existing feature.
>>

-- 
Bioinformatics Programmer
Pathogen Genomics
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561  [F] 613-952-9092
simon.foote@nrc-cnrc.gc.ca

From jesse-t at chello.nl  Wed Jun 22 11:05:40 2005
From: jesse-t at chello.nl (Jesse)
Date: Wed Jun 22 10:56:58 2005
Subject: [Biojava-l] RestrictionEnzyme can't handle double sites
In-Reply-To: <20050622100859.JIQP1610.amsfep12-int.chello.nl@anonymous>
Message-ID: <20050622150534.FXCZ11463.amsfep13-int.chello.nl@anonymous>

Another problem.

Some restriction enzymes have more than one recognition site. Usually this
can be written using ambiguous symbols, but for some restriction enzymes
this is not possible because the ambiguous symbols depend on each other.

Usually an ambiguous symbol is something like this:
ANNC
The first "N" is independent of the second "N". For example, it can match
with:
AAAC
AACC
AAGC
AATC
....
....
ATTC
16 possibilities. The ambiguous symbols are independent of each other.

But for some restriction enzymes, the ambiguous symbols depend on each
other. So a sequence like
ANNC
would then only match with:
AAAC
ACCC
AGGC
ATTC
Only 4 possibilities. The ambiguous symbols depend on each other.
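
The difference can be made concrete with plain java.util.regex (not BioJava's own pattern classes, whose support for back-references is not established in this thread). In this sketch an independent "NN" is two separate character classes, while a dependent "NN" captures the first base and requires the second to repeat it through a back-reference. The class name is hypothetical, and the TAQII_LIKE pattern is simply one way to write the two TaqII sites listed in the REBASE entry (GACCGA and CACCCA).

```java
import java.util.regex.Pattern;

/** Sketch: independent versus dependent ambiguity symbols as regular expressions. */
final class DependentAmbiguityDemo {

    // ANNC where each N is chosen freely
    static final Pattern INDEPENDENT = Pattern.compile("a[acgt][acgt]c");

    // ANNC where the second N must equal the first (back-reference \1)
    static final Pattern DEPENDENT = Pattern.compile("a([acgt])\\1c");

    // GACCGA or CACCCA: positions 1 and 5 are both G or both C
    static final Pattern TAQII_LIKE = Pattern.compile("([gc])acc\\1a");

    /** Counts how many of the 16 possible "aXYc" strings the pattern accepts. */
    static int countMatches(Pattern p) {
        String bases = "acgt";
        int n = 0;
        for (char x : bases.toCharArray()) {
            for (char y : bases.toCharArray()) {
                if (p.matcher("a" + x + y + "c").matches()) {
                    n++;
                }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countMatches(INDEPENDENT)); // 16
        System.out.println(countMatches(DEPENDENT));   // 4
        System.out.println(TAQII_LIKE.matcher("gaccca").matches()); // false: mixed G/C
    }
}
```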


This happens with these enzymes:
TaqII
M.PhiBssHII (unknown cutlocation)
M.Phi3TI (unknown cutlocation)
M.Rho11sI (unknown cutlocation)
M.SPBetaI (unknown cutlocation)
M.SPRI (unknown cutlocation)

<1>TaqII
<2>
<3>GACCGA(11/9),CACCCA(11/9)
<4>
<5>Thermus aquaticus YTI
<6>J.I. Harris
<7>X
<8>Barker, D., Hoff, M., Oliphant, A., White, R., (1984) Nucleic Acids Res.,
vol. 12, pp. 5567-5581.
Myers, P.A., Roberts, R.J., Unpublished observations.
Rutkowska, S.M., Jaworowska, I., Skowron, P.M., Unpublished observations.


RestrictionEnzymeManager takes only the last recognition site in this
example; it skips GACCGA.

Name: TaqII
RecognitionSite:caccca
ForwardRegex: cac{3}a
ReverseRegex: tg{3}tg
CutType: 0
DownStreamEndType: 0
IsPalindromic: false
DownstreamCut: 17, 15,
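
One possible way to keep both sites is to split the <3> field on commas and combine the sites into a single alternation. The sketch below assumes the field layout shown in the TaqII entry above, "SITE(n/m),SITE(n/m)", and only handles unambiguous bases (no IUPAC expansion). The class, the Site type and the method names are hypothetical and are not part of BioJava.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: parse a multi-site REBASE <3> field and build one forward regex. */
final class MultiSiteSketch {

    /** One recognition site with its optional "(forward/reverse)" cut offsets. */
    static final class Site {
        final String sequence;
        final Integer forwardCut; // null when the entry gives no offsets
        final Integer reverseCut;

        Site(String sequence, Integer forwardCut, Integer reverseCut) {
            this.sequence = sequence;
            this.forwardCut = forwardCut;
            this.reverseCut = reverseCut;
        }
    }

    private static final Pattern ENTRY =
            Pattern.compile("([A-Za-z^]+)(?:\\((-?\\d+)/(-?\\d+)\\))?");

    /** Splits "GACCGA(11/9),CACCCA(11/9)" into its individual sites. */
    static List<Site> parse(String field) {
        List<Site> sites = new ArrayList<>();
        for (String part : field.split(",")) {
            Matcher m = ENTRY.matcher(part.trim());
            if (!m.matches()) {
                throw new IllegalArgumentException("Unrecognised site: " + part);
            }
            Integer fwd = m.group(2) == null ? null : Integer.valueOf(m.group(2));
            Integer rev = m.group(3) == null ? null : Integer.valueOf(m.group(3));
            sites.add(new Site(m.group(1), fwd, rev));
        }
        return sites;
    }

    /** Joins all sites into one alternation, e.g. "gaccga|caccca". */
    static Pattern forwardPattern(List<Site> sites) {
        StringBuilder sb = new StringBuilder();
        for (Site s : sites) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append(s.sequence.replace("^", "").toLowerCase(Locale.ROOT));
        }
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        List<Site> sites = parse("GACCGA(11/9),CACCCA(11/9)");
        System.out.println(forwardPattern(sites).pattern()); // gaccga|caccca
    }
}
```

The DownstreamCut values 17 and 15 in the dump above appear to be the site length (6) plus the REBASE offsets 11 and 9, so the same offsets would apply to both sites here.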


- Jesse


-----Original Message-----
From: biojava-l-bounces@portal.open-bio.org
[mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of Jesse
Sent: Wednesday 22 June 2005 12:09
To: biojava-l@biojava.org
Subject: RE: [Biojava-l] RestrictionEnzymeManager can't correctly handle
incomplete enzymes

(I'm not an expert on restriction enzymes.)

I was talking about AacI, of which BamHI is an isoschizomer. The cut
location of AacI is unknown, but the one for BamHI is known. Maybe
RestrictionEnzymeManager uses the cut location of BamHI when asked for the
unknown cut location of AacI.

http://rebase.neb.com/rebase/enz/AacI.html

That might also be the reason why RestrictionEnzymeManager requires links
between restriction enzymes. If a restriction enzyme entry is removed from
the REBASE file, RestrictionEnzymeManager fails to read the file in some
cases.

But I think using the cut location of an isoschizomer is wrong, because of
this:

REBASE says: "A isoschizomers is a restriction enzymes that recognize the
same DNA sequence. The cut sites may or may not be identical."

So the cut site might be different between different isoschizomers.

I searched for examples in the REBASE file, and found them:


<1>BspKT6I
<2>MboI,AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscF
I,BsmXII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,
Bsp60I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105
I,Bsp122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI
,BspJ64I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI,BstMBI,
BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I,Bth1786
I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,CacI,CcoP
31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,CjeP338I,C
paI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHCI,FnuAII
,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,LlaKR2I,Ls
p1109II,Mel3JI,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII,Mk
rAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciAI,N
deII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,PfaI
,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,SauEI,
SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu247
9I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074I,R
2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth368I,
TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,Uba
1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I
<3>GAT^C
<4>2(6)
<5>Bacillus species KT6
<6>N.I. Matvienko
<7>
<8>Shapovalova, N.I., Zheleznaja, L.A., Matvienko, N.I., (1993) Nucleic
Acids Res., vol. 21, pp. 5794.
Shapovalova, N.I., Zheleznaya, L.A., Matvienko, N.I., (1994) Biokhimiia,
vol. 59, pp. 1730-1738.

<1>MboI
<2>AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscFI,Bsm
XII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,Bsp60
I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105I,Bsp
122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI,BspJ
64I,BspKT6I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI,BstM
BI,BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I,Bth1
786I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,CacI,C
coP31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,CjeP338
I,CpaI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHCI,Fnu
AII,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,LlaKR2I
,Lsp1109II,Mel3JI,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII
,MkrAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciA
I,NdeII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,P
faI,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,Sau
EI,SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu
2479I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074
I,R2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth36
8I,TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,
Uba1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I
<3>^GATC
<4>2(6)
<5>Moraxella bovis
<6>ATCC 10900
<7>ACFGKNQRUVX
<8>Anton, B.P., Brooks, J.E., Unpublished observations.
Gelinas, R.E., Myers, P.A., Roberts, R.J., (1977) J. Mol. Biol., vol. 114,
pp. 169-179.
Huang, L.-H., Farnet, C.M., Ehrlich, K.C., Ehrlich, M., (1982) Nucleic Acids
Res., vol. 10, pp. 1579-1591.
Ueno, T., Ito, H., Kimizuka, F., Kotani, H., Nakajima, K., (1993) Nucleic
Acids Res., vol. 21, pp. 2309-2313.
Ueno, T., Ito, H., Kotani, H., Nakajima, K., Japanese Patent Office, 1993.

 
<1>Mel3JI
<2>MboI,AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscF
I,BsmXII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,
Bsp60I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105
I,Bsp122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI
,BspJ64I,BspKT6I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI
,BstMBI,BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I
,Bth1786I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,C
acI,CcoP31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,Cj
eP338I,CpaI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHC
I,FnuAII,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,Ll
aKR2I,Lsp1109II,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII,M
krAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciAI,
NdeII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,Pfa
I,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,SauEI
,SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu24
79I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074I,
R2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth368I
,TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,Ub
a1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I
<3>GATC
<4>
<5>Megasphaera elsedenii 3J
<6>P. Pristas
<7>
<8>Piknova, M., Filova, M., Javorsky, P., Pristas, P., (2004) FEMS
Microbiol. Lett., vol. 236, pp. 91-95.
Piknova, M., Pristas, P., Javorsky, P., (2004) Folia Microbiol. (Praha),
vol. 49, pp. 191-193.
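
The three entries above share the recognition sequence GATC but differ in what is known about the cut. A small sketch of reading the cut position from the '^' marker alone, assuming '^' is the only cut notation used in these three <3> fields; the class and method names are hypothetical.

```java
import java.util.OptionalInt;

/** Sketch: read the cut offset of a REBASE site from the position of '^'. */
final class CaretCutSketch {

    /** Number of bases before the '^', or empty when the entry has no '^'. */
    static OptionalInt cutOffset(String site) {
        int i = site.indexOf('^');
        return i < 0 ? OptionalInt.empty() : OptionalInt.of(i);
    }

    public static void main(String[] args) {
        System.out.println("BspKT6I " + cutOffset("GAT^C")); // OptionalInt[3]
        System.out.println("MboI    " + cutOffset("^GATC")); // OptionalInt[0]
        System.out.println("Mel3JI  " + cutOffset("GATC"));  // OptionalInt.empty
    }
}
```

Returning an empty value for Mel3JI, rather than borrowing the offset of MboI or BspKT6I, keeps "unknown" distinct from either isoschizomer's cut.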


-----Original Message-----

From: mark.schreiber@novartis.com [mailto:mark.schreiber@novartis.com]

Sent: Wednesday 22 June 2005 11:25

To: Jesse

CC: biojava-l@biojava.org; biojava-l-bounces@portal.open-bio.org

Subject: Re: [Biojava-l] RestrictionEnzymeManager can't correctly handle
incomplete enzymes

 
I take your point but I notice that BamHI is an isoschizomer. Is the cleavage
site of BamHI really unknown?

 
- Mark

 
"Jesse" <jesse-t@chello.nl>

Sent by: biojava-l-bounces@portal.open-bio.org

06/22/2005 04:15 PM

 
        To:     <biojava-l@biojava.org>

        cc:     (bcc: Mark Schreiber/GP/Novartis)

        Subject:        [Biojava-l] RestrictionEnzymeManager can't correctly
handle incomplete enzymes

 
RestrictionEnzymeManager can't correctly handle incomplete enzymes and gives
wrong data.

 
(Correct me if I'm wrong.)

 
I'm not sure if this is already discussed or not.

 
I think RestrictionEnzymeManager can not handle incomplete restriction
enzymes.

 
BioJava 1.4Pre2 knows two types of RestrictionEnzymes:

-RestrictionEnzyme.CUT_SIMPLE

-RestrictionEnzyme.CUT_COMPOUND

 
But in REBASE, there are also other restriction enzyme entries:

-Unknown recognition sites. For example "<3>?". RestrictionEnzymeManager
skips this one (which is ok).

-Unknown cut location. For example AacI "<3>GGATCC".

 
The problem with RestrictionEnzymeManager is with those REBASE entries which
have an unknown cutlocation. RestrictionEnzymeManager  will actually tell
that there is a cutlocation, even though it's unknown in the REBASE file.

 
For example:

http://rebase.neb.com/rebase/link_withrefm

--------- REBASE ENTRY -----------

<1>AacI

<2>BamHI,AaeI,AcaII,AccEBI,AinII,AliI,Ali12257I,Ali12258I,ApaCI,AsiI,AspTII,

Atu1II,BamFI,BamKI,BamNI,Bca1259I,Bce751I,Bco10278I,BnaI,BsaDI,Bsp30I,Bsp46I

,Bsp90II,Bsp98I,Bsp130I,Bsp131I,Bsp144I,Bsp4009I,BspAAIII,BstI,Bst1126I,Bst2

464I,Bst2902I,BstQI,Bsu90I,Bsu8565I,Bsu8646I,BsuB519I,BsuB763I,CelI,DdsI,Gdo

I,GinI,GoxI,GseIII,GstI,MleI,Mlu23I,NasBI,Nsp29132II,NspSAIV,OkrAI,Pac1110I,

Pae177I,Pfl8I,Psp56I,RhsI,Rlu4I,RspLKII,SolI,SpvI,SurI,Uba19I,Uba31I,Uba38I,

Uba51I,Uba88I,Uba1098I,Uba1163I,Uba1167I,Uba1172I,Uba1173I,Uba1205I,Uba1224I

,Uba1242I,Uba1250I,Uba1258I,Uba1297I,Uba1302I,Uba1324I,Uba1325I,Uba1334I,Uba

1339I,Uba1346I,Uba1383I,Uba1398I,Uba1402I,Uba1414I,Uba4009I

<3>GGATCC

<4>

<5>Acetobacter aceti sub. liquefaciens

<6>IFO 12388

<7>

<8>Seurinck, J., van Montagu, M., Unpublished observations.

----------------------------------

 
--------- RestrictionEnzyme values --------

Name: AacI

RecognitionSite:ggatcc

ForwardRegex: g{2}atc{2}

ReverseRegex: g{2}atc{2}

CutType: 0 (RestrictionEnzyme.CUT_SIMPLE)

DownStreamEndType: 2

IsPalindromic: true

DownstreamCut: 1, 1,

-------------------------------------------

 
As you can see, AacI is treated as RestrictionEnzyme.CUT_SIMPLE and is given
a cut location, while the REBASE entry says that the cut location is unknown;
only the recognition site is known. So RestrictionEnzymeManager should also
filter out entries with an unknown cut location, otherwise it gives wrong data.

 
- Jesse

 
[Biojava-l] RestrictionEnzymeManager REBASE reader bug?

mark.schreiber at novartis.com mark.schreiber at novartis.com Tue Jun 21
22:22:52 EDT 2005 

 
Hello -

 
This is now checked in. All tests pass (no surprise as checking for null
never hurt anyone). This will make it into biojava1.4. If you want to add a
test to the Junit to ensure this stays fixed it would be most appreciated.

 
I also remember some discussion a while back about the behaviour of certain
enzymes with respect to their cleavage points which may or may not have
been a bug. Was this ever resolved? If so does anything need fixing?

 
Thanks.

 
- Mark

 

From patrick at bennour.de  Wed Jun 22 11:43:58 2005
From: patrick at bennour.de (Patrick Bennour)
Date: Wed Jun 22 11:35:21 2005
Subject: [Biojava-l] Looking for an Application to visualize Promoter
	Prediction results
Message-ID: <002a01c57741$354a0470$2101a8c0@windowsxp>

Dear All,

I am looking for an application that does at least some of the following.

Input: different promoter prediction analysis programs (like CpgProD, Eponine, FirstEF, McPromoter)
The application should then
- automatically parse the results
- visualize the results in an graphical diagram,
  that contains the input sequence
- visualize the different predictions in an comparative diagram
- combine some predictions to improve prediction quality

Thanks for your suggestions

From kturner at idtdna.com  Wed Jun 22 12:07:20 2005
From: kturner at idtdna.com (Keith Turner)
Date: Wed Jun 22 11:58:58 2005
Subject: [Biojava-l] Using SeqIOTools in a JNLP context
Message-ID: <03D1119D99B98D4D9762E01F1D4FB980010FA832@EXCHANGE.idtdna.com>

I've done that, and accepted the permissions, but it still doesn't seem to
like having streams passed between classes.  It works fine if I am working
with the stream in the same method that I got it in (by creating the
FileOpenService), but when I try to pass a FileContents, or its associated
InputStream or Readers, as a parameter in a method call, it does not work.
For example, when trying to write data to a file, the file gets created,
but no data is written to it.  Maybe this is a more appropriate question
for the JNLP developer community, but if any of you have any insight I'd
appreciate it.  Thanks for your reply, Andreas.
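
One workaround that might be worth trying, sketched here with plain java.io only: read the stream fully in the method that obtained it, then hand an in-memory reader to other classes instead of the original stream. Whether this avoids the behaviour described above under Web Start is untested and is only an assumption; the class and method names are hypothetical, and nothing here uses the javax.jnlp API.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Sketch: copy a stream into memory before passing its contents elsewhere. */
final class BufferedHandOff {

    /** Reads the whole stream into a byte array and closes the stream. */
    static byte[] drain(InputStream in) throws IOException {
        try (InputStream src = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = src.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Wraps the buffered bytes in a reader that no longer depends on the original stream. */
    static BufferedReader readerOver(byte[] data) {
        return new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        byte[] fasta = ">seq1\nacgt\n".getBytes(StandardCharsets.US_ASCII);
        byte[] copy = drain(new ByteArrayInputStream(fasta));
        System.out.println(readerOver(copy).readLine()); // >seq1
    }
}
```

The resulting BufferedReader could then be passed to a call such as SeqIOTools.readFastaDNA(br) from the quoted message below, at the cost of holding the whole file in memory.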


-----Original Message-----
From:	biojava-l-bounces@portal.open-bio.org on behalf of Andreas Prlic
Sent:	Tue 6/21/2005 5:08 PM
To:	<biojava-l@biojava.org> <biojava-l@biojava.org>
Cc:	
Subject:	Re: [Biojava-l] Using SeqIOTools in a JNLP context
Hi Keith,

You should get a java.security.AccessControlException: access denied
from webstart.
Accessing the filesystem from an application started with webstart 
requires special permission. This means you have to sign your 
application and the user has to permit the execution.

see e.g.
http://java.sun.com/docs/books/tutorial/security1.2/toolsign/signer.html
Cheers,
Andreas

On 21 Jun 2005, at 20:17, Keith Turner wrote:

> Hello-
>
> I am new to the list.  I enjoy working with the Biojava API, but a 
> problem has arisen for me, and I need some help with it.  I am 
> developing an application to be used in the Java Webstart framework, 
> and this brings with it some interesting file permission issues.  
> Basically, you use the JNLP interface FileOpenService to open a file 
> from within the secure "sandbox" environment, and then you can get an 
> InputStream out of that.
>
> So I want to take this InputStream (which presumably is from a Fasta 
> file), and read a DNA sequence from it.  However, all the methods that 
> worked when I was running my software as a Java application no longer 
> work in the JNLP environment.  In the past, I was doing:
>   InputStreamReader fr = new InputStreamReader(in);
>   BufferedReader br = new BufferedReader(fr);
>   SequenceIterator stream = SeqIOTools.readFastaDNA(br);
>   Sequence seq = stream.nextSequence();
> But the program freezes on the SeqIOTools.readFastaDNA(br) call.  No 
> exception is thrown back, it just does nothing.  Does anyone have any 
> suggestions as to how I can solve or work around this problem?  Thank 
> you very much
>
> -Keith Turner
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
>
>
-----------------------------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                               Hinxton, Cambridge CB10 1SA, UK
			 +44 (0) 1223 49 6891



From heuermh at acm.org  Wed Jun 22 12:31:48 2005
From: heuermh at acm.org (Michael Heuer)
Date: Wed Jun 22 12:24:30 2005
Subject: [Biojava-l] RestrictionEnzymeManager REBASE reader bug?
In-Reply-To: <OF3DA5F2C6.96D9383E-ON48257028.00042D20-48257028.00047BE8@EU.novartis.net>
Message-ID: <Pine.GSO.4.44.0506221228490.20312-100000@shell3.shore.net>


On Wed, 22 Jun 2005 mark.schreiber@novartis.com wrote:

> Oops. I was supposed to check that in.
>
> A bug tracking feature would be nice although I fear that the number of
> hands available to fix those tracked bugs might be severely limiting. If
> people know of good and free systems I could recommend them to the
> open-bio admins.
>
> There was some talk of a searchable mail archive a while ago (although
> google seems to do a pretty good job of indexing our mail). I'll try and
> follow it up.

I intend to speak to the open-bio folks while here at the BOSC conference
about an open-bio subversion repository and installing bugzilla or a
derivation thereof.

Any other admin-related issues?

   michael

From mark.schreiber at novartis.com  Wed Jun 22 21:01:12 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Wed Jun 22 20:52:45 2005
Subject: [Biojava-l] RestrictionEnzyme can't handle double sites
Message-ID: <OF596C3FFF.17F3B251-ON48257029.000590EF-48257029.00059B06@EU.novartis.net>

What would be your recommended solution to this problem?


"Jesse" <jesse-t@chello.nl>
Sent by: biojava-l-bounces@portal.open-bio.org
06/22/2005 11:05 PM

 
        To:     <biojava-l@biojava.org>
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] RestrictionEnzyme can't handle double sites


Another problem.

Some restriction enzymes have more than one recognition site. Usually this
can be written using ambiguous symbols, but for some restriction enzymes
this is not possible because the ambiguous symbols depend on each other.

Usually an ambiguous symbol is something like this:
ANNC
The first "N" is independent of the second "N". For example, it can match
with:
AAAC
AACC
AAGC
AATC
....
....
ATTC
16 possibilities. The ambiguous symbols are independent of each other.

But for some restriction enzymes, the ambiguous symbols depend on each
other. So a sequence like
ANNC
would then only match with:
AAAC
ACCC
AGGC
ATTC
Only 4 possibilities. The ambiguous symbols depend on each other.


This happens with these enzymes:
TaqII
M.PhiBssHII (unknown cutlocation)
M.Phi3TI (unknown cutlocation)
M.Rho11sI (unknown cutlocation)
M.SPBetaI (unknown cutlocation)
M.SPRI (unknown cutlocation)

<1>TaqII
<2>
<3>GACCGA(11/9),CACCCA(11/9)
<4>
<5>Thermus aquaticus YTI
<6>J.I. Harris
<7>X
<8>Barker, D., Hoff, M., Oliphant, A., White, R., (1984) Nucleic Acids 
Res.,
vol. 12, pp. 5567-5581.
Myers, P.A., Roberts, R.J., Unpublished observations.
Rutkowska, S.M., Jaworowska, I., Skowron, P.M., Unpublished observations.


RestrictionEnzymeManager takes only the last recognition site in this
example; it skips GACCGA.

Name: TaqII
RecognitionSite:caccca
ForwardRegex: cac{3}a
ReverseRegex: tg{3}tg
CutType: 0
DownStreamEndType: 0
IsPalindromic: false
DownstreamCut: 17, 15,


- Jesse


-----Original Message-----
From: biojava-l-bounces@portal.open-bio.org
[mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of Jesse
Sent: Wednesday 22 June 2005 12:09
To: biojava-l@biojava.org
Subject: RE: [Biojava-l] RestrictionEnzymeManager can't correctly handle
incomplete enzymes

(I'm not an expert on restriction enzymes.)

I was talking about AacI, of which BamHI is an isoschizomer. The cut
location of AacI is unknown, but the one for BamHI is known. Maybe
RestrictionEnzymeManager uses the cut location of BamHI when asked for the
unknown cut location of AacI.

http://rebase.neb.com/rebase/enz/AacI.html

That might also be the reason why RestrictionEnzymeManager requires links
between restriction enzymes. If a restriction enzyme entry is removed from
the REBASE file RestrictionEnzymeManager fails to read in some cases.

But I think using cutlocation of isoschizomers is wrong. Because of this:

REBASE says: "A isoschizomers is a restriction enzymes that recognize the
same DNA sequence. The cut sites may or may not be identical."

So the cut site might be different between different isoschizomers.

I searched for examples in the REBASE file, and found them:


<1>BspKT6I
<2>MboI,AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscF
I,BsmXII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,
Bsp60I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105
I,Bsp122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI
,BspJ64I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI,BstMBI,
BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I,Bth1786
I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,CacI,CcoP
31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,CjeP338I,C
paI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHCI,FnuAII
,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,LlaKR2I,Ls
p1109II,Mel3JI,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII,Mk
rAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciAI,N
deII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,PfaI
,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,SauEI,
SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu247
9I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074I,R
2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth368I,
TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,Uba
1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I
<3>GAT^C
<4>2(6)
<5>Bacillus species KT6
<6>N.I. Matvienko
<7>
<8>Shapovalova, N.I., Zheleznaja, L.A., Matvienko, N.I., (1993) Nucleic
Acids Res., vol. 21, pp. 5794.
Shapovalova, N.I., Zheleznaya, L.A., Matvienko, N.I., (1994) Biokhimiia,
vol. 59, pp. 1730-1738.

<1>MboI
<2>AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscFI,Bsm
XII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,Bsp60
I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105I,Bsp
122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI,BspJ
64I,BspKT6I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI,BstM
BI,BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I,Bth1
786I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,CacI,C
coP31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,CjeP338
I,CpaI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHCI,Fnu
AII,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,LlaKR2I
,Lsp1109II,Mel3JI,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII
,MkrAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciA
I,NdeII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,P
faI,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,Sau
EI,SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu
2479I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074
I,R2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth36
8I,TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,
Uba1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I
<3>^GATC
<4>2(6)
<5>Moraxella bovis
<6>ATCC 10900
<7>ACFGKNQRUVX
<8>Anton, B.P., Brooks, J.E., Unpublished observations.
Gelinas, R.E., Myers, P.A., Roberts, R.J., (1977) J. Mol. Biol., vol. 114,
pp. 169-179.
Huang, L.-H., Farnet, C.M., Ehrlich, K.C., Ehrlich, M., (1982) Nucleic 
Acids
Res., vol. 10, pp. 1579-1591.
Ueno, T., Ito, H., Kimizuka, F., Kotani, H., Nakajima, K., (1993) Nucleic
Acids Res., vol. 21, pp. 2309-2313.
Ueno, T., Ito, H., Kotani, H., Nakajima, K., Japanese Patent Office, 1993.

 
<1>Mel3JI
<2>MboI,AspMDI,AsuMBI,Bce243I,Bfi57I,BfiSHI,BfuCI,Bme12I,Bme2494I,BsaPI,BscF
I,BsmXII,BspI,Bsp9I,Bsp18I,Bsp49I,Bsp51I,Bsp52I,Bsp54I,Bsp57I,Bsp58I,Bsp59I,
Bsp60I,Bsp61I,Bsp64I,Bsp65I,Bsp66I,Bsp67I,Bsp72I,Bsp74I,Bsp76I,Bsp91I,Bsp105
I,Bsp122I,Bsp135I,Bsp136I,Bsp138I,Bsp143I,Bsp147I,Bsp2095I,BspAI,BspFI,BspJI
,BspJ64I,BspKT6I,BsrMI,BsrPII,BssGII,Bst19II,Bst1274I,BstEIII,BstENII,BstKTI
,BstMBI,BstXII,BtcI,Bth84I,Bth211I,Bth213I,Bth221I,Bth945I,Bth1140I,Bth1141I
,Bth1786I,Bth1997I,BthCanI,BtkII,Btu33I,Btu34I,Btu36I,Btu37I,Btu39I,Btu41I,C
acI,CcoP31I,CcoP76I,CcoP84I,CcoP95II,CcoP219I,CcyI,CdiCD6II,ChaI,Cin1467I,Cj
eP338I,CpaI,CpfI,CpfAI,Csp5I,Cte1179I,Cte1180I,CtyI,CviAI,CviHI,DpnII,EsaLHC
I,FnuAII,FnuCI,FnuEI,Gst1588II,HacI,HpyAIII,HpyHPK5II,Kzo9I,LlaAI,LlaDCHI,Ll
aKR2I,Lsp1109II,Mel5JI,Mel7JI,Mel4OI,Mel5OI,Mel2TI,Mel5TI,MeuI,MgoI,MjaIII,M
krAI,MmeII,Mmu5I,MmuP2I,MnoIII,MosI,Msp67II,MspBI,MthI,Mth1047I,MthAI,NciAI,
NdeII,NflI,NflAII,NflBI,NlaII,NlaDI,NmeCI,NphI,NsiAI,NspAI,NsuI,Pei9403I,Pfa
I,Pph288I,RalF40I,Rlu1I,SalAI,SalHI,Sau15I,Sau6782I,Sau3AI,SauCI,SauDI,SauEI
,SauFI,SauGI,SauMI,SinMI,SmiMBI,SsiAI,SsiBI,Ssu211I,Ssu212I,Ssu220I,R1.Ssu24
79I,R2.Ssu2479I,R1.Ssu4109I,R2.Ssu4109I,R1.Ssu4961I,R2.Ssu4961I,R1.Ssu8074I,
R2.Ssu8074I,R1.Ssu11318I,R2.Ssu11318I,R1.SsuDAT1I,R2.SsuDAT1I,SsuRBI,Sth368I
,TrsKTI,TrsSI,TrsTI,TruII,Tsp133I,Uba4I,Uba59I,Uba1101I,Uba1177I,Uba1182I,Ub
a1183I,Uba1204I,Uba1259I,Uba1317I,Uba1323I,Uba1366I,Vha44I
<3>GATC
<4>
<5>Megasphaera elsedenii 3J
<6>P. Pristas
<7>
<8>Piknova, M., Filova, M., Javorsky, P., Pristas, P., (2004) FEMS
Microbiol. Lett., vol. 236, pp. 91-95.
Piknova, M., Pristas, P., Javorsky, P., (2004) Folia Microbiol. (Praha),
vol. 49, pp. 191-193.


-----Original Message-----
From: mark.schreiber@novartis.com [mailto:mark.schreiber@novartis.com]
Sent: Wednesday 22 June 2005 11:25
To: Jesse
CC: biojava-l@biojava.org; biojava-l-bounces@portal.open-bio.org
Subject: Re: [Biojava-l] RestrictionEnzymeManager can't correctly handle
incomplete enzymes

 
I take your point, but I notice that BamHI is an isoschizomer. Is the
cleavage site of BamHI really unknown?

 
- Mark

 
"Jesse" <jesse-t@chello.nl>

Sent by: biojava-l-bounces@portal.open-bio.org

06/22/2005 04:15 PM

 
        To:     <biojava-l@biojava.org>
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] RestrictionEnzymeManager can't correctly
handle incomplete enzymes

 
RestrictionEnzymeManager can't correctly handle incomplete enzymes and gives
wrong data.

 
(Correct me if I'm wrong.)

 
I'm not sure if this is already discussed or not.

 
I think RestrictionEnzymeManager can not handle incomplete restriction
enzymes.

 
BioJava 1.4Pre2 knows two types of RestrictionEnzymes:

-RestrictionEnzyme.CUT_SIMPLE

-RestrictionEnzyme.CUT_COMPOUND

 
But in REBASE, there are also other restriction enzyme entries:

-Unknown recognition sites. For example "<3>?". RestrictionEnzymeManager
skips these (which is OK).

-Unknown cut location. For example AacI "<3>GGATCC".

 
The problem with RestrictionEnzymeManager lies with those REBASE entries
which have an unknown cut location. RestrictionEnzymeManager will actually
report a cut location, even though it is unknown in the REBASE file.

 
For example:

http://rebase.neb.com/rebase/link_withrefm

--------- REBASE ENTRY -----------
<1>AacI
<2>BamHI,AaeI,AcaII,AccEBI,AinII,AliI,Ali12257I,Ali12258I,ApaCI,AsiI,AspTII,
Atu1II,BamFI,BamKI,BamNI,Bca1259I,Bce751I,Bco10278I,BnaI,BsaDI,Bsp30I,Bsp46I
,Bsp90II,Bsp98I,Bsp130I,Bsp131I,Bsp144I,Bsp4009I,BspAAIII,BstI,Bst1126I,Bst2
464I,Bst2902I,BstQI,Bsu90I,Bsu8565I,Bsu8646I,BsuB519I,BsuB763I,CelI,DdsI,Gdo
I,GinI,GoxI,GseIII,GstI,MleI,Mlu23I,NasBI,Nsp29132II,NspSAIV,OkrAI,Pac1110I,
Pae177I,Pfl8I,Psp56I,RhsI,Rlu4I,RspLKII,SolI,SpvI,SurI,Uba19I,Uba31I,Uba38I,
Uba51I,Uba88I,Uba1098I,Uba1163I,Uba1167I,Uba1172I,Uba1173I,Uba1205I,Uba1224I
,Uba1242I,Uba1250I,Uba1258I,Uba1297I,Uba1302I,Uba1324I,Uba1325I,Uba1334I,Uba
1339I,Uba1346I,Uba1383I,Uba1398I,Uba1402I,Uba1414I,Uba4009I
<3>GGATCC
<4>
<5>Acetobacter aceti sub. liquefaciens
<6>IFO 12388
<7>
<8>Seurinck, J., van Montagu, M., Unpublished observations.
----------------------------------

 
--------- RestrictionEnzyme values --------
Name: AacI
RecognitionSite:ggatcc
ForwardRegex: g{2}atc{2}
ReverseRegex: g{2}atc{2}
CutType: 0 (RestrictionEnzyme.CUT_SIMPLE)
DownStreamEndType: 2
IsPalindromic: true
DownstreamCut: 1, 1,
-------------------------------------------

 
As you can see, AacI is treated as RestrictionEnzyme.CUT_SIMPLE and is given
a cut location, while the REBASE entry says that the cut location is unknown
and only the recognition site is known. So RestrictionEnzymeManager should
also filter out entries with an unknown cut location; otherwise it gives
wrong data.
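
A minimal sketch of such a filter, assuming that a known cut location always
shows up in the <3> field either as a caret inside the site (as in "GAT^C")
or as an offset pair in parentheses (as in "GACCGA(11/9)"). The class and
method names are made up for illustration and are not part of BioJava:

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not BioJava API): classifies a REBASE <3> field. */
final class RebaseSiteCheck {
    // Assumption: a cut position is written either as a caret inside the
    // site ("GAT^C") or as an offset pair in parentheses ("(11/9)").
    private static final Pattern CUT_INFO =
            Pattern.compile("\\^|\\(-?\\d+/-?\\d+\\)");

    /** True if the recognition site itself is unknown ("?"). */
    static boolean hasUnknownSite(String field3) {
        return field3.trim().equals("?");
    }

    /** True if the field carries an explicit cut position. */
    static boolean hasCutLocation(String field3) {
        return CUT_INFO.matcher(field3).find();
    }

    /** Only entries with a known site and a known cut position are usable. */
    static boolean isUsable(String field3) {
        String f = field3.trim();
        return !f.isEmpty() && !hasUnknownSite(f) && hasCutLocation(f);
    }

    public static void main(String[] args) {
        String[] examples = {"GGATCC", "^GATC", "GAT^C", "GACCGA(11/9)", "?"};
        for (String site : examples) {
            System.out.println(site + " -> usable: " + isUsable(site));
        }
    }
}
```

With a check like this, the AacI entry ("GGATCC") would be skipped in the
same way that "?" entries already are.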

 
- Jesse

 
[Biojava-l] RestrictionEnzymeManager REBASE reader bug?

mark.schreiber at novartis.com mark.schreiber at novartis.com Tue Jun 21
22:22:52 EDT 2005 

 
Hello -

 
This is now checked in. All tests pass (no surprise, as checking for null
never hurt anyone). This will make it into biojava1.4. If you want to add a
test to the JUnit suite to ensure this stays fixed, it would be most
appreciated.

 
I also remember some discussion a while back about the behaviour of certain
enzymes with respect to their cleavage points, which may or may not have
been a bug. Was this ever resolved? If so, does anything need fixing?

 
Thanks.

 
- Mark

 
_______________________________________________

Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l

 


From Yudong.Sun at newcastle.ac.uk  Sun Jun 26 05:42:08 2005
From: Yudong.Sun at newcastle.ac.uk (Y D Sun)
Date: Sun Jun 26 05:34:14 2005
Subject: [Biojava-l] BLAST Parser for extracting all BLAST data?
Message-ID: <E4258311FAA94940A57C29311D1165BDF547C5@largo.campus.ncl.ac.uk>

Hi,

I want to extract all data from BLASTP results. In the following hit,
for example, I need to get the lengths of query and subject proteins,
the identities (including all data 54, 124 and 43%), the positives (all
data 79, 124 and 63%), and the gaps (3, 124 and 2%). Can the
BLASTLikeSAXParser filter all these information? I can't find the
methods in SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs to
retrieve these data. Does Biojava provide any methods for this purpose?

Thanks,

George


BLASTP 2.2.5 [Nov-16-2002]

Query= Prot0001
         (138 letters)

Database: /work/nys1/fasta/protein/AE000782.pro.fasta
           2407 sequences; 662,866 total letters

Searching.....done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

Prot0002                                                           100   1e-23
Prot0003                                                            74   2e-15
Prot0004                                                            43   3e-06

>Prot0002
          Length = 138

 Score =  100 bits (250), Expect = 1e-23
 Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)

Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
           NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74

Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
             K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134

Query: 135 DQIK 138
           D +K
Sbjct: 135 DIVK 138

From hollandr at gis.a-star.edu.sg  Sun Jun 26 11:06:40 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Sun Jun 26 10:59:35 2005
Subject: [Biojava-l] RE: [BioSQL-l] update seqfeature 
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D56E562B5@BIONIC.biopolis.one-north.com>

Actually, BioJava is not that clever. Yet. Martina's original observation is right, in that the correct way to do this would be to check the database to see if the altered seqfeature already existed, and if it did, to refer to that one instead. But this is not the way BioJava does things at present. A fix for this will probably end up being built in to the replacement BioJava/BioSQL classes currently in progress, but for now, to delete/create the feature is probably the best workaround.

cheers,
Richard


-----Original Message-----
From:	biosql-l-bounces@portal.open-bio.org on behalf of Martina
Sent:	Wed 6/22/2005 5:24 PM
To:	simon.foote@nrc-cnrc.gc.ca
Cc:	biosql-l-bounces@portal.open-bio.org; BioJava; biosql-l@open-bio.org
Subject:	[BioSQL-l] update seqfeature 

Hi Simon,

I'm changing the FeatureSource, and in setFeatureSource an update on 
the source_term_id happens. If the combination is already 
there, I get an Exception. Would the proper way to deal with that be 
to get the seqfeature_id of the entry already there and use that, or 
to try to update the rank so that the combination becomes unique? Or should 
I rather not mess with BioJava and instead delete that entry and insert it 
as new, to let BioJava handle the rank increase?

Thanks for any advice

Martina

Simon Foote wrote:

> Hi Martina,
> 
> In fact you can, as rank is the field that allows this to happen.  In 
> Biojava, currently it's just a linearly incremented number such that 
> you can have the same type and source IDs for a given bioentry.
> 
> For example, adding a Genbank entry with 10 CDS features for 1 bioentry 
> will give you identical keys for bioentry_id, type_term_id and 
> source_term_id, but will have a rank of 1 - 10 for each.
> 
> Simon
> 

_______________________________________________
BioSQL-l mailing list
BioSQL-l@open-bio.org
http://open-bio.org/mailman/listinfo/biosql-l


From hollandr at gis.a-star.edu.sg  Sun Jun 26 11:11:30 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Sun Jun 26 11:04:09 2005
Subject: [Biojava-l] Re: [BioSQL-l] _removeSequence
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D56E562B6@BIONIC.biopolis.one-north.com>

The revamped BioJava/BioSQL classes will expose the rank to the user for all tables which have ranks.

cheers,
Richard


-----Original Message-----
From:	biosql-l-bounces@portal.open-bio.org on behalf of Simon Foote
Sent:	Wed 6/22/2005 12:15 AM
To:	Martina
Cc:	Hilmar Lapp; biosql-l-bounces@portal.open-bio.org; BioJava; biosql-l@open-bio.org
Subject:	Re: [Biojava-l] Re: [BioSQL-l] _removeSequence

Hi Martina,

In fact you can, as rank is the field that allows this to happen.  In 
Biojava, currently it's just a linearly incremented number such that 
you can have the same type and source IDs for a given bioentry.

For example, adding a Genbank entry with 10 CDS features for 1 bioentry 
will give you identical keys for bioentry_id, type_term_id and 
source_term_id, but will have a rank of 1 - 10 for each.
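
To make the role of rank concrete, here is a toy in-memory model of the
unique key described above (bioentry_id, type_term_id, source_term_id, rank).
It is only a sketch: the class is hypothetical, it touches no database, and
it is not how BioJava implements this internally.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the seqfeature unique key; not BioJava or BioSQL code. */
final class SeqFeatureKeyDemo {
    // Each stored key is (bioentry_id, type_term_id, source_term_id, rank).
    private final Set<List<Integer>> keys = new HashSet<>();
    // Highest rank handed out so far per (bioentry, type, source) triple.
    private final Map<List<Integer>, Integer> lastRank = new HashMap<>();

    /** Returns false where a database would raise a duplicate-key error. */
    boolean insert(int bioentryId, int typeTermId, int sourceTermId, int rank) {
        return keys.add(Arrays.asList(bioentryId, typeTermId, sourceTermId, rank));
    }

    /** Inserts with the next free, linearly incremented rank and returns it. */
    int insertWithNextRank(int bioentryId, int typeTermId, int sourceTermId) {
        List<Integer> triple = Arrays.asList(bioentryId, typeTermId, sourceTermId);
        int rank = lastRank.getOrDefault(triple, 0);
        do {
            rank++;
        } while (!insert(bioentryId, typeTermId, sourceTermId, rank));
        lastRank.put(triple, rank);
        return rank;
    }

    public static void main(String[] args) {
        SeqFeatureKeyDemo table = new SeqFeatureKeyDemo();
        // Ten CDS features on one bioentry: same type and source, ranks 1..10.
        for (int i = 0; i < 10; i++) {
            System.out.println("stored with rank " + table.insertWithNextRank(1, 20, 30));
        }
        // Re-using an existing rank collides, as in the update case.
        System.out.println("rank 1 again accepted: " + table.insert(1, 20, 30, 1));
    }
}
```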

Simon

Martina wrote:

> That means that I can't have 2 features referring to the same bioentry 
> with the same type (= type_term_id) and source (= source_term_id) but 
> different parent features, because of the composite key on bioentry_id in 
> the seqfeature table? Or what does "rank" in that table mean (it's part 
> of that key), and how can I get different ranks?
>
> Martina
>
> Hilmar Lapp wrote:
>
>> The Biojava people will respond to this. Note though that 
>> Term_Relationship is for storing subject-predicate-object triples of 
>> terms, so I'm not sure why you want to use it for storing/associating 
>> annotation. Maybe you meant bioentry_qualifier_value?
>>
>>     -hilmar
>>
>> On Jun 21, 2005, at 3:10 AM, Martina wrote:
>>
>>>
>>>> Yes. When you insert a sequence you must be prepared that when 
>>>> inserting its ontology term or tag/value annotation the term may 
>>>> already be present because another bioentry uses it too.
>>>
>>>
>>>
>>> Ok, the proper way is to catch the SQLException in BIOSQLFeature, 
>>> test if it is a duplicate key entry, get the identifier of the term 
>>> (would that be the BioSQLfeatureId ?) and insert it in the 
>>> term_relationship table? And there is no nice BioJava method for 
>>> this, I have to do it "manually", like conn.prepareStatement(..) and 
>>> stuff?  BioJava spoiled me so!
>>>
>>> Martina
>>>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l


-- 
Bioinformatics Programmer
Pathogen Genomics
Institute for Biological Sciences
National Research Council of Canada
[T] 613-990-0561  [F] 613-952-9092
simon.foote@nrc-cnrc.gc.ca

_______________________________________________
BioSQL-l mailing list
BioSQL-l@open-bio.org
http://open-bio.org/mailman/listinfo/biosql-l


From hollandr at gis.a-star.edu.sg  Sun Jun 26 11:33:14 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Sun Jun 26 11:26:10 2005
Subject: [Biojava-l] BLAST Parser for extracting all BLAST data?
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D56E562BB@BIONIC.biopolis.one-north.com>

BioJava's BLAST framework parses files and fires events for every piece of information it finds. The SeqSimilarityAdapter class is an example of how to catch these events and construct basic BLAST result objects (SimpleSeqSimilarityHit), however they are not comprehensive and do not record full details of every hit.

If you want the kind of detail you mention below you will have to write your own content handler for BLAST parsing and pass it to the BLASTLikeSAXParser when parsing a file. This event handler should implement the ContentHandler interface. Look at the source of SeqSimilarityAdapter for guidance. You will then receive events for every part of the file, from which you can construct your own custom BLAST result objects to describe them.

If you're not sure what tag names to listen for in your ContentHandler the easiest thing to do is just run it once and dump them all out to see what you get.
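
As a rough sketch of the "dump them all out" approach, the handler below
records and prints every element name it is told about. To keep it runnable
without BioJava, it is driven here by the JDK's own SAX parser over a tiny
XML string whose element names are invented placeholders, not the names the
BLAST parser emits. In real use the same handler would presumably be handed
to the BLAST parser in place of SeqSimilarityAdapter.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Records and prints each element name as its start event arrives. */
final class TagDumpHandler extends DefaultHandler {
    final List<String> seen = new ArrayList<>();
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");            // indent by nesting depth
        }
        System.out.println(line.append(qName));
        seen.add(qName);
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    /** Stand-in driver: parses an XML string with the JDK SAX parser. */
    static List<String> dump(String xml) {
        TagDumpHandler handler = new TagDumpHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.seen;
    }

    public static void main(String[] args) {
        // Placeholder element names, for illustration only.
        dump("<Report><Hit><Summary/></Hit><Hit><Summary/></Hit></Report>");
    }
}
```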

cheers,
Richard


-----Original Message-----
From:	biojava-l-bounces@portal.open-bio.org on behalf of Y D Sun
Sent:	Sun 6/26/2005 5:42 PM
To:	biojava-l@biojava.org
Cc:	
Subject:	[Biojava-l] BLAST Parser for extracting all BLAST data?

Hi,

I want to extract all data from BLASTP results. In the following hit,
for example, I need to get the lengths of query and subject proteins,
the identities (including all data 54, 124 and 43%), the positives (all
data 79, 124 and 63%), and the gaps (3, 124 and 2%). Can the
BLASTLikeSAXParser filter all these information? I can't find the
methods in SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs to
retrieve these data. Does Biojava provide any methods for this purpose?

Thanks,

George


BLASTP 2.2.5 [Nov-16-2002]

Query= Prot0001
         (138 letters)

Database: /work/nys1/fasta/protein/AE000782.pro.fasta
           2407 sequences; 662,866 total letters

Searching.....done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

Prot0002                                                           100   1e-23
Prot0003                                                            74   2e-15
Prot0004                                                            43   3e-06

>Prot0002
          Length = 138

 Score =  100 bits (250), Expect = 1e-23
 Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)

Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
           NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74

Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
             K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134

Query: 135 DQIK 138
           D +K
Sbjct: 135 DIVK 138

_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From tblum at andrew.cmu.edu  Sun Jun 26 12:10:23 2005
From: tblum at andrew.cmu.edu (Tal Blum)
Date: Sun Jun 26 12:01:41 2005
Subject: [Biojava-l] Psi-Blast results
Message-ID: <200506261610.j5QGAOGk016520@smtp.andrew.cmu.edu>

Hi,

I need to get Psi-Blast results for a large dataset of proteins. Does anyone
know if there are any free tools or classes to do that?

Thanks, Tal

From hollandr at gis.a-star.edu.sg  Sun Jun 26 14:55:22 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Sun Jun 26 14:47:45 2005
Subject: [Biojava-l] Psi-Blast results
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D56E562BD@BIONIC.biopolis.one-north.com>

The standard BLAST parser in BioJava cannot understand PsiBLAST output as far as I'm aware. I haven't used PsiBLAST much so I don't know if you can change its output format, but if you can persuade it to output its results in NCBI BLAST format instead, then you might have more luck.

BioPerl most definitely does have functions for the job.

cheers,
Richard


-----Original Message-----
From:	biojava-l-bounces@portal.open-bio.org on behalf of Tal Blum
Sent:	Mon 6/27/2005 12:10 AM
To:	biojava-l@biojava.org
Cc:	
Subject:	[Biojava-l] Psi-Blast results

Hi,

I need to get Psi-Blast results for a large dataset of proteins. Does anyone
know if there are any free tools or classes to do that?

Thanks, Tal

_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From heuermh at acm.org  Mon Jun 27 00:36:01 2005
From: heuermh at acm.org (Michael Heuer)
Date: Mon Jun 27 00:26:23 2005
Subject: [Biojava-l] BOSC 2005 lightning talks
Message-ID: <Pine.GSO.4.44.0506270022550.10775-100000@shell3.shore.net>

Hello,

The presentations for the lightning talks on biojava and the sourceforge
stax I gave at BOSC 2005 are available (temporarily) from

> http://shore.net/~heuermh/biojava-24jun2005.ppt

and

> http://shore.net/~heuermh/stax-24jun2005.ppt

respectively.

Please let me know if I should make any corrections before making them
more widely available.  I would like to move them to the public
biojava.org and stax.sf.net project websites in a few days.

   michael

From michael.tran at acpfg.com.au  Mon Jun 27 01:19:33 2005
From: michael.tran at acpfg.com.au (Michael Tran)
Date: Mon Jun 27 01:09:01 2005
Subject: [Biojava-l] Nucleotide translation
Message-ID: <D3C986810EBCCE488A9DCA95463794F104AD30@kaneda.acpfg.local>

Dear Members
 
I'm a newbie to BioJava.
I'm looking for a Class file that can translate a nucleotide sequence into its 6 reading frames and then into protein.
I'm finding the BioJava API difficult to navigate.
Help is much appreciated.
 
Cheers
Kally
 
 
From rahul at genebrew.com  Mon Jun 27 01:36:04 2005
From: rahul at genebrew.com (Rahul Karnik)
Date: Mon Jun 27 01:25:42 2005
Subject: [Biojava-l] Nucleotide translation
In-Reply-To: <D3C986810EBCCE488A9DCA95463794F104AD30@kaneda.acpfg.local>
References: <D3C986810EBCCE488A9DCA95463794F104AD30@kaneda.acpfg.local>
Message-ID: <42BF9044.4080500@genebrew.com>

Michael Tran wrote:

> I'm looking for a Class file that can translate a nucleotide sequence into its 6 reading frames and then into protein.
> I'm finding the BioJava API difficult to navigate.

http://www.biojava.org/docs/bj_in_anger/Translation.htm

In general, for help with BioJava, you should first look at the BioJava 
in Anger site at http://www.biojava.org/docs/bj_in_anger/.

Hope that helps,
Rahul
From mark.schreiber at novartis.com  Mon Jun 27 01:41:46 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Mon Jun 27 01:33:08 2005
Subject: [Biojava-l] Nucleotide translation
Message-ID: <OFAEE5D316.A89B3CE9-ON4825702D.001F4168-4825702D.001F4AAA@EU.novartis.net>

To do a six frame translation have a look at:

http://www.biojava.org/docs/bj_in_anger/sixframetranslate.html
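
For orientation, here is a library-free sketch of what a six-frame
translation involves: three reading frames on the given strand plus three on
its reverse complement. It is not the cookbook code and uses no BioJava
classes; it assumes plain A/C/G/T input and the standard genetic code, and
the class name is hypothetical.

```java
/** Plain-Java six-frame translation sketch (standard genetic code). */
final class SixFrameSketch {
    // Codon table indexed as 16*first + 4*second + third, bases ordered TCAG.
    private static final String BASES = "TCAG";
    private static final String AMINO =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (Character.toUpperCase(dna.charAt(i))) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('N');   // anything else stays unknown
            }
        }
        return sb.toString();
    }

    /** Translates complete codons starting at offset 0, 1 or 2. */
    static String translate(String dna, int offset) {
        String s = dna.toUpperCase();
        StringBuilder protein = new StringBuilder();
        for (int i = offset; i + 3 <= s.length(); i += 3) {
            int a = BASES.indexOf(s.charAt(i));
            int b = BASES.indexOf(s.charAt(i + 1));
            int c = BASES.indexOf(s.charAt(i + 2));
            // Codons with non-ACGT symbols become 'X'; '*' marks a stop.
            protein.append(a < 0 || b < 0 || c < 0
                    ? 'X' : AMINO.charAt(16 * a + 4 * b + c));
        }
        return protein.toString();
    }

    /** Frames 0-2 are the forward strand, frames 3-5 the reverse complement. */
    static String[] sixFrames(String dna) {
        String rc = reverseComplement(dna);
        String[] frames = new String[6];
        for (int f = 0; f < 3; f++) {
            frames[f] = translate(dna, f);
            frames[f + 3] = translate(rc, f);
        }
        return frames;
    }

    public static void main(String[] args) {
        String[] frames = sixFrames("ATGGCCTAA");
        for (int f = 0; f < frames.length; f++) {
            System.out.println("frame " + f + ": " + frames[f]);
        }
    }
}
```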


Rahul Karnik <rahul@genebrew.com>
Sent by: biojava-l-bounces@portal.open-bio.org
06/27/2005 01:36 PM

 
        To:     Michael Tran <michael.tran@acpfg.com.au>
        cc:     biojava-l@biojava.org, (bcc: Mark Schreiber/GP/Novartis)
        Subject:        Re: [Biojava-l] Nucleotide translation


Michael Tran wrote:

> I'm looking for a Class file that can translate a nucleotide sequence
> into its 6 reading frames and then into protein.
> I'm finding the BioJava API difficult to navigate.

http://www.biojava.org/docs/bj_in_anger/Translation.htm

In general, for help with BioJava, you should first look at the BioJava 
in Anger site at http://www.biojava.org/docs/bj_in_anger/.

Hope that helps,
Rahul
_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


From jesse-t at chello.nl  Tue Jun 28 04:46:09 2005
From: jesse-t at chello.nl (Jesse)
Date: Tue Jun 28 04:37:27 2005
Subject: [Biojava-l] RestrictionEnzyme can't handle double sites
Message-ID: <20050628084604.95E6B2E01D@rbox4.erasmusmc.nl>

I think a solution requires the RestrictionEnzyme class to be changed.

Maybe change getRecognitionSite() to return an array of Strings or
SymbolLists instead of a single String?

-Jesse


-----------------------------------
mark.schreiber at novartis.com mark.schreiber at novartis.com 
Wed Jun 22 21:01:12 EDT 2005

What would be your recommended solution to this problem?


"Jesse" <jesse-t at chello.nl>
Sent by: biojava-l-bounces at portal.open-bio.org
06/22/2005 11:05 PM

 
        To:     <biojava-l at biojava.org>
        cc:     (bcc: Mark Schreiber/GP/Novartis)
        Subject:        [Biojava-l] RestrictionEnzyme can't handle double
sites


Another problem.

Some restriction enzymes have more than one recognition site. Usually this
can be written using ambiguous symbols, but for some restriction enzymes
this is not possible, because the ambiguous symbols depend on each other.

Usually an ambiguous symbol is something like this:
ANNC
The first "N" is independent of the second "N". For example, it can match
with:
AAAC
AACC
AAGC
AATC
....
....
ATTC
16 possibilities. The ambiguous symbols are independent of each other.

But in some restriction enzymes, the ambiguous symbols depend on each
other. So a sequence like
ANNC
would then only match:
AAAC
ACCC
AGGC
ATTC
Only 4 possibilities. The ambiguous symbols are dependent on each other.
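
The difference can be sketched with plain java.util.regex patterns (not
BioJava's own pattern classes): two independent character classes give the
16 matches, while a back-reference ties the second N to the first and leaves
only 4. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Compares independent and linked ambiguity for the ANNC example. */
final class LinkedAmbiguityDemo {
    // Each N may be any base, chosen independently.
    static final Pattern INDEPENDENT = Pattern.compile("A[ACGT][ACGT]C");
    // \1 forces the second N to repeat whatever the first N matched.
    static final Pattern LINKED = Pattern.compile("A([ACGT])\\1C");

    /** Tries every A??C candidate and returns those the pattern accepts. */
    static List<String> matches(Pattern pattern) {
        List<String> hits = new ArrayList<>();
        String bases = "ACGT";
        for (char x : bases.toCharArray()) {
            for (char y : bases.toCharArray()) {
                String candidate = "A" + x + y + "C";
                if (pattern.matcher(candidate).matches()) {
                    hits.add(candidate);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("independent: " + matches(INDEPENDENT).size());
        System.out.println("linked: " + matches(LINKED));
    }
}
```

A back-reference only covers the "same base again" case; a pair of sites
that differ in some other linked way would still need to be listed
separately.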


This happens with these enzymes:
TaqII
M.PhiBssHII (unknown cutlocation)
M.Phi3TI (unknown cutlocation)
M.Rho11sI (unknown cutlocation)
M.SPBetaI (unknown cutlocation)
M.SPRI (unknown cutlocation)

<1>TaqII
<2>
<3>GACCGA(11/9),CACCCA(11/9)
<4>
<5>Thermus aquaticus YTI
<6>J.I. Harris
<7>X
<8>Barker, D., Hoff, M., Oliphant, A., White, R., (1984) Nucleic Acids Res.,
vol. 12, pp. 5567-5581.
Myers, P.A., Roberts, R.J., Unpublished observations.
Rutkowska, S.M., Jaworowska, I., Skowron, P.M., Unpublished observations.


In this example RestrictionEnzymeManager takes only the last recognition
site; it skips GACCGA.

Name: TaqII
RecognitionSite:caccca
ForwardRegex: cac{3}a
ReverseRegex: tg{3}tg
CutType: 0
DownStreamEndType: 0
IsPalindromic: false
DownstreamCut: 17, 15,
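
A sketch of how a comma-separated <3> field could be split so that no site
is dropped. The class is hypothetical and is not the RestrictionEnzymeManager
parser; it assumes every site in the field has the form SITE(n/m), as in the
TaqII entry above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for a REBASE <3> field holding several sites. */
final class MultiSiteField {
    /** One recognition site with the two offsets from its "(n/m)" suffix. */
    static final class Site {
        final String sequence;
        final int forwardOffset;
        final int reverseOffset;

        Site(String sequence, int forwardOffset, int reverseOffset) {
            this.sequence = sequence;
            this.forwardOffset = forwardOffset;
            this.reverseOffset = reverseOffset;
        }

        @Override
        public String toString() {
            return sequence + "(" + forwardOffset + "/" + reverseOffset + ")";
        }
    }

    // Assumed layout of each comma-separated part: SITE(n/m).
    private static final Pattern SITE =
            Pattern.compile("([A-Z]+)\\((-?\\d+)/(-?\\d+)\\)");

    /** Returns every site in the field, in file order. */
    static List<Site> parse(String field3) {
        List<Site> sites = new ArrayList<>();
        for (String part : field3.split(",")) {
            Matcher m = SITE.matcher(part.trim());
            if (m.matches()) {
                sites.add(new Site(m.group(1),
                        Integer.parseInt(m.group(2)),
                        Integer.parseInt(m.group(3))));
            }
        }
        return sites;
    }

    public static void main(String[] args) {
        // Prints both TaqII sites instead of only the last one.
        System.out.println(parse("GACCGA(11/9),CACCCA(11/9)"));
    }
}
```

The DownstreamCut values 17 and 15 reported above look like the site length
(6) plus these offsets (11 and 9), but that reading is an assumption.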


- Jesse

From great_fred at yahoo.com  Tue Jun 28 05:11:12 2005
From: great_fred at yahoo.com (=?iso-8859-1?q?S=E9bastien=20PETIT?=)
Date: Tue Jun 28 05:02:40 2005
Subject: [Biojava-l] BLAST Parser for extracting all BLAST data?
In-Reply-To: <6D9E9B9DF347EF4385F6271C64FB8D56E562BB@BIONIC.biopolis.one-north.com>
Message-ID: <20050628091112.34256.qmail@web32201.mail.mud.yahoo.com>

Hi, everybody...

Like George, I want to extract data from BLAST files.
I can get the aligned sequences, no problem. But now I want the match line
between the 2 sequences (the line with "+" signs and some letters in
George's example), because with it we can see at a glance whether the
alignment between the 2 sequences is really good or not.

Is it possible, Docs??

Thank you.

Sebastien

--- Richard HOLLAND <hollandr@gis.a-star.edu.sg> wrote:

> BioJava's BLAST framework parses files and fires events for every
> piece of information it finds. The SeqSimilarityAdapter class is an
> example of how to catch these events and construct basic BLAST result
> objects (SimpleSeqSimilarityHit), however they are not comprehensive
> and do not record full details of every hit.
> 
> If you want the kind of detail you mention below you will have to
> write your own content handler for BLAST parsing and pass it to the
> BLASTLikeSAXParser when parsing a file. This event handler should
> implement the ContentHandler interface. Look at the source of
> SeqSimilarityAdapter for guidance. You will then receive events for
> every part of the file, from which you can construct your own custom
> BLAST result objects to describe them.
> 
> If you're not sure what tag names to listen for in your
> ContentHandler the easiest thing to do is just run it once and dump
> them all out to see what you get.
> 
> cheers,
> Richard
> 
> 
> -----Original Message-----
> From:	biojava-l-bounces@portal.open-bio.org on behalf of Y D Sun
> Sent:	Sun 6/26/2005 5:42 PM
> To:	biojava-l@biojava.org
> Cc:	
> Subject:	[Biojava-l] BLAST Parser for extracting all BLAST data?
> 
> Hi,
> 
> I want to extract all data from BLASTP results. In the following hit,
> for example, I need to get the lengths of query and subject proteins,
> the identities (including all data 54, 124 and 43%), the positives (all
> data 79, 124 and 63%), and the gaps (3, 124 and 2%). Can the
> BLASTLikeSAXParser filter all these information? I can't find the
> methods in SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs to
> retrieve these data. Does Biojava provide any methods for this purpose?
> 
> Thanks,
> 
> George
> 
> 
> BLASTP 2.2.5 [Nov-16-2002]
> 
> Query= Prot0001
>          (138 letters)
> 
> Database: /work/nys1/fasta/protein/AE000782.pro.fasta
>            2407 sequences; 662,866 total letters
> 
> Searching.....done
> 
>                                                                  Score    E
> Sequences producing significant alignments:                      (bits) Value
> 
> Prot0002                                                           100   1e-23
> Prot0003                                                            74   2e-15
> Prot0004                                                            43   3e-06
> 
> >Prot0002
>           Length = 138
> 
>  Score =  100 bits (250), Expect = 1e-23
>  Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)
> 
> Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
>            NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
> Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74
> 
> Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
>              K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
> Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134
> Query: 135 DQIK 138
>            D +K
> Sbjct: 135 DIVK 138
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 
> 
> 
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 


From Yudong.Sun at newcastle.ac.uk  Tue Jun 28 06:13:25 2005
From: Yudong.Sun at newcastle.ac.uk (Y D Sun)
Date: Tue Jun 28 06:05:43 2005
Subject: [Biojava-l] BLAST Parser for extracting all BLAST data?
Message-ID: <E4258311FAA94940A57C29311D1165BDF547C6@largo.campus.ncl.ac.uk>

Hi,

With the example, I can extract all information I require except the
length of query sequence. Is there any "hidden" method that can report
the query length in parenthesis as (138 letters) in the sample output
below?

BTW, the addSubHitProperty() method doesn't report the Gaps data.
Fortunately, I don't need it at the moment.
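
If the parser events do not carry these two values, one fallback is to pull
them straight from the text report with regular expressions. This is only a
sketch that bypasses BioJava altogether; the class name is hypothetical and
the patterns assume the BLASTP 2.2.5 layout shown below.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Regex fallback for two values read directly from a BLAST text report. */
final class BlastTextStats {
    // "(138 letters)" under the Query= line; the database line has no parentheses.
    private static final Pattern LETTERS =
            Pattern.compile("\\(([\\d,]+) letters\\)");
    // "Gaps = 3/124 (2%)" on the Identities line of an alignment.
    private static final Pattern GAPS =
            Pattern.compile("Gaps = (\\d+)/(\\d+) \\((\\d+)%\\)");

    /** Query length from the first "(N letters)", or -1 if none is found. */
    static int queryLength(String report) {
        Matcher m = LETTERS.matcher(report);
        return m.find() ? Integer.parseInt(m.group(1).replace(",", "")) : -1;
    }

    /** {gaps, alignment length, percent} of the first hit, or null if absent. */
    static int[] firstGaps(String report) {
        Matcher m = GAPS.matcher(report);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        String report = "Query= Prot0001\n         (138 letters)\n\n"
                + "           2407 sequences; 662,866 total letters\n\n"
                + " Identities = 54/124 (43%), Positives = 79/124 (63%),"
                + " Gaps = 3/124 (2%)\n";
        int[] gaps = firstGaps(report);
        System.out.println("query length: " + queryLength(report));
        System.out.println("gaps: " + gaps[0] + "/" + gaps[1] + " (" + gaps[2] + "%)");
    }
}
```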

Thanks,

George

>-----Original Message-----
>From: mark.schreiber@novartis.com [mailto:mark.schreiber@novartis.com] 
>Sent: 27 June 2005 03:25
>To: Y D Sun
>Subject: Re: [Biojava-l] BLAST Parser for extracting all BLAST data?
>
>Hello -
>
>Take a look at the Blast examples in biojava in anger (follow 
>the cookbook link from the biojava.org page).
>
>In particular look at
>http://www.biojava.org/docs/bj_in_anger/blastecho.htm
>
>The example program will tell you which methods are being 
>called for what information and will give you some clues as to 
>where everything ends up.
>
>- Mark
>
>
>
>
>
>"Y D Sun" <Yudong.Sun@newcastle.ac.uk>
>Sent by: biojava-l-bounces@portal.open-bio.org
>06/26/2005 05:42 PM
>
> 
>        To:     <biojava-l@biojava.org>
>        cc:     (bcc: Mark Schreiber/GP/Novartis)
>        Subject:        [Biojava-l] BLAST Parser for 
>extracting all BLAST data?
>
>
>Hi,
>
>I want to extract all data from BLASTP results. In the 
>following hit, for example, I need to get the lengths of query 
>and subject proteins, the identities (including all data 54, 
>124 and 43%), the positives (all data 79, 124 and 63%), and 
>the gaps (3, 124 and 2%). Can the BLASTLikeSAXParser filter 
>all these information? I can't find the methods in 
>SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs to 
>retrieve these data. Does Biojava provide any methods for this purpose?
>
>Thanks,
>
>George
>
>
>BLASTP 2.2.5 [Nov-16-2002]
>
>Query= Prot0001
>         (138 letters)
>
>Database: /work/nys1/fasta/protein/AE000782.pro.fasta
>           2407 sequences; 662,866 total letters
>
>Searching.....done
>
>                                                                 Score    E
>Sequences producing significant alignments:                      (bits) Value
>
>Prot0002                                                           100   1e-23
>Prot0003                                                            74   2e-15
>Prot0004                                                            43   3e-06
>
>>Prot0002
>          Length = 138
>
> Score =  100 bits (250), Expect = 1e-23  Identities = 54/124 
>(43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)
>
>Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY
>77
>           NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
>Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK
>74
>
>Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII
>134
>             K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
>Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT
>134
>
>Query: 135 DQIK 138
>           D +K
>Sbjct: 135 DIVK 138
>
>_______________________________________________
>Biojava-l mailing list  -  Biojava-l@biojava.org 
>http://biojava.org/mailman/listinfo/biojava-l
>
>
>
>

From great_fred at yahoo.com  Tue Jun 28 07:34:17 2005
From: great_fred at yahoo.com (=?iso-8859-1?q?S=E9bastien=20PETIT?=)
Date: Tue Jun 28 07:26:09 2005
Subject: [Biojava-l] BLAST Parser for extracting all BLAST data?
In-Reply-To: <OF5FA9D1D9.B07D4AEF-ON4825702E.00335DB4-4825702E.00339EA4@EU.novartis.net>
Message-ID: <20050628113417.86358.qmail@web32207.mail.mud.yahoo.com>

Arggh! I didn't find what I wanted!

I used the program you gave me, with a slight modification because it
didn't recognize my XML file: the parser is now a BlastXMLParserFacade.
It gave me everything it found in the file, but not what I want. >:(

There is a tag in my XML file that encloses what I'm looking for:
<Hsp_midline>. Why doesn't the parser see it?

I don't really understand how the XML parser works. How can I modify it
to get what I need?

Please help me, Doc! ;)

Thanks for everything.

Sebastien

--- mark.schreiber@novartis.com wrote:

> Hi -
> 
> Try running this program:
> http://www.biojava.org/docs/bj_in_anger/blastecho.htm
> 
> If you see what you need in the output, then it is being read by the
> Blast parser and emitted as an event (which you could listen for). If
> it isn't, then the Blast parser is not emitting those events, although
> someone confident with the BLAST format could probably modify it so
> that it does.
> 
> In short, it is possible but it might not be implemented ; )
> 
> - Mark
> 
> 
> 
> 
> 
> Sébastien PETIT <great_fred@yahoo.com>
> Sent by: biojava-l-bounces@portal.open-bio.org
> 06/28/2005 05:11 PM
> 
>  
>         To:     biojava-l@biojava.org
>         cc:     (bcc: Mark Schreiber/GP/Novartis)
>         Subject:        RE: [Biojava-l] BLAST Parser for extracting
> all BLAST data?
> 
> 
> Hi, everybody,
> 
> Like George, I want to extract data from BLAST files. I can get the
> alignments, no problem. But now I also want the line between the two
> sequences (the lines with "+", "-" and some letters in George's
> example), because it lets us see at a glance whether the alignment
> between the two sequences is really good or not.
> 
> Is it possible, Docs?
> 
> Thank you.
> 
> Sebastien
> 
> --- Richard HOLLAND <hollandr@gis.a-star.edu.sg> wrote:
> 
> > BioJava's BLAST framework parses files and fires events for every
> > piece of information it finds. The SeqSimilarityAdapter class is an
> > example of how to catch these events and construct basic BLAST
> > result objects (SimpleSeqSimilaritySearchHit); however, these are
> > not comprehensive and do not record the full details of every hit.
> > 
> > If you want the kind of detail you mention below, you will have to
> > write your own content handler for BLAST parsing and pass it to the
> > BLASTLikeSAXParser when parsing a file. This event handler should
> > implement the ContentHandler interface. Look at the source of
> > SeqSimilarityAdapter for guidance. You will then receive events for
> > every part of the file, from which you can construct your own
> > custom BLAST result objects to describe them.
> > 
> > If you're not sure what tag names to listen for in your
> > ContentHandler, the easiest thing to do is to run it once and dump
> > them all out to see what you get.
> > 
> > cheers,
> > Richard
> > 
> > 


From cgarnier at ttz-Bremerhaven.de  Tue Jun 28 08:03:48 2005
From: cgarnier at ttz-Bremerhaven.de (BIBIS, Garnier, Christophe)
Date: Tue Jun 28 07:53:45 2005
Subject: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?
Message-ID: <BF52B6AA9196D71195C80030052FDA9D46588C@TTZBN>


If you can't find what you need through BioJava, you can always write a
small XML parser with, for example, JDOM:

1 - download jdom.jar
2 - use the following code to find <Hsp_midline>
3 - replace the path of the XML file in the main method
4 - run it; it prints out every element it finds


I hope this helps.

Best,
Christophe

+++++++++++++++++++++++++++++++++++++

import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class JDomParser
{

	private static void parseResults(Element iterations)
	{
		System.out.println("*** parseResults ***") ;
		
		Element it = iterations.getChild("Iteration") ;
		
		List elts = it.getChildren();
		
		Iterator iterator = elts.iterator();
		
		while (iterator.hasNext())
		{
			Element child = (Element) iterator.next();

			System.out.println(child + " - " + child.getText() + " - "
					+ child.getName());

			if ( child.getName().equals("Iteration_hits"))
			{
				parseHits(child) ;
			}
			
			if ( child.getName().equals("Iteration_stat"))
			{
				parseStatistics(child) ;
			}
			
		
		}
	}

	private static void parseHits(Element element)
	{
		List elts = element.getChildren();
		
		Iterator iterator = elts.iterator();
		
		while (iterator.hasNext())
		{
			Element child = (Element) iterator.next();

			printElt(child) ;
			
			parseHit(child) ;
			
		}
	}
	
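	private static void parseHspHit(Element element)
	{
		// A hit may contain several <Hsp> elements; visit each one,
		// not only the first.
		Iterator hspIterator = element.getChildren("Hsp").iterator();

		while (hspIterator.hasNext())
		{
			Element hsp = (Element) hspIterator.next();

			// Print every field of this HSP, including <Hsp_midline>.
			Iterator iterator = hsp.getChildren().iterator();

			while (iterator.hasNext())
			{
				Element child = (Element) iterator.next();

				printElt(child) ;
			}
		}
	}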
	
	private static void printElt(Element elt)
	{
		System.out.println("Element: [" + elt.getName() + "] - text:"
				+ elt.getText() ) ;
	}
	
	private static void parseHit(Element element)
	{
		List elts = element.getChildren();
		
		Iterator iterator = elts.iterator();
		
		while (iterator.hasNext())
		{
			Element child = (Element) iterator.next();

			printElt(child) ;
			
			if (child.getName().equals("Hit_hsps"))
					{
					parseHspHit(child) ;
					}
			
		}
	}
	
	
	private static void parseStatistics(Element element)
	{
		Element stat = element.getChild("Statistics") ;
		
		List elts = stat.getChildren();
		
		Iterator iterator = elts.iterator();
		
		while (iterator.hasNext())
		{
			Element child = (Element) iterator.next();

			printElt(child) ;
			
		}
		
	}
	
	
	public static void parseFile(File file) throws JDOMException, IOException
	{
		SAXBuilder parser = new SAXBuilder();
		Document doc = parser.build(file);

		Element root = doc.getRootElement();

		List elts = root.getChildren();
		Iterator iterator = elts.iterator();

		while (iterator.hasNext())
		{

			Element child = (Element) iterator.next();

			printElt(child) ;

			if (child.getName().equals("BlastOutput_iterations"))
				parseResults(child);

		}

	}

	/**
    * @param args
    */
	public static void main(String[] args)
	{
		File f = new File("E:/result.xml");

		try
		{
			parseFile(f);
		}
		catch (JDOMException e)
		{
			e.printStackTrace();
		}
		catch (IOException e)
		{
			e.printStackTrace();
		}
	}

}


+++++++++++++++++++++++++++++++++++++
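
If only the midline is needed, printing every element is more than the
job requires. The following is a minimal sketch of a narrower approach,
assuming the BLAST XML tag names used above. It relies only on the JDK's
SAX classes, not on JDOM or BioJava, and the class name MidlineCollector
and its collect method are made up for illustration. The sample input has
no DOCTYPE line, so the sketch does not deal with DTD loading.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of BioJava or JDOM): collects the text of
// every <Hsp_midline> element in BLAST XML output.
class MidlineCollector extends DefaultHandler
{
    private final List<String> midlines = new ArrayList<String>();
    private StringBuilder current;   // non-null only while inside <Hsp_midline>

    public void startElement(String uri, String localName, String qName, Attributes atts)
    {
        if (qName.equals("Hsp_midline"))
            current = new StringBuilder();
    }

    public void characters(char[] ch, int start, int length)
    {
        // The parser may deliver one text node in several chunks.
        if (current != null)
            current.append(ch, start, length);
    }

    public void endElement(String uri, String localName, String qName)
    {
        if (qName.equals("Hsp_midline"))
        {
            midlines.add(current.toString());
            current = null;
        }
    }

    // Parses the given XML text and returns the midlines in document order.
    static List<String> collect(String xml)
    {
        try
        {
            MidlineCollector handler = new MidlineCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.midlines;
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        // Tiny stand-in for a real BLAST XML file.
        String xml = "<Hit_hsps>"
                + "<Hsp><Hsp_midline>NAR   T IAK</Hsp_midline></Hsp>"
                + "<Hsp><Hsp_midline>D +K</Hsp_midline></Hsp>"
                + "</Hit_hsps>";

        for (String midline : collect(xml))
            System.out.println("[" + midline + "]");
    }
}
```

The handler keeps state only while it is inside the element of interest,
so it can be extended to other tags (Hsp_identity, Hsp_gaps and so on) by
adding further name checks.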



From great_fred at yahoo.com  Tue Jun 28 08:59:30 2005
From: great_fred at yahoo.com (=?iso-8859-1?q?S=E9bastien=20PETIT?=)
Date: Tue Jun 28 08:50:52 2005
Subject: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?
In-Reply-To: <BF52B6AA9196D71195C80030052FDA9D46588C@TTZBN>
Message-ID: <20050628125931.9771.qmail@web32209.mail.mud.yahoo.com>

Thank you for JDOM and the code.
But it generates a ton of exceptions and errors, because it can't find
a DTD file (NCBI_BlastOutput.dtd) that I don't have.

So I don't know what to do.

Sebastien

From great_fred at yahoo.com  Tue Jun 28 10:49:34 2005
From: great_fred at yahoo.com (=?iso-8859-1?q?S=E9bastien=20PETIT?=)
Date: Tue Jun 28 10:40:49 2005
Subject: AW: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?
In-Reply-To: <BF52B6AA9196D71195C80030052FDA9D46588D@TTZBN>
Message-ID: <20050628144934.18013.qmail@web32205.mail.mud.yahoo.com>

I tried the code you sent me. I only changed the path of the XML file.
But the file contains this line:

<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN"
"NCBI_BlastOutput.dtd">

and I get exceptions and errors because of it.

If you want, I can send you the XML file so that you can test it.

I then downloaded the DTD and the MOD files it needs, modified the DTD
file a little, and now it works.
But I would prefer not to have to ship those files with my code.

Thank you.

Sebastien

--- "BIBIS, Garnier, Christophe" <cgarnier@ttz-Bremerhaven.de> wrote:

> Did you try just the code I sent you, or did you integrate it into
> your program?
> 
> As far as I know, JDOM works without DTD files: it does not check the
> structure of the file.
> It should work, because I tested it without the corresponding DTD
> file.
> 
> 
> Christophe
> 
> 


From gilson at cs.wisc.edu  Tue Jun 28 14:51:48 2005
From: gilson at cs.wisc.edu (Michael C Gilson)
Date: Tue Jun 28 14:43:06 2005
Subject: [Biojava-l] Implementing a Feature
Message-ID: <35E633ED-EE51-49BB-9C37-F778506CE5AD@cs.wisc.edu>

Hello, all.  I am new to BioJava but finding it extremely useful.   
I'd like to add a few extra fields or methods to the SimpleFeature  
class and I'm wondering the best way to go about it?  I have read  
through BioJava in Anger and also am wondering if there are any other  
documents out there that describe how to work with the API (beyond  
just the API javadocs).

Thanks in advance,
Michael C Gilson
Genome Evolution Lab
University of Wisconsin-Madison
From hollandr at gis.a-star.edu.sg  Tue Jun 28 15:03:28 2005
From: hollandr at gis.a-star.edu.sg (Richard HOLLAND)
Date: Tue Jun 28 14:55:47 2005
Subject: [Biojava-l] Implementing a Feature
Message-ID: <6D9E9B9DF347EF4385F6271C64FB8D5601E87174@BIONIC.biopolis.one-north.com>

The best thing to do is to write your own class, which either extends SimpleFeature or ignores SimpleFeature and implements the Feature interface from scratch, and add your new methods to it.

If the extra methods are pretty generic, you could create a new interface which extends Feature and add them there, then make your new class implement the new interface.
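
As a rough sketch of that second option: the code below uses two small stand-in types (FeatureStandIn and SimpleFeatureStandIn) in place of BioJava's Feature and SimpleFeature so that it compiles on its own. ScoredFeature, getScore and the other names are hypothetical, and the real BioJava types have more methods and different constructors.

```java
// Stand-in for BioJava's Feature interface, reduced to one method so that
// the sketch compiles on its own. The real interface has more methods.
interface FeatureStandIn
{
    String getType();
}

// Stand-in for BioJava's SimpleFeature.
class SimpleFeatureStandIn implements FeatureStandIn
{
    private final String type;

    SimpleFeatureStandIn(String type) { this.type = type; }

    public String getType() { return type; }
}

// New interface: everything a feature can do, plus the extra method.
interface ScoredFeature extends FeatureStandIn
{
    double getScore();
}

// New class: reuses the existing implementation and adds the extra field.
class SimpleScoredFeature extends SimpleFeatureStandIn implements ScoredFeature
{
    private final double score;

    SimpleScoredFeature(String type, double score)
    {
        super(type);
        this.score = score;
    }

    public double getScore() { return score; }
}

class ScoredFeatureDemo
{
    // Code written against the base interface still accepts the new class.
    static String describe(FeatureStandIn f)
    {
        if (f instanceof ScoredFeature)
            return f.getType() + " score=" + ((ScoredFeature) f).getScore();
        return f.getType();
    }

    public static void main(String[] args)
    {
        System.out.println(describe(new SimpleScoredFeature("CDS", 0.9)));
        System.out.println(describe(new SimpleFeatureStandIn("exon")));
    }
}
```

Declaring the extra method on an interface rather than only on the class means other implementations can offer it too, and callers can test for it with instanceof.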

Why not list the methods/fields you'd like to add? Then we can all offer suggestions as to the most appropriate places to make the changes.

The BioJava in Anger book has a few links describing how the API works. It's linked from biojava.org under the documentation section.

cheers,
Richard




From Russell.Smithies at agresearch.co.nz  Tue Jun 28 20:11:23 2005
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue Jun 28 20:02:52 2005
Subject: AW: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?
Message-ID: <D5DBA313349A4B458528BE63B387F36C936B11@imail.agresearch.co.nz>

The easiest method (if you don't care about validating) is to delete the DOCTYPE line from the XML.
If you do need to validate, make sure your proxy settings are set correctly so the parser can access the DTD.
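
A third possibility, if editing the XML is not an option and validation is not needed, is to tell the parser to ignore the DTD. The following is a minimal sketch using only the JDK's DOM parser; the class name NoDtdParse and its parse method are made up for illustration. JDOM's SAXBuilder also has a setEntityResolver method that takes the same kind of resolver, but that variant is not shown or tested here.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: parses XML that names an external DTD without
// needing the DTD file on disk or on the network.
class NoDtdParse
{
    // Answers every external entity request (including the DOCTYPE's DTD)
    // with an empty document instead of loading the real file.
    static final EntityResolver IGNORE_DTD = new EntityResolver()
    {
        public InputSource resolveEntity(String publicId, String systemId)
        {
            return new InputSource(new StringReader(""));
        }
    };

    static Document parse(String xml)
    {
        try
        {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(IGNORE_DTD);
            return builder.parse(new InputSource(new StringReader(xml)));
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        // Same DOCTYPE line as in the BLAST XML output, on a tiny document.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE BlastOutput PUBLIC \"-//NCBI//NCBI BlastOutput/EN\""
                + " \"NCBI_BlastOutput.dtd\">\n"
                + "<BlastOutput><BlastOutput_program>blastp"
                + "</BlastOutput_program></BlastOutput>";

        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Because the resolver returns an empty DTD, the document is read without any validation, which matches the "don't care about validating" case above.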

Russell 

-----Original Message-----
From: biojava-l-bounces@portal.open-bio.org [mailto:biojava-l-bounces@portal.open-bio.org] On Behalf Of Sébastien PETIT
Sent: Wednesday, 29 June 2005 2:50 a.m.
To: biojava-l@biojava.org
Subject: RE: AW: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?

I try the code you sent me. I just change the path of the XML file.
But, in this file, there is this line :

<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN"
"NCBI_BlastOutput.dtd">

and I have exceptions and errors because of this line.

If you want, I send the XML file so that you test it...

But, I download the DTD and the MOD files necessary, I modified the DTD
file a little bit, and it works...
But, I would prefer to not have those files with my code...

Thank you...

Sebastien

--- "BIBIS, Garnier, Christophe" <cgarnier@ttz-Bremerhaven.de> a ?crit
:

> Did you try just the code i sent you? Or did you integrate it inside
> your
> program?
> 
> As far as i know, jdom works without dtd files: it makes no control
> on the
> structure of the file
> It should word because I tested it without using the corresponding
> dtd file.
> 
> 
> christophe
> 
> 
> -----Urspr?ngliche Nachricht-----
> Von: S?bastien PETIT [mailto:great_fred@yahoo.com]
> Gesendet: Dienstag, 28. Juni 2005 15:00
> An: biojava-l@biojava.org
> Betreff: RE: AW: [Biojava-l] BLAST Parser for extracting all BLAST
> data?
> 
> 
> Thank you for JDOM and the code.
> But it generates a ton of exceptions and errors because it can't find
> a DTD file (NCBI_BlastOutput.dtd) that I don't have.
> 
> So I don't know what to do.
> 
> Sebastien
> 
> --- "BIBIS, Garnier, Christophe" <cgarnier@ttz-Bremerhaven.de> wrote:
> 
> > 
> > If you don't find what you need in BioJava, you can always write a
> > small XML parser with, for example, JDOM.
> > 
> > 1 - download jdom.jar
> > 2 - use the following code to find <Hsp_midline>:
> > 3 - replace the path of the xml file in the main method
> > 4 - it prints out every found Element
> > 
> > 
> > I hope it helps you
> > 
> > Best,
> > Christophe
> > 
> > +++++++++++++++++++++++++++++++++++++
> > 
> > import java.io.File;
> > import java.io.IOException;
> > import java.util.Iterator;
> > import java.util.List;
> > 
> > import org.jdom.Document;
> > import org.jdom.Element;
> > import org.jdom.JDOMException;
> > import org.jdom.input.SAXBuilder;
> > 
> > public class JDomParser
> > {
> > 
> > 	private static void parseResults(Element iterations)
> > 	{
> > 		System.out.println("*** parseResults ***") ;
> > 		
> > 		Element it = iterations.getChild("Iteration") ;
> > 		
> > 		List elts = it.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			System.out.println(child + " - " + child.getText() +
> > " - "
> > 					+ child.getName());
> > 
> > 			if ( child.getName().equals("Iteration_hits"))
> > 			{
> > 				parseHits(child) ;
> > 			}
> > 			
> > 			if ( child.getName().equals("Iteration_stat"))
> > 			{
> > 				parseStatistics(child) ;
> > 			}
> > 			
> > 		
> > 		}
> > 	}
> > 
> > 	private static void parseHits(Element element)
> > 	{
> > 		List elts = element.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 			
> > 			parseHit(child) ;
> > 			
> > 		}
> > 	}
> > 	
> > 	private static void parseHspHit(Element element)
> > 	{
> > 		Element hsp = element.getChild("Hsp") ;
> > 
> > 		List hsps = hsp.getChildren();
> > 		
> > 		Iterator iterator = hsps.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 		}
> > 	}
> > 	
> > 	private static void printElt(Element elt)
> > 	{
> > 		System.out.println("Element: [" + elt.getName() + "] - text:" + elt.getText() ) ;
> > 	}
> > 	
> > 	private static void parseHit(Element element)
> > 	{
> > 		List elts = element.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 			
> > 			if (child.getName().equals("Hit_hsps"))
> > 					{
> > 					parseHspHit(child) ;
> > 					}
> > 			
> > 		}
> > 	}
> > 	
> > 	
> > 	private static void parseStatistics(Element element)
> > 	{
> > 		Element stat = element.getChild("Statistics") ;
> > 		
> > 		List elts = stat.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 			
> > 		}
> > 		
> > 	}
> > 	
> > 	
> > 	public static void parseFile(File file) throws JDOMException,
> > IOException
> > 	{
> > 		SAXBuilder parser = new SAXBuilder();
> > 		Document doc = parser.build(file);
> > 
> > 		Element root = doc.getRootElement();
> > 
> > 		List elts = root.getChildren();
> > 		Iterator iterator = elts.iterator();
> > 
> > 		int index = 0;
> > 		while (iterator.hasNext())
> > 		{
> > 
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 
> > 			if
> > (child.getName().equals("BlastOutput_iterations"))
> > 				parseResults(child);
> > 
> 
=== message truncated ===



From mark.schreiber at novartis.com  Tue Jun 28 21:49:19 2005
From: mark.schreiber at novartis.com (mark.schreiber@novartis.com)
Date: Tue Jun 28 21:40:36 2005
Subject: [Biojava-l] BLAST functionality
Message-ID: <OF0F2EE59A.40F82533-ON4825702F.00095CA1-4825702F.000A02B3@EU.novartis.net>

There have been a number of requests to the list (and directly to me) for
increased functionality in the BLAST parsers (e.g. capturing more of the
information in the report). Originally the design was lightweight and
captured what most people wanted, but as always there are people who
think different (as Steve Jobs might say) and want different things.

The best way for the BLAST parsers to improve is for people to contribute
code. People have written lots of workarounds to improve the parsers
that have not found their way into BioJava. Ideally, I'm hoping
someone will volunteer to take a look at this and coordinate the effort.
The ideal person would be a reasonable Java programmer with a good feel
for how the BLAST part of the API works. They would also be someone who
uses it a lot and is therefore motivated to improve it. The BLAST API is
probably the most used part of BioJava, so instant fame and adulation await
the generous volunteer : )

I know you're out there somewhere...

- Mark

Mark Schreiber
Principal Scientist (Bioinformatics)

Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com

phone +65 6722 2973
fax  +65 6722 2910

From Gem.Yang at jhu.edu  Thu Jun 30 14:29:54 2005
From: Gem.Yang at jhu.edu (Gem Yang)
Date: Thu Jun 30 14:21:55 2005
Subject: [Biojava-l] memory leak while reading nr.fasta
In-Reply-To: <mailman.0.1120155070.18805.biojava-l@biojava.org>
Message-ID: <200506301830.j5UIUGe00711@storey.bme.jhu.edu>

Hi,

I am new to Biojava.  
I have the following program, which is copied from ReadFaster2 in the
cookbook.

public static void main(String[] args) {
	try {
		// args[0] is nr.fasta
	  BufferedReader br = new BufferedReader(new FileReader(args[0]));

	  String format = "FASTA";
	  String alphabet = "PROTEIN";

	  SequenceIterator iter =
	      (SequenceIterator)SeqIOTools.fileToBiojava(format, alphabet, br);

	  int count =0; 
	  long start = System.currentTimeMillis();
	  while(iter.hasNext())
	  {
	  		Sequence s = iter.nextSequence();
	  		String name = s.getName();
	  		
	  		//System.out.println(name);
	  		s.getAnnotation();
	  		//System.out.println(s.seqString());
	  		count ++;
	  		System.out.println(count);
	  		
	  }
	  long end = System.currentTimeMillis();
	  System.out.println("number of sequence " + count);
	  System.out.println("time used" + (end-start)/1000 + "seconds");
	  System.out.println((end-start)/1000/60 + "minutes");
	}
	catch (FileNotFoundException ex) {
	  //can't find file specified by args[0]
	  ex.printStackTrace();
	}catch (BioException ex) {
	  //error parsing requested format
	  ex.printStackTrace();
	}
  }

When running this code, I got an out-of-memory error after about half an hour,
with 1.5 GB of memory allocated. My workstation runs Windows XP with 2 GB of
memory. My BioJava version is 1.3. My JRE is the one that came with WebSphere
Application Developer.

Thanks.
Gem
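
One way to narrow this down is to run a baseline pass over the same file that does not build any Sequence objects. The sketch below counts FASTA records by header line using only the JDK. FastaCount and countRecords are hypothetical names, not BioJava API. It does not explain the memory growth reported above; it only gives a reference point for the file and JVM.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical diagnostic class, not part of BioJava.
public class FastaCount {

    // Counts FASTA records by counting header lines (those starting
    // with '>'). Only the current line is referenced at any time, so
    // earlier lines can be garbage collected as the loop advances.
    static long countRecords(Reader in) throws IOException {
        BufferedReader br = new BufferedReader(in);
        long count = 0;
        String line;
        while ((line = br.readLine()) != null) {
            if (line.startsWith(">")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up two-record sample; replace with a FileReader on nr.fasta.
        String sample = ">seq1 first\nMKV\nLLA\n>seq2 second\nMAA\n";
        System.out.println(countRecords(new StringReader(sample))); // prints 2
    }
}
```

If this pass completes on nr.fasta with the default heap, the growth is more likely in the parsing path than in the file or the JVM setup.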
From Gem.Yang at jhu.edu  Thu Jun 30 14:31:49 2005
From: Gem.Yang at jhu.edu (Gem Yang)
Date: Thu Jun 30 14:23:21 2005
Subject: [Biojava-l] RE: memory leak while reading nr.fasta
Message-ID: <200506301832.j5UIWAe00860@storey.bme.jhu.edu>

I have just a couple of typos in my previous post.

Sorry about that.

Gem

-----Original Message-----
From: Gem Yang [mailto:cyang27@bme.jhu.edu] 
Sent: Thursday, June 30, 2005 2:30 PM
To: 'biojava-l@biojava.org'
Subject: memory leak while reading nr.fasta

Hi,

I am new to Biojava.  
I have the following program, which is copied from ReadFaster2 in the
cookbook.

public static void main(String[] args) {
	try {
		// args[0] is nr.fasta
	  BufferedReader br = new BufferedReader(new FileReader(args[0]));

	  String format = "FASTA";
	  String alphabet = "PROTEIN";

	  SequenceIterator iter =
	      (SequenceIterator)SeqIOTools.fileToBiojava(format, alphabet, br);

	  int count =0; 
	  long start = System.currentTimeMillis();
	  while(iter.hasNext())
	  {
	  		Sequence s = iter.nextSequence();
	  		String name = s.getName();
	  		
	  		//System.out.println(name);
	  		s.getAnnotation();
	  		//System.out.println(s.seqString());
	  		count ++;
	  		System.out.println(count);
	  		
	  }
	  long end = System.currentTimeMillis();
	  System.out.println("number of sequence " + count);
	  System.out.println("time used" + (end-start)/1000 + "seconds");
	  System.out.println((end-start)/1000/60 + "minutes");
	}
	catch (FileNotFoundException ex) {
	  //can't find file specified by args[0]
	  ex.printStackTrace();
	}catch (BioException ex) {
	  //error parsing requested format
	  ex.printStackTrace();
	}
  }

When running this code, I got an out-of-memory error after about half an hour,
with 1.5 GB of memory allocated. My workstation runs Windows XP with 2 GB of
memory. My BioJava version is 1.3. My JRE is the one that came with WebSphere
Application Developer.

Thanks.
Gem