[Biojava-l] Different implementation of Sequence?
Y D Sun
Yudong.Sun at newcastle.ac.uk
Mon Jun 9 14:23:53 EDT 2003
Your results are very fast. I am now using the latest versions including
biojava-1.3pre4-jdk14.jar, etc. and PostgreSQL 7.3.3 on Linux 2.4.20.
It needs 30 seconds to run your filter on my system.
Cheers,
George
> -----Original Message-----
> From: Simon Foote [mailto:simon.foote at nrc-cnrc.gc.ca]
> Sent: 09 June 2003 13:02
> To: Y D Sun
> Cc: biojava-l at biojava.org
> Subject: Re: [Biojava-l] Different implementation of Sequence?
>
>
> Hi George,
>
> Here's the result of tests with my database searcher webapp against
> several genomes individually. Unfortunately, it's not a pure CDS search,
> as I recently modified the app to also return the DNA sequence for the
> feature, so that adds additional time depending upon the size of the
> genome.
>
> The search gene is dnaA, the search filter consists of:
>
> FeatureFilter ff1 = new FeatureFilter.ByType("CDS");
> FeatureFilter ff2 = new FeatureFilter.AnnotationContains("gene", geneId);
> FeatureFilter ff3 = new FeatureFilter.AnnotationContains("ibs_id", geneId);
> FeatureFilter ff5 = new FeatureFilter.AnnotationContains("gene_id", geneId);
> FeatureFilter ff4 = new FeatureFilter.Or(ff2, ff3);
> FeatureFilter ff6 = new FeatureFilter.Or(ff4, ff5);
> FeatureFilter ff7 = new FeatureFilter.And(ff1, ff6);
> FeatureHolder fh = seq.filter(ff7, false);
>
> Bacteria         Search Time
> E. coli K12      6 seconds
> H. influenzae    4 seconds
> C. jejuni        6 seconds
> H. pylori J99    6 seconds
>
> Note: Which version of the BioSQL schema are you using, and which
> version of BioJava?
>
> Regards,
> Simon
>
> Y D Sun wrote:
>
> >
> >
> >>-----Original Message-----
> >>From: Simon Foote [mailto:simon.foote at nrc-cnrc.gc.ca]
> >>Sent: 05 June 2003 12:59
> >>To: Y D Sun
> >>Cc: biojava-l at biojava.org
> >>Subject: Re: [Biojava-l] Different implementation of Sequence?
> >>
> >>
> >>Just to add my 2 cents worth.
> >>
> >>I'm using the latest version of the BioSQL schema within MySQL and the
> >>filters are quite fast. On a database containing 18 complete bacterial
> >>genomes, fetching a given gene by name, which uses a combination of 5
> >>filters in my case, takes approx. 1-2 seconds.
> >>
> >>
> >>
> >
> >Simon,
> >
> >Have you tried to filter all CDS sections of a complete bacterial
> >genome? In my experience with PostgreSQL, it takes only a few seconds
> >to filter a simple feature. However, it needs more than one minute to
> >filter thousands of CDS's in a genome.
> >
> >George
> >
> >
>
> --
> Bioinformatics Specialist
> Institute for Biological Sciences
> National Research Council of Canada
> [T] 613-990-0561 [F] 613-952-9092
> simon.foote at nrc-cnrc.gc.ca
>
>
>
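For readers without a BioSQL database to hand, the boolean logic of the filter quoted above (type is CDS, and any one of the "gene", "ibs_id" or "gene_id" annotations equals the gene id) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use BioJava, the names FilterSketch, Feat, byType, annotationIs, geneFilter and feat are all made up for the example, and the annotation check is simplified to an exact string match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class FilterSketch {

    /** Hypothetical stand-in for a sequence feature; NOT the BioJava Feature interface. */
    static final class Feat {
        final String type;
        final Map<String, String> annotation;

        Feat(String type, Map<String, String> annotation) {
            this.type = type;
            this.annotation = annotation;
        }
    }

    /** Matches features of the given type (the role ff1 plays in the quoted filter). */
    static Predicate<Feat> byType(String type) {
        return f -> type.equals(f.type);
    }

    /** Matches features whose annotation stores exactly this value under this key (simplified). */
    static Predicate<Feat> annotationIs(String key, String value) {
        return f -> value.equals(f.annotation.get(key));
    }

    /** CDS AND (gene OR ibs_id OR gene_id), mirroring how ff7 is built from ff1..ff6. */
    static Predicate<Feat> geneFilter(String geneId) {
        Predicate<Feat> anyId = annotationIs("gene", geneId)
                .or(annotationIs("ibs_id", geneId))
                .or(annotationIs("gene_id", geneId));
        return byType("CDS").and(anyId);
    }

    /** Non-recursive scan over a flat feature list. */
    static List<Feat> filter(List<Feat> feats, Predicate<Feat> p) {
        List<Feat> hits = new ArrayList<>();
        for (Feat f : feats) {
            if (p.test(f)) {
                hits.add(f);
            }
        }
        return hits;
    }

    /** Convenience factory for a feature carrying a single annotation entry. */
    static Feat feat(String type, String key, String value) {
        Map<String, String> ann = new HashMap<>();
        ann.put(key, value);
        return new Feat(type, ann);
    }

    public static void main(String[] args) {
        List<Feat> feats = Arrays.asList(
                feat("CDS", "gene", "dnaA"),
                feat("CDS", "ibs_id", "dnaA"),
                feat("gene", "gene", "dnaA"),   // right gene id, wrong feature type
                feat("CDS", "gene", "recA"));   // right type, wrong gene id

        long start = System.nanoTime();
        List<Feat> hits = filter(feats, geneFilter("dnaA"));
        long elapsedMicros = (System.nanoTime() - start) / 1000;

        System.out.println(hits.size() + " matching features in " + elapsedMicros + " us");
    }
}
```

An in-memory scan like this says nothing about the database-backed timings discussed in the thread; it only shows which features the combined filter is meant to accept.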