[Biojava-l] sequence dbs
Jason Stajich
jason@chg.mc.duke.edu
Tue, 15 May 2001 17:44:40 -0400 (EDT)
On Wed, 16 May 2001, Schreiber, Mark wrote:
> Hi -
>
> I very much favour the idea of having a remote SequenceDB rather than
> breaking the substantial amount of code that uses SequenceDB. I use
> SequenceDB all the time in my programs so I guess I am keen to not have to
> recode it all.
>
Agreed, I would really not want to do that. I think we can make this work
with the current interface plus a RemoteSequenceDB that throws appropriate
exceptions; it just won't be quite as clean as the OO junkies may like...
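
A minimal sketch of the RemoteSequenceDB option, for illustration only. It
does not use the real BioJava types: SimpleSequenceDB is a hypothetical,
cut-down stand-in for seq.db.SequenceDB, SequenceFetcher is a hypothetical
hook where the HTTP query would go, sequences are plain Strings, and
UnsupportedOperationException stands in for whatever exceptions the real
implementation would throw (for example ChangeVetoException on add/remove).

```java
import java.util.Iterator;
import java.util.Set;

// Hypothetical, cut-down stand-in for seq.db.SequenceDB so the sketch
// compiles on its own. Sequences are plain Strings here.
interface SimpleSequenceDB {
    String getSequence(String id);
    Set<String> ids();
    Iterator<String> sequenceIterator();
    void addSequence(String id, String residues);
    void removeSequence(String id);
}

// Hypothetical hook: a real implementation would send an HTTP query to
// the remote database and return null when nothing matches.
interface SequenceFetcher {
    String fetch(String accession);
}

// Keeps the existing interface; the methods that make no sense for a
// remote database fail loudly instead of being removed.
class RemoteSequenceDB implements SimpleSequenceDB {
    private final SequenceFetcher fetcher;

    RemoteSequenceDB(SequenceFetcher fetcher) {
        this.fetcher = fetcher;
    }

    public String getSequence(String id) {
        String residues = fetcher.fetch(id);
        if (residues == null) {
            throw new IllegalArgumentException("No sequence for id: " + id);
        }
        return residues;
    }

    // Enumerating a remote database is not possible.
    public Set<String> ids() {
        throw new UnsupportedOperationException("ids() not available for a remote DB");
    }

    public Iterator<String> sequenceIterator() {
        throw new UnsupportedOperationException("sequenceIterator() not available for a remote DB");
    }

    // A remote database is read-only from the client's point of view.
    public void addSequence(String id, String residues) {
        throw new UnsupportedOperationException("remote DB is read-only");
    }

    public void removeSequence(String id) {
        throw new UnsupportedOperationException("remote DB is read-only");
    }
}
```

Code written against the interface keeps compiling; callers that only fetch
by id are unaffected, and callers that iterate or modify find out at run time
rather than at compile time, which is the "not quite as clean" part.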
> As for parsing an unknown sequence type, a simple (and inefficient) way to do
> it would be to read the record once as a text (or XML) file to determine the
> correct alphabet, then parse it for real. I don't know if this can be done
> dynamically with the current biojava parsers. Maybe parsers based on a SAX
> event model would be the way to go?
>
> Mark
>
> Mark Schreiber
> Bioinformatics
> AgResearch Invermay
> PO Box 50034
> Mosgiel
> New Zealand
>
> PH: +64 3 489 9175
>
>
>
> > -----Original Message-----
> > From: Jason Stajich [mailto:jason@chg.mc.duke.edu]
> > Sent: Wednesday, May 16, 2001 9:19 AM
> > To: BioJava List
> > Subject: [Biojava-l] sequence dbs
> >
> >
> > I started to work on this at biojava bootcamp, but didn't get very far
> > because of the following:
> > seq.db.SequenceDB currently has the following methods that one cannot
> > implement for 'remote' databases.
> >
> > < Set ids();
> > < SequenceIterator sequenceIterator();
> >
> > < void addSequence(Sequence seq)
> > < throws IllegalIDException, BioException, ChangeVetoException;
> > < void removeSequence(String id)
> > < throws IllegalIDException, BioException, ChangeVetoException;
> >
> > I started to split these methods into separate interfaces:
> > LocalSequenceDB for ids() and sequenceIterator(), and UpdateableSequenceDB
> > for add/remove. This of course breaks all classes which depend on
> > SequenceDB. The other option is to create a RemoteSequenceDB which throws
> > VetoExceptions for add/remove calls and some other exception for
> > ids()/sequenceIterator().
> >
> > BTW: An example of a RemoteDB is web EMBL queries, which we will patch
> > through HTTP to extract a sequence from this database (will be talking to
> > Heikki's web script). Similarly, if the GenBank parsing works we can pass
> > queries to NCBI GenBank to query on an accession number.
> >
> > One other major issue is: what if we do not know what type of sequence we
> > are obtaining (prot or [dr]na)? Biojava likes to have these things
> > established in the parser, but I won't really be able to divine anything
> > from an accession number. Ideas?
> >
> > -jason
> >
> > Jason Stajich
> > jason@chg.mc.duke.edu
> > Center for Human Genetics
> > Duke University Medical Center
> > http://www.chg.duke.edu/
> >
>
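
On the unknown-sequence-type question quoted above, the "read it once to
determine the alphabet, then parse it for real" idea could look roughly like
the sketch below. Everything in it is an assumption made for illustration:
GuessedAlphabet and AlphabetGuesser are hypothetical names, not biojava
classes, and the 90% nucleotide-letter threshold is an arbitrary heuristic
that short or ambiguity-code-heavy records could easily fool.

```java
// Hypothetical result type; a real implementation would map this onto
// the appropriate biojava alphabet before the second (real) parse.
enum GuessedAlphabet { DNA, RNA, PROTEIN }

final class AlphabetGuesser {
    private AlphabetGuesser() {}

    // First pass: look only at the residue text and guess the alphabet.
    // Assumed heuristic: if at least 90% of the letters are A, C, G, T, U
    // or N, treat the record as nucleotide; otherwise treat it as protein.
    static GuessedAlphabet guess(String residues) {
        int letters = 0;
        int nucleotide = 0;
        int t = 0;
        int u = 0;
        for (int i = 0; i < residues.length(); i++) {
            char c = Character.toUpperCase(residues.charAt(i));
            if (!Character.isLetter(c)) {
                continue; // skip whitespace, digits, line numbering
            }
            letters++;
            switch (c) {
                case 'A': case 'C': case 'G': case 'N':
                    nucleotide++;
                    break;
                case 'T':
                    nucleotide++;
                    t++;
                    break;
                case 'U':
                    nucleotide++;
                    u++;
                    break;
                default:
                    break;
            }
        }
        if (letters == 0) {
            throw new IllegalArgumentException("no residues to inspect");
        }
        if (nucleotide < 0.9 * letters) {
            return GuessedAlphabet.PROTEIN;
        }
        // U without any T suggests RNA; everything else is called DNA.
        return (u > 0 && t == 0) ? GuessedAlphabet.RNA : GuessedAlphabet.DNA;
    }
}
```

The second pass would then hand the same record text to whichever parser
matches the guess. The cost is holding or re-reading the record, which is the
inefficiency mentioned in the quoted message.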
Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center
http://www.chg.duke.edu/