[Biojava-l] BlastLikeSearchBuilder and queryIDs
Keith James
kdj at sanger.ac.uk
Tue May 13 10:37:38 EDT 2003
>>>>> "Frank" == Frank Vernaillen <fr_ve at hotmail.com> writes:
Frank> Hello! I'm planning to use BioJava for parsing some
Frank> relatively large (multi-megabyte, flat-file) Blast result
Frank> files. My idea was to parse the data somewhat along the
Frank> lines of http://bioconf.otago.ac.nz/biojava/BlastParser.htm
Frank> and
Frank> http://bioconf.otago.ac.nz/biojava/ExtractSearchInformation.htm.
Frank> This way I end up with a Vector of
Frank> SeqSimilaritySearchResults. The SeqSimilaritySearchResult
Frank> interface offers a getQuerySequence() method, but it
Frank> returns a SymbolList, not a Sequence. Now this is a
Frank> problem, because I can't seem to obtain the *queryID* of
Frank> the sequence anymore, only the sequence symbols
Frank> themselves. Was this a deliberate design choice?
To be honest, I don't know: the original interface design predates
my involvement. I've generally been (over)cautious about changing
these interfaces. However, I can see that this is an issue. Perhaps
I can squeeze in a change before the release, i.e. make
getQuerySequence actually return a Sequence (as the name suggests)
rather than a SymbolList. (ASAP - this evening?)
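As a rough sketch of what the proposed change would mean for calling
code: the example below uses small stand-in interfaces (SymbolListLike,
SequenceLike) and a helper class (QuerySequenceSketch) that are all
made up for illustration. They are not the BioJava types. The only
thing borrowed from BioJava is the idea that a Sequence is a SymbolList
with a name attached.

```java
// Stand-ins for illustration only; these are NOT the BioJava interfaces.
interface SymbolListLike {
    String seqString();   // the symbols as text
}

interface SequenceLike extends SymbolListLike {
    String getName();     // the identifier that a bare symbol list lacks
}

final class QuerySequenceSketch {
    /** Hypothetical helper: a named sequence backed by plain strings. */
    static SequenceLike named(final String name, final String symbols) {
        return new SequenceLike() {
            public String getName()   { return name; }
            public String seqString() { return symbols; }
        };
    }

    /** With only a symbol list in hand, the query ID is out of reach. */
    static String describeSymbols(SymbolListLike query) {
        return query.seqString().length() + " symbols";
    }

    /** With a sequence, the caller can report the ID as well. */
    static String describeSequence(SequenceLike query) {
        return query.getName() + ": " + query.seqString().length() + " symbols";
    }

    public static void main(String[] args) {
        SequenceLike query = named("query1", "ACGTACGT");
        System.out.println(describeSymbols(query));
        System.out.println(describeSequence(query));
    }
}
```

Code written against the narrower return type keeps working after such
a change, since every sequence is still a symbol list; only callers
that want the ID need to use the wider type.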
Recently the org.biojava.bio.search interfaces have been made
Annotatable, and the Annotation object associated with each Result, Hit
and SubHit is used to capture all data sent by the SAX parser to the
result builder. These are all stored as String key-value pairs. I just
need to document which pairs are available for Blast. I'll check this
in with the above change if nobody objects.
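Until the keys are documented, a lookup along these lines might serve
as a stopgap. This is only a sketch: a plain java.util.Map stands in
for the Annotation attached to a Result, and the key names ("queryId",
"program") are hypothetical placeholders, not the keys the Blast
builder actually emits. The class AnnotationLookupSketch is likewise
made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class AnnotationLookupSketch {
    // Hypothetical key name; the real Blast keys are not documented here.
    static final String QUERY_ID_KEY = "queryId";

    /** Returns the value stored under key, or fallback if the key is absent. */
    static String lookup(Map<String, String> annotation, String key, String fallback) {
        String value = annotation.get(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        // Stand-in for the String key-value pairs captured for one Result.
        Map<String, String> resultAnnotation = new LinkedHashMap<>();
        resultAnnotation.put(QUERY_ID_KEY, "query1");
        resultAnnotation.put("program", "blastn");

        // Dump every pair: a quick way to discover which keys are present.
        for (Map.Entry<String, String> e : resultAnnotation.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }

        // Look up the query ID, falling back when the key is missing.
        System.out.println(lookup(resultAnnotation, QUERY_ID_KEY, "(unknown)"));
        System.out.println(lookup(resultAnnotation, "noSuchKey", "(unknown)"));
    }
}
```

Dumping all pairs first is the safer route while the key names are
undocumented, since it shows what the parser really sent rather than
relying on a guessed key.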
Keith
--
- Keith James <kdj at sanger.ac.uk> bioinformatics programming support -
- Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, UK -