[Bioperl-l] Bio::*Taxonomy* changes
Chris Fields
cjfields at uiuc.edu
Tue Jul 25 01:53:33 EDT 2006
So do we intend for everyone who installs bioperl to keep a local
copy of the taxonomy dumpfile, or else to perform a remote lookup via
Entrez? That seems a bit extreme.
I would like the option of not having the lookup run; as I mentioned
to Sendu, one of the biggest complaints about bioperl is speed.
Additional lookups won't help on that end.
Chris
On Jul 24, 2006, at 10:31 PM, Hilmar Lapp wrote:
>
> On Jul 24, 2006, at 10:29 PM, Chris Fields wrote:
>
>> [...]
>> We could go back and forth on what Jason really intended. [...] The
>> reality is he's not here and you're willing to do the job.
>
> Right. And, knowing Jason, I think he'd be perfectly fine with seeing
> his original idea develop in a possibly different direction, provided
> it will all work nicely in the end. I'm willing to take the beating
> on me if that doesn't turn out to be true ...
>
>>
>> There is one thing I will make perfectly clear here: there should
>> never, ever be enforced lookups for SeqIO (even using caches),
>
> You certainly don't want taxonomy lookups during the parsing stage,
> and also not for the client requesting properties of the species that
> have been parsed with high confidence, i.e., genus and species for a
> straightforward binomial like 'Homo sapiens'.
>
> Writing sequences, IMHO, doesn't have to be as fast. It may be better
> to emit strict format a bit slower rather than sloppy format a bit
> faster.
>
> Upon parsing, one idea could be for the flat file parser to set a
> dirty bit in the parsed-out species if the parsed text didn't follow
> strict binomial conventions. In that case the parser may have made a
> mistake, and if a client requests the information it is better to
> look up the correct values from a taxonomy database. I.e., you could
> try a strict regex first, where a match implies a high-confidence
> result. If that fails you don't give up but mark the result as
> untrustworthy.
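
For illustration only, here is a rough sketch of the dirty-bit idea
described above, written in Python for brevity rather than as BioPerl
code. Every name in it (ParsedSpecies, parse_organism, species_name,
the lookup callback) is hypothetical and not part of any existing
Bio::* API, and the regex is just one guess at what "strict binomial"
might mean. The lookup is optional and is never called at parse time.

```python
import re
from dataclasses import dataclass

# Assumed definition of a "strict binomial": capitalized genus, a single
# space, a lowercase species epithet, and nothing else on the line.
STRICT_BINOMIAL = re.compile(r"^([A-Z][a-z]+) ([a-z][a-z-]*)$")


@dataclass
class ParsedSpecies:
    genus: str
    species: str
    raw: str      # the organism text exactly as it was parsed
    dirty: bool   # True when genus/species are only a best guess


def parse_organism(text):
    """Parse an organism line without any taxonomy lookup."""
    raw = text.strip()
    match = STRICT_BINOMIAL.match(raw)
    if match:
        # High-confidence result: the text is a plain binomial.
        return ParsedSpecies(match.group(1), match.group(2), raw, dirty=False)
    # Don't give up: keep a naive split, but mark it as untrustworthy.
    parts = raw.split()
    genus = parts[0] if parts else ""
    return ParsedSpecies(genus, " ".join(parts[1:]), raw, dirty=True)


def species_name(parsed, lookup=None):
    """Return (genus, species).

    The hypothetical `lookup` callable is consulted only when the dirty
    bit is set and the caller actually supplied one, so a clean binomial
    never triggers a lookup and lookups stay opt-in.
    """
    if parsed.dirty and lookup is not None:
        found = lookup(parsed.raw)
        if found is not None:
            return found
    return (parsed.genus, parsed.species)
```

With this shape, 'Homo sapiens' comes back clean and never reaches the
lookup, while something like a virus strain name comes back with the
dirty bit set and only costs a lookup if the client both asks for the
name and has configured a lookup.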
>
>
>> [...]
>> This would have been MUCH easier if all three of us could have gone
>> to the local bar for a beer and discussed it. We should just take
>> the time out to videoconference next time.
>
> You're not honestly suggesting that a videoconference is better than
> having beer together?
>
> Enjoy your trip, and thanks for hanging in there in the discussion, I
> appreciate it.
>
> -hilmar
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign