[BioPython] blast parsing errors
Michiel Jan Laurens de Hoon
mdehoon at c2b2.columbia.edu
Mon Mar 5 16:49:53 UTC 2007
Julius Lucks wrote:
> 1.) Is the documentation for the new NCBIXML and NCBIWWW up to date?
No, it is not. To keep the documentation on the website in agreement
with the current Biopython release, the plan was to update the
documentation when the next Biopython release comes out. Originally we
were planning to make a new Biopython release as soon as the new
Bio.SeqIO code is done. However, I'd be happy to make a release in the
immediate future without the new Bio.SeqIO, and make another one once
Bio.SeqIO is ready.
> 2.) Why is NCBIXML.parse returning an iterator in this case since there
> is only one result? Or in other words, what are the use cases where an
> iterator is necessary?
An iterator is needed if you're parsing multiple BLAST search results at
the same time. In other words, if the FASTA file for the BLAST search
looked like:
> gene1
ATAGCTACG...
> gene2
ATCGATCGATGGCA...
> gene3
....
Such a file can be very large, which is why we are using an iterator
instead of a list.
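
To illustrate why an iterator suits this case, here is a minimal sketch in
plain Python. It does not use Biopython's actual parser: iter_fasta is a
hypothetical helper, and the sequences are made-up example data. It yields
one record at a time from a FASTA-style handle, so records are never all
held in memory at once, which is the same general idea as NCBIXML.parse
returning an iterator over BLAST results.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence) pairs one at a time from a FASTA-style handle.

    Hypothetical helper for illustration only; not part of Biopython.
    """
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if name is not None:
        yield name, "".join(chunks)


# Made-up example data standing in for a large multi-sequence query file.
example = io.StringIO(">gene1\nATAGCTACG\n>gene2\nATCGATCG\nATGGCA\n")

records = iter_fasta(example)
first = next(records)      # only the first record has been produced so far
remaining = list(records)  # consume the rest of the records
```

A list-based version would have to build every record before returning
anything, whereas here the caller can stop after the first record.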
Now, one may argue that NCBIXML.parse should return a single record
instead of an iterator if there's only one result. Others may argue that
for consistency, it should always return an iterator. Either way is fine
with me. Anybody have a strong opinion about this?
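
Either design can be bridged for callers who expect exactly one result.
The sketch below is illustrative only and is not an existing Biopython
function in this release: read_single is a hypothetical helper that takes
any iterator, such as the one returned by a parse function, and returns
its only item, raising ValueError if there are none or more than one.

```python
def read_single(records):
    """Return the only item from an iterable, or raise ValueError.

    Hypothetical helper for illustration; `records` can be any iterable.
    """
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No records found") from None
    try:
        next(iterator)
    except StopIteration:
        # Exactly one item was available, which is what the caller expects.
        return first
    raise ValueError("More than one record found")


# Made-up stand-in for a parser that yields a single result.
single = read_single(iter(["only_result"]))
```

With a wrapper like this, the parse function itself can always return an
iterator, and the single-result case stays a one-line call.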
> 3.) How are the fink packages of Biopython maintained?
I don't know. However, it's not too difficult to install Biopython from
the source distribution or from CVS. So if you want to be sure you have
the latest version, you might want to try installing from CVS.
--Michiel.
--
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1130 St Nicholas Avenue
New York, NY 10032