[BioPython] help with NCBIWWW parser
Edoardo Saccenti
saccenti at cerm.unifi.it
Fri Sep 15 13:23:37 UTC 2006
I realised it was indeed XML... my fault for not reading the
instructions more carefully.
Thanks a lot for the time you spent on this.
Edoardo
On Fri, 2006-09-15 at 10:14 +0100, Peter wrote:
> Edoardo Saccenti wrote:
> >> Hi Folks!
> >>
> >> I'm trying to parse the output of blast search done using the NCBIWWW
> >> qblast.
>
> Thanks for sending me the file, it looks like you have got an XML file
> back from the NCBI using NCBIWWW.qblast but you are trying to use the
> HTML parser to read it.
>
> qblast takes an optional format_type argument, which now defaults to
> "XML". You can also choose "HTML", "Text" or "ASN.1".
>
> If you have plain text output, try NCBIStandalone.BlastParser()
> If you have HTML output, try NCBIWWW.BlastParser()
> If you have XML output, try NCBIXML.BlastParser()
>
> In theory, using XML should be the most reliable as it is a file format
> designed for computers to read.
>
> The HTML output also contains lots of formatting to make it look pretty
> on a web browser - and also changes fairly often.
>
> The plain text output is fairly simple, but again the NCBI makes minor
> changes every so often (and their standalone tools produce a slightly
> different format to the web tools).
>
> I can read your XML file using:
>
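> from Bio.Blast import NCBIXML
> # Open the XML file saved from NCBIWWW.qblast
> blast_out = open("my_blast","r")
> b_parser = NCBIXML.BlastParser()
> # Parse the handle into a single BLAST record
> b_record = b_parser.parse(blast_out)
> blast_out.close()
> print(b_record.query)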
>
> I hope that helps,
>
> (If you are submitting multiple queries, then you will need to use an
> iterator... but that is another can of worms).
>
> Peter
>
>
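
The mix-up discussed above (XML output fed to the HTML parser) can be caught before any parser is chosen. Below is a minimal sketch using only the Python standard library, not Biopython itself. The helper names guess_blast_format and iter_query_defs are hypothetical, the format check is a rough heuristic on the first few hundred characters, and the element names Iteration and Iteration_query-def are assumed from NCBI's BLAST XML output. The second function illustrates the "iterator" idea for files holding several queries.

```python
import io
import xml.etree.ElementTree as ET

# Parser names as listed in the message above; this table only maps a
# guessed format to the name of the parser to try.
PARSER_FOR_FORMAT = {
    "XML": "NCBIXML.BlastParser",
    "HTML": "NCBIWWW.BlastParser",
    "Text": "NCBIStandalone.BlastParser",
}


def guess_blast_format(text):
    """Guess whether saved BLAST output is XML, HTML or plain text.

    Hypothetical helper: a heuristic that only looks at the start of
    the text, so unusual files may be misclassified.
    """
    head = text.lstrip()[:500].lower()
    if "<html" in head:
        return "HTML"
    if head.startswith("<?xml") or "<blastoutput" in head:
        return "XML"
    return "Text"


def iter_query_defs(handle):
    """Yield the query description for each <Iteration> in BLAST XML.

    `handle` is a binary file-like object. Element names are assumed
    from the BLAST XML format; a missing element yields None.
    """
    for _event, elem in ET.iterparse(handle, events=("end",)):
        if elem.tag == "Iteration":
            yield elem.findtext("Iteration_query-def")
            # Free the finished iteration so large files stay cheap.
            elem.clear()
```

For example, PARSER_FOR_FORMAT[guess_blast_format(open("my_blast").read())] gives the name of the parser to try first, and iter_query_defs(open("my_blast", "rb")) walks through the queries one at a time without loading the whole file.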