[Biopython] NCBIXML.parse

Mon May 20 10:53:01 UTC 2013

On Mon, May 20, 2013 at 4:26 AM, Mic <mictadlo at gmail.com> wrote:
> I am sorry, the XML file which I sent was created with one year old Blast
> library. When I run Blast with the following command and a new UniRef90
> library
>
> blastp -query X.aa.snap -db /db/uniprot/uniref90 -evalue 0.00001
> -max_target_seqs 15 -out x.blastp.xml -num_threads 6 -outfmt 5
>
> Please find attached the new XML file ...

Got it, and yes this does use a different ID style:

          <Hit_num>1</Hit_num>
          <Hit_id>UR090:UniRef90_Q9FX16</Hit_id>
          <Hit_def>F12G12.10 protein n=1 Tax=Arabidopsis thaliana RepID=Q9FX16_ARATH</Hit_def>
          <Hit_accession>UR090:UniRef90_Q9FX16</Hit_accession>
          <Hit_len>308</Hit_len>
          <Hit_hsps>

If you want to change that, I would review how the database was
created: did you make this BLAST database yourself with makeblastdb
(new) or formatdb (old), and if so, what identifiers did the input
FASTA file use?
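
As a quick way to check that last point, here is a minimal sketch that
lists the identifiers of the first few records in a FASTA file. The
function name fasta_ids and its limit argument are made up for this
example, and it assumes the identifier is the first word after ">":

```python
from itertools import islice


def fasta_ids(path, limit=5):
    """Return the identifiers of the first `limit` FASTA records."""

    def identifiers(handle):
        for line in handle:
            if line.startswith(">"):
                # The identifier is taken to be the first word after '>'
                words = line[1:].split()
                yield words[0] if words else ""

    with open(path) as handle:
        # Stop reading as soon as `limit` headers have been collected
        return list(islice(identifiers(handle), limit))
```

Comparing that output with the Hit_id values in the XML should show
whether the prefix comes from the FASTA file or from how the database
was built.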

It might be simpler to just handle the alternative identifier style
in your script.
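
For example, something along these lines might do. This is only a
sketch: the helper names uniref_accession and iter_hit_accessions are
made up for illustration, and it assumes you want the bare accession
(e.g. Q9FX16) and that any identifier without these prefixes can be
passed through unchanged:

```python
def uniref_accession(hit_id):
    """Return the bare accession from an ID like 'UR090:UniRef90_Q9FX16'."""
    # Drop an optional database prefix ending in a colon, e.g. 'UR090:'
    name = hit_id.rsplit(":", 1)[-1]
    # Drop an optional 'UniRefNN_' cluster prefix, e.g. 'UniRef90_'
    if name.startswith("UniRef") and "_" in name:
        name = name.split("_", 1)[1]
    return name


def iter_hit_accessions(xml_path):
    """Yield (query, accession) pairs from a BLAST XML file."""
    # Imported here so the helper above can be used without Biopython
    from Bio.Blast import NCBIXML

    with open(xml_path) as handle:
        for record in NCBIXML.parse(handle):
            for alignment in record.alignments:
                yield record.query, uniref_accession(alignment.hit_id)
```

You would call iter_hit_accessions("x.blastp.xml") and loop over the
pairs it yields; identifiers that do not match the UniRef pattern come
back as they are.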

> and it looks like a new schema has been created
> http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/dbfetch.databases?style=xml .
>
> Michal

That was a link to the EDAM ontology. I don't see how that is related
to the NCBI BLAST XML schema.

Thanks,

Peter