[Biopython] help with NCBIXML.parse

ferreirafm at usp.br ferreirafm at usp.br
Wed Mar 28 14:03:51 UTC 2012


Hi Peter,
Thanks for the answer.


Quoting Peter Cock <p.j.a.cock at googlemail.com>:

> You seem to be calling BLAST multiple times in a loop and
> trying to give it SeqRecord objects.

Yes, because I want only one hit per sequence. If someone has a
workaround for this, that would be great. If I run it with a multi-FASTA
file, I get several hits per sequence, like this:

P02977  emm1.22.pep     100.00  2  0  0  15  16  90   91   9.4    9.2
P02977  emm1.22.pep     100.00  2  0  0  14  15  104  105  9.4    9.2
P02977  emm1-2.3.pep    62.50   8  3  0  8   15  196  203  0.033  17.5
P02977  emm1.23.pep     62.50   8  3  0  8   15  196  203  0.033  17.5
P02977  emm1-2.4.pep    100.00  2  0  0  15  16  99   100  5.0    9.2
P02977  emm1.24.pep     100.00  2  0  0  15  16  88   89   7.5    9.2
P02977  emm1.24.pep     100.00  2  0  0  14  15  102  103  7.5    9.2
P02977  emm1.25.pep     100.00  2  0  0  15  16  81   82   4.3    9.2
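One possible way around the multiple hits, instead of calling BLAST once per sequence, might be to run it once and filter the tabular output afterwards. The sketch below is only an illustration: it assumes the standard 12-column tabular layout (query, subject, ..., e-value, bit score, as in the rows above), and the function name best_hit_per_pair is made up for this example.

```python
def best_hit_per_pair(lines):
    """Keep the highest-bit-score row for each (query, subject) pair.

    Assumes 12 whitespace-separated columns with the bit score last.
    """
    best = {}
    for line in lines:
        fields = line.split()
        if line.startswith("#") or len(fields) != 12:
            continue  # skip comment, blank, or malformed lines
        key = (fields[0], fields[1])  # (query id, subject id)
        bitscore = float(fields[11])
        # Strict ">" keeps the first row seen when bit scores tie.
        if key not in best or bitscore > best[key][0]:
            best[key] = (bitscore, fields)
    # Dicts preserve insertion order, so rows come back in first-seen order.
    return [fields for _, fields in best.values()]


rows = [
    "P02977 emm1.22.pep 100.00 2 0 0 15 16 90 91 9.4 9.2",
    "P02977 emm1.22.pep 100.00 2 0 0 14 15 104 105 9.4 9.2",
    "P02977 emm1-2.3.pep 62.50 8 3 0 8 15 196 203 0.033 17.5",
]
for fields in best_hit_per_pair(rows):
    print("\t".join(fields))
```

If "one hit per sequence" means one hit per query rather than per query/subject pair, the key could be changed to fields[0] alone.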




> It wants FASTA files,
> and you can call BLAST once with a single FASTA query
> file (containing multiple records) and a single database or
> FASTA subject file (also containing multiple records).
>
> As to the specific error, did you look at your blast_out.xml
> file and what it said on line 88?
>

Line 88 is a second "header" of the XML file. It seems the XML parser
can't handle it:

</BlastOutput><?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">
<BlastOutput>
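
A file like this holds several complete XML documents back to back, which a single XML parse cannot read as one document. A minimal sketch of one workaround, assuming every document starts with its own <?xml ...?> declaration as above, is to split the text at those declarations first. The helper name split_concatenated_xml is hypothetical and not part of Biopython.

```python
import re


def split_concatenated_xml(text):
    """Split text holding several XML documents into one string each.

    Assumes each document begins with its own <?xml ...?> declaration.
    """
    starts = [m.start() for m in re.finditer(r"<\?xml\s", text)]
    if not starts:
        # No declaration found: treat the whole text as one document.
        return [text.strip()] if text.strip() else []
    ends = starts[1:] + [len(text)]
    chunks = []
    for begin, end in zip(starts, ends):
        chunk = text[begin:end].strip()
        if chunk:
            chunks.append(chunk)
    return chunks


one_doc = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" '
    '"http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">\n'
    "<BlastOutput>\n</BlastOutput>"
)
parts = split_concatenated_xml(one_doc + one_doc)
print(len(parts))
```

Each chunk could then probably be wrapped in io.StringIO and passed to NCBIXML.read one at a time. Writing each BLAST run to its own output file, or running BLAST once on a multi-record FASTA query as suggested, would avoid the problem altogether.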


> Peter
>




