[Biopython] Converting from NCBIXML to SearchIO

Martin Mokrejs mmokrejs at fold.natur.cuni.cz
Thu Feb 13 20:38:34 UTC 2014


Hi,
   I am in the process of converting my code from NCBIXML to the new SearchIO XML parsing code written by Bow.
So far, I have worked out the following replacements (written roughly in sed(1) substitution format):


/hsp.identities/hsp.ident_num/
/hsp.score/hsp.bitscore_raw/
/hsp.expect/hsp.evalue/
/hsp.bits/hsp.bitscore/
/hsp.gaps/hsp.gap_num/
/hsp.positives/hsp.pos_num/
/hsp.sbjct_start/hsp.hit_start/
/hsp.sbjct_end/hsp.hit_end/
# hsp.query_start # no change from NCBIXML
# hsp.query_end # no change from NCBIXML
/record.query.split()[0]/record.id/
/alignment.hit_def.split(' ')[0]/alignment.id/
/record.alignments/record.hits/

/hsp.align_length/hsp.aln_span/ # I hope this counts the same thing as in NCBIXML (I don't remember whether the count includes the gap characters ('-') of the alignment or not)
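
A minimal sketch of how a few of the renames above could be applied mechanically to existing source code. It uses only the standard library `re` module; the names `RENAMES` and `rename_attributes` are hypothetical helpers made up for this illustration, and only a subset of the list is included.

```python
import re

# Subset of the renames listed above: old NCBIXML name -> SearchIO name.
RENAMES = {
    "hsp.identities": "hsp.ident_num",
    "hsp.expect": "hsp.evalue",
    "hsp.gaps": "hsp.gap_num",
    "hsp.positives": "hsp.pos_num",
    "hsp.sbjct_start": "hsp.hit_start",
    "hsp.sbjct_end": "hsp.hit_end",
    "hsp.align_length": "hsp.aln_span",
    "record.alignments": "record.hits",
}

# Try longer names first, and refuse to match inside a longer identifier
# (for example "my_hsp.gaps" or "hsp.gaps_total" are left untouched).
_ALTERNATIVES = "|".join(
    re.escape(old) for old in sorted(RENAMES, key=len, reverse=True)
)
_PATTERN = re.compile(r"(?<![\w.])(" + _ALTERNATIVES + r")(?!\w)")


def rename_attributes(source: str) -> str:
    """Return source with the old attribute names replaced by the new ones."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


if __name__ == "__main__":
    print(rename_attributes("if hsp.expect < 1e-5 and hsp.gaps == 0:"))
```

A textual rename only changes the names. As far as I know, SearchIO reports start coordinates in Python-style zero-based form, so values read from the start attributes may still differ by one from what NCBIXML returned and are worth checking by hand.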




Now I am uncertain. There used to be hsp.sbjct_length and alignment.length. I think the former included the minus signs used for gaps, while the latter is just the real length of the query sequence.

Nevertheless, what did alignment.length turn into? len(hsp.query_all)? I don't think it is hsp.query_span, but who knows. ;)
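
To make the two kinds of "length" in question concrete, here is a small sketch on plain strings (not Biopython objects). The helper names `aligned_span` and `ungapped_length` and the two example rows are invented for this illustration. To my knowledge SearchIO exposes the full length of the hit sequence as hit.seq_len, which would be the closest counterpart to alignment.length, but that is an assumption and is not confirmed anywhere in this message.

```python
def aligned_span(aligned_row: str) -> int:
    """Number of alignment columns, gap characters included."""
    return len(aligned_row)


def ungapped_length(aligned_row: str, gap: str = "-") -> int:
    """Number of residues in the row, gap characters excluded."""
    return len(aligned_row) - aligned_row.count(gap)


# Made-up aligned rows of equal width, for illustration only.
query_row = "ACGT--ACGT"
hit_row = "ACGTTTAC-T"

if __name__ == "__main__":
    print(aligned_span(query_row), ungapped_length(query_row), ungapped_length(hit_row))
```

Neither number is the length of the complete sequence: both describe only the aligned region, with or without gaps.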



Meanwhile, I see that my biopython-1.62 doesn't understand hsp.gap_num; it looks like that was added to SearchIO in 1.63. So that's all from me for now, until I upgrade. ;)
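
Until the upgrade, code that has to run against both versions could read the attribute defensively. This is only a sketch: `get_gap_count` is a hypothetical helper, and the `SimpleNamespace` objects stand in for HSP objects from an older and a newer release rather than being real SearchIO objects.

```python
from types import SimpleNamespace


def get_gap_count(hsp, default=None):
    """Return hsp.gap_num when the attribute exists, otherwise default."""
    return getattr(hsp, "gap_num", default)


# Stand-ins for an HSP without and with the gap_num attribute.
old_style_hsp = SimpleNamespace(ident_num=90)
new_style_hsp = SimpleNamespace(ident_num=90, gap_num=2)

if __name__ == "__main__":
    print(get_gap_count(old_style_hsp), get_gap_count(new_style_hsp))
```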


Thank you,
Martin


