[Bioperl-l] BioPerl parse interproscan xml not working
blpapery at gmail.com
Wed Nov 6 10:35:42 EST 2013
Hi all,
I have been trying to use Bio::SeqIO to parse an InterProScan 5 XML result
(the file declares XML version 1.0, which is what InterProScan outputs),
but I keep getting the following error:
no element found at line 24, column 0, byte 1421 at /System/Library/Perl/Extras/5.10.0/darwin-thread-multi-2level/XML/Parser.pm line 187
My code is below:
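use strict;
use warnings;
use Bio::SeqIO;

# Open the InterProScan XML file with the "interpro" SeqIO parser
my $io = Bio::SeqIO->new(-format => "interpro", -file => "ipr.xml");
while (my $seq = $io->next_seq) {
    print $seq->accession, "\n"; # trying to print out anything here
}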
The XML file is shown below:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<protein-matches xmlns="http://www.ebi.ac.uk/interpro/resources/schemas/interproscan5">
  <protein>
    <sequence md5="d95d12290aaa87a91f47d25299cfb6ce">MKYKHLILSLSLIMLGPLAHAEEIGSVDTVFKMIGPDHKIVVEAFDDPDVKNVTCYVSRAKTGGIKGGLGLAEDTSDAAISCQQVGPIELSDRIKNGKAQGEVVFKKRTSLVFKSLQVVRFY
DAKRNALAYLAYSDKVVEGSPKNAISAVPVMPWRQ</sequence>
    <xref id="ecoli_3"/>
    <matches>
      <hmmer3-match evalue="1.0E-57" score="193.0">
        <signature ac="PF05981" desc="CreA protein" name="CreA">
          <entry ac="IPR010292" desc="Uncharacterised protein family CreA" name="Uncharacterised_CreA" type="FAMILY"/>
          <models>
            <model ac="PF05981" desc="CreA protein" name="CreA"/>
          </models>
          <signature-library-release library="PFAM" version="27.0"/>
        </signature>
        <locations>
          <hmmer3-location env-end="157" env-start="24" score="192.8" evalue="1.2E-57" hmm-start="1" hmm-end="128" hmm-length="0" start="24" end="156"/>
        </locations>
      </hmmer3-match>
    </matches>
  </protein>
</protein-matches>
Thanks in advance for your help.
Ben