[Bioperl-l] Should Bio::SeqIO::interpro 'use XML::DOM::XPath'?

Mark Johnson johnsonm at gmail.com
Wed Jul 18 20:53:00 UTC 2007


I mean the output from InterProScan itself, invoked thusly:

iprscan -cli -seqtype p -i input_file -o output_file -format xml
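
For anyone stuck on a BioPerl that predates the fix mentioned in the
quoted thread below, here is a minimal sketch of the workaround: load
XML::DOM::XPath yourself before using the parser. The helper name
seq_ids_from_iprscan is made up for illustration, and I am assuming the
report parses cleanly once findnodes() is available.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Loading XML::DOM::XPath adds findnodes() to XML::DOM nodes; without it,
# an unpatched Bio::SeqIO::interpro dies with the "findnodes" error.
use XML::DOM::XPath;
use Bio::SeqIO;

# Hypothetical helper (not part of BioPerl): returns the display IDs of
# every sequence that Bio::SeqIO::interpro yields for one report file.
sub seq_ids_from_iprscan {
    my ($iprscan_file) = @_;
    my $seqio = Bio::SeqIO->new(-file   => $iprscan_file,
                                -format => 'interpro');
    my @ids;
    while (my $seq = $seqio->next_seq()) {
        push @ids, $seq->display_id;
    }
    return @ids;
}

# Pass the file written by "iprscan ... -o output_file" as the argument.
if (@ARGV) {
    print "$_\n" for seq_ids_from_iprscan($ARGV[0]);
}
```

Once the parser itself has 'use XML::DOM::XPath', the explicit line in
the calling script is redundant but harmless.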

On 7/18/07, Emmanuel Quevillon <tuco at pasteur.fr> wrote:
> Hi guys,
>
> I read your email and wondered which iprscan file you've
> been talking about. Is it the file produced by InterProScan,
> or the file called match.xml, which represents the whole UniProt
> database matched against InterPro? Reading the XML parser
> implemented in Bio::SeqIO::interpro, I guess it is the
> second one?
> In that case, I just want to let you know that the XML
> schema has changed, and so has the file name. It is now called
> match_complete.xml.
> I attached the DTD to be able to see the new structure.
> Here is an example of the new data representation.
>
>
> <protein id="A0A000" name="A0A000_9ACTO" length="394"
>          crc64="F1DD0C1042811B48">
>      <match id="G3DSA:3.40.640.10"
>             name="PyrdxlP-dep_Trfase_major_sub1" dbname="GENE3D"
>             status="T" evd="HMMPfam">
>        <ipr id="IPR015421"
>             name="Pyridoxal phosphate-dependent transferase, major region, subdomain 1"
>             type="Domain" />
>        <lcn start="52" end="288" score="4.30000170645879E-75" />
>      </match>
>      <match id="PTHR13693:SF7" name="PTHR13693:SF7"
>             dbname="PANTHER" status="T" evd="not_rel">
>        <lcn start="33" end="389" score="0.0" />
>      </match>
> </protein>
>
> As you can see, sometimes there is no InterPro info (no ipr
> element).
>
> I think it would be good to update the interpro parser as well?
>
> Regards
>
> Emmanuel
>
> Chris Fields wrote:
> > On Jul 17, 2007, at 1:23 PM, Mark Johnson wrote:
> >
> >> I'm tinkering with parsing iprscan reports with BioPerl.  I noticed
> >> that this:
> >>
> >>   my $seqio = Bio::SeqIO->new(-file   => $iprscan_file,
> >>                               -format => 'interpro');
> >>
> >>   while (my $seq = $seqio->next_seq()) {
> >>       ...
> >>   }
> >>
> >> Does not work unless I first 'use XML::DOM::XPath'.  I get this error:
> >>
> >>   Can't locate object method "findnodes" via package
> >> "XML::DOM::Document" at
> >> bioperl-cvs/bioperl-live//Bio/SeqIO/interpro.pm line 136, <GEN0> line
> >> 30.
> >>
> >> I see that Bio::SeqIO has 'use XML::DOM', but that doesn't seem to
> >> suck in XML::DOM::XPath.  I see that t/interpro.t requires
> >> XML::DOM::XPath:
> >>
> >> test_begin(-tests => 17,
> >>                 -requires_module => 'XML::DOM::XPath');
> >>
> >> I suppose the reason the test requires XML::DOM::XPath is so
> >> that tests can be skipped if XML::DOM::XPath is not available.
> >> Shouldn't Bio::SeqIO::interpro 'use XML::DOM::XPath', though?
> >
> > You're right; I think tests passed b/c XML::DOM::XPath (if present),
> > was eval'd as a required module.  When I commented out the spot where
> > it is eval'd in the test suite I can replicate this error.  I have
> > added 'use XML::DOM::XPath' to SeqIO::interpro now in CVS and it
> > passes fine.
> >
> > Thanks for the heads up!
> >
> > chris
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


