[Bioperl-pipeline] some discussion on xml dtd

kiran kiran at fugu-sg.org
Thu Feb 27 17:30:53 EST 2003


Hi,

We need to discuss freezing the XML DTD (a long-pending item) for the
release, and updating the XMLImporter accordingly.
There are currently two different flavours of XML files in the pipeline:
one follows the old DTD that I started with, and the other follows the
version that jugang modified.

The difference is whether most of the structure is specified as tags or
as attributes (jugang's version).

For example :

Option 1: specifying hierarchy as tags
<analysis>
         <runnable>Bio::Pipeline::Runnable::Blast</runnable>
</analysis>

Option 2 : as attributes
<analysis runnable="Bio::Pipeline::Runnable::Blast"></analysis>
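
Until the DTD is frozen, an importer has to cope with both flavours. The
sketch below is only an illustration in Python using the standard
xml.etree.ElementTree module; it is not the XMLImporter code, and the
helper name get_runnable is hypothetical. It checks for the attribute
form first and falls back to the child tag.

```python
import xml.etree.ElementTree as ET

# Option 1: hierarchy expressed as tags
TAG_STYLE = """<analysis>
    <runnable>Bio::Pipeline::Runnable::Blast</runnable>
</analysis>"""

# Option 2: the same information expressed as an attribute
ATTR_STYLE = '<analysis runnable="Bio::Pipeline::Runnable::Blast"></analysis>'


def get_runnable(xml_text):
    """Return the runnable name from either flavour (hypothetical helper)."""
    analysis = ET.fromstring(xml_text)
    # Attribute form: <analysis runnable="...">
    value = analysis.get("runnable")
    if value is None:
        # Tag form: <analysis><runnable>...</runnable></analysis>
        value = analysis.findtext("runnable")
    return value.strip() if value else None


runnable_from_tags = get_runnable(TAG_STYLE)
runnable_from_attrs = get_runnable(ATTR_STYLE)
```

Supporting both forms in one reader is what freezing the DTD would let us
stop doing.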

From my knowledge, specifying structure as tags allows for extensibility
without breaking the hierarchy. This is particularly useful when sharing
XML files with other applications, since another application will almost
certainly need to embed its own application-specific tags in the
hierarchy.
For example, an automatic pipeline documentation generator that takes in
the pipeline XML file may need a tag like <runnable_description> or
<runnable_inputs>, which could be added inside the hierarchy if tags are
used, as follows.

<analysis>
     <runnable>Bio::Pipeline::Runnable::Blast
           <runnable_description>A runnable for blast ..</runnable_description>
     </runnable>
</analysis>

The same file would pass through both the pipeline and the documentation
generator, as each of them would extract only what it needs from the
hierarchy. If you had used attributes, you would instead add another
attribute, as in
<analysis runnable="Bio::Pipeline::Runnable::Blast" runnable_description="a
runnable for blast .."></analysis>, which is flat and gives no hierarchy
or logical grouping.
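
To make the "each extracts what it needs" point concrete, here is a rough
Python sketch using xml.etree.ElementTree. The two functions are
hypothetical stand-ins for the pipeline importer and a documentation
generator, not existing code. Both read the same tag-style file, and the
pipeline side simply ignores the extra nested tag.

```python
import xml.etree.ElementTree as ET

# Tag-style file carrying an extra, documentation-only tag
PIPELINE_XML = """<analysis>
    <runnable>Bio::Pipeline::Runnable::Blast
        <runnable_description>A runnable for blast</runnable_description>
    </runnable>
</analysis>"""


def runnable_for_pipeline(analysis):
    """What a pipeline importer might read: only the runnable name."""
    runnable = analysis.find("runnable")
    # .text holds the text before the first child element, so the
    # nested <runnable_description> does not interfere.
    return (runnable.text or "").strip()


def description_for_docs(analysis):
    """What a documentation generator might read: only the description."""
    text = analysis.findtext("runnable/runnable_description")
    return (text or "").strip()


analysis = ET.fromstring(PIPELINE_XML)
pipeline_view = runnable_for_pipeline(analysis)
docs_view = description_for_docs(analysis)
```

Note that this relies on mixed content (text plus a child element inside
<runnable>), which the DTD would have to allow explicitly.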

As a crude analogy, attributes are like a hash data structure and tags are
like objects. Designing the XML structure is therefore somewhat akin to
object-oriented design, and is likewise something that evolves.
I feel that until the protocol specification matures and the pipeline is
stable enough, we should keep most of the specification in tags to allow
for extensibility, and then review and change it if needed.

Do I understand this correctly, or am I missing something?

Comments, inputs and suggestions are most needed.

Regards,
Kiran



