Sunday, 26 November 2006

Norman Walsh's Review of the 1st XProc draft


I have been reviewing some of the XProc information and draft specs, and came across Norman Walsh's overview of the first XProc draft.

First, I love the quote he uses.

Progress isn't made by early risers. It's made by lazy men trying to find easier ways to do something. - Robert Heinlein

Awesome; that pretty much sums up everything. When I talk about pipelines, most people ask me why a developer would use a higher-level pipeline language over simply writing the code. I always point to laziness and the time constraints put on people. It is going to be, or should be, simpler to write a pipeline than to code the whole lot.

Norman uses this example to prove the point.

<p:pipeline xmlns:p="http://www.w3.org/2006/09/xproc"
            name="pipeline">

<p:declare-input port="document"/>
<p:declare-input port="schema"/>
<p:declare-input port="stylesheet"/>
<p:declare-output port="result" step="transform" source="result"/>

<p:step name="xinclude" type="xinclude">
  <p:input port="document" step="pipeline" source="document"/>
</p:step>

<p:step name="validate" type="validate">
  <p:input port="document" step="xinclude" source="result"/>
  <p:input port="schema" step="pipeline" source="schema"/>
</p:step>

<p:step name="transform" type="xslt">
  <p:input port="document" step="validate" source="result"/>
  <p:input port="stylesheet" step="pipeline" source="stylesheet"/>
</p:step>

</p:pipeline>

Can you tell me what that pipeline does? I bet you can, without even reading the specification: it performs XInclude processing on a document, validates it against a schema, transforms it with XSLT, and returns the result.
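
To make the wiring in that example concrete, here is a rough sketch of my own (not from Norman's post, and not an XProc implementation) of how a runner might connect each step's inputs to an earlier step's output port. The step names mirror the XML, but the step bodies, the PIPELINE table and run_pipeline are all hypothetical stand-ins: no real XInclude, schema validation or XSLT happens here.

```python
# Toy pipeline runner. Step names mirror the XProc example above; the step
# bodies are hypothetical stand-ins, not real XInclude/validation/XSLT.

def xinclude(document):
    # Stand-in: pretend to expand an inclusion marker.
    return {"result": document.replace("<include/>", "<included/>")}

def validate(document, schema):
    # Stand-in: "validation" only checks that the expected root tag appears.
    if schema not in document:
        raise ValueError("document does not match schema")
    return {"result": document}

def transform(document, stylesheet):
    # Stand-in: the "stylesheet" is just an (old, new) pair of strings.
    old, new = stylesheet
    return {"result": document.replace(old, new)}

# Declarative description: each step says where its inputs come from as
# (step name, port name) pairs, like step="..." source="..." in the XML.
PIPELINE = [
    ("xinclude", xinclude, {"document": ("pipeline", "document")}),
    ("validate", validate, {"document": ("xinclude", "result"),
                            "schema": ("pipeline", "schema")}),
    ("transform", transform, {"document": ("validate", "result"),
                              "stylesheet": ("pipeline", "stylesheet")}),
]
OUTPUT = ("transform", "result")

def run_pipeline(steps, output, **inputs):
    # Ports are stored per step; the pipeline's own inputs live under "pipeline".
    ports = {"pipeline": inputs}
    for name, func, bindings in steps:
        args = {port: ports[src_step][src_port]
                for port, (src_step, src_port) in bindings.items()}
        ports[name] = func(**args)
    step, port = output
    return ports[step][port]

result = run_pipeline(
    PIPELINE, OUTPUT,
    document="<doc><include/></doc>",
    schema="<doc>",
    stylesheet=("doc", "html"),
)
print(result)  # <html><included/></html>
```

The point of the sketch is that changing the behaviour means editing the PIPELINE table, not the runner, which is roughly the appeal of describing the flow in XML instead of code.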

Norman Walsh also gives a few examples comparing Java with XProc. He concludes with this:

I think from an ease-of-use and programmer productivity point of view, pipelines are an obvious win: they're more declarative, they allow application behavior to be modified (within limits) without touching a line of code, and they potentially have much better performance. That last point is probably worth a little exploration. There are at least two ways in which using pipelines can lead to improved performance. One is a generality, if you're coding up your processing directly in Java (or C or Ruby or whatever), then optimization is your problem. If lots of folks are using pipelines then it makes sense to invest resources in improving the performance of the code that implements them. Making that code perform better automatically helps you (and everyone using pipelines). A less immediately obvious benefit arises from the fact that XProc has a fairly large vocabulary of built in steps. They aren't all spelled out in the current draft, but will eventually include steps to add, rename, and delete elements and attributes; process regions of a document based on XPath expressions, combine documents, split documents, extract content, inject content, etc.

A good argument for the sharing, non-proprietary notion which I'm calling for.

Of course the important word way back up there in the paragraph before the pipeline example was “standard”. None of this work is a real win for users unless they can reasonably expect interoperable implementations. I want (desperately sometimes) to be able to distribute pipelines with the same ease that I now distribute XSLT stylesheets.

Posted by ianforrester at 10:28 PM in The meme, idea, or blueprint for a way ahead