toolbox: webdev unicorns: components (3/4)

The web is a big place, and being a part of it increases the value of both your application and the network it interacts with. This creates a need to interoperate with other applications on the web, and meeting that need actually helps create a better end-user experience. What this boils down to is a need for components that help consume and create webstyle software. Here, I’ll focus mainly on XML tools for Python.

I’ve been trying to imagine what applications would look like when they are really “scaled out”. For instance, is it possible to overlay an object graph over the web? A useful construct in programming is the idea of references and pointers. Can URIs be pointers and resources be instances? Linked Data may be the way to understand this paradigm. Are there any tools available that allow you to dereference a URI as if you were dereferencing a pointer in memory?

Patterns for Exchange

If webstyle software is about exchanging messages to facilitate scaling out, what are the idioms that enable these capabilities? What format should the messages conform to? What tools facilitate manipulating these messages? Ultimately, consumption patterns are up to the market. If the web is a market, then the most successful style is RESTful architecture, and the most successful format has been HTML. However, HTML isn’t very good at describing data, while XML is. XML has problems of its own, but there are good tools for manipulating it. RDF may provide solutions, but I don’t see many tools readily available to do the kinds of things I’m about to describe. Therefore, for now I’m primarily concerned with POX (plain old XML).

Parsing XML

The amara toolkit will turn an XML document into a native Python object. It is extremely natural to use, and it allows subclassing the base element type. This makes it easy to mix methods into various places in your application’s data structure. Python also provides facilities to re-route parts of the object tree, using __getattr__ and friends. Using __getattr__ means you can create functions that lazily dereference nodes, from a database or over the web, as necessary.
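
To make the __getattr__ idea concrete, here is a minimal sketch. It does not use amara; it wraps the standard library's ElementTree instead, and everything in it (the LazyNode class, the href convention, the fake_fetch function and the example.org URI) is a made-up assumption for illustration. An empty child element carrying an href attribute is treated as a pointer and is only fetched the first time it is accessed.

```python
import xml.etree.ElementTree as ET


class LazyNode:
    """Expose child elements as attributes; follow href links on first access.

    Hypothetical wrapper for illustration, not amara's API.
    """

    def __init__(self, element, fetch):
        self._element = element
        self._fetch = fetch   # assumed callable: URI -> XML string
        self._cache = {}      # child name -> already-built LazyNode

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._cache:
            return self._cache[name]
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        href = child.get("href")
        if href is not None and len(child) == 0:
            # An empty element with an href acts as a pointer: dereference it.
            child = ET.fromstring(self._fetch(href))
        node = LazyNode(child, self._fetch)
        self._cache[name] = node
        return node

    def __str__(self):
        return (self._element.text or "").strip()


# Stand-in for the web, so the sketch runs without a network connection.
FAKE_WEB = {"http://example.org/authors/1": "<author><name>Ada</name></author>"}
calls = []


def fake_fetch(uri):
    calls.append(uri)
    return FAKE_WEB[uri]


doc = ET.fromstring(
    '<post><title>Components</title>'
    '<author href="http://example.org/authors/1"/></post>'
)
post = LazyNode(doc, fake_fetch)
print(post.title)        # plain child element, no fetch needed
print(post.author.name)  # triggers exactly one fetch of the author resource
```

Swapping fake_fetch for a function that performs a real HTTP GET or a database lookup would be the next step; the cache keeps repeated access to the same child from fetching twice.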

Validating XML

The amara toolkit is based on the 4suite library, which also provides a number of other XML tools, including an XSLT implementation. I haven’t compared the performance of its XSLT implementation to that of any competitors. Are there any I should know about? Since amara can effectively provide our application with the models needed for business logic, how do we ensure that the API is what we expect it to be? That’s where XSLT comes in. Schematron is an XML technology that lets you a) define XML schemas and b) validate XML documents against them. Basically, a schematron document defines a set of rules that assert the validity of an XML document. These rules are compiled into an XSL document, which can then transform an XML document into a report describing its validity. Not only does the report conclude whether or not the document is valid, it also provides a rich analysis of where the document contains errors and which features it exhibits.
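
The rule-and-report idea can be sketched in a few lines of plain Python. To be clear, this is not Schematron and involves no XSLT; it is a simplified stand-in that borrows only the shape of the approach (a context, an assertion, and a message that ends up in a report). The RULES list, the validate function and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Each hypothetical rule: (context tag, assertion on the element, message).
RULES = [
    ("post",
     lambda e: bool((e.findtext("title") or "").strip()),
     "a post needs a non-empty title"),
    ("author",
     lambda e: e.get("href") is not None or e.find("name") is not None,
     "an author needs an href or a name"),
]


def validate(root, rules):
    """Return a report: one (tag, message) entry per failed assertion."""
    report = []
    for tag, test, message in rules:
        # iter() visits the root itself as well as all descendants.
        for node in root.iter(tag):
            if not test(node):
                report.append((tag, message))
    return report


good = ET.fromstring(
    '<post><title>Components</title>'
    '<author href="http://example.org/authors/1"/></post>'
)
bad = ET.fromstring("<post><title> </title><author/></post>")

print(validate(good, RULES))  # empty report: the document passes
print(validate(bad, RULES))   # one entry per broken rule
```

A real schematron setup expresses the assertions as XPath inside an XML document rather than as Python lambdas, which is what lets the rules be compiled to XSL and shared across languages.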

Generating XML

Now that we have ways to produce APIs from XML documents, it would be nice to have a convenient way to generate XML. Most XML tools, such as DOM, or even my beloved Amara, are kind of clumsy at this task, but Stan may be the solution. I haven’t actually tested any code yet, so I can’t say for sure. Stan looks like Mochikit’s DOM helpers, where you can create an element simply by calling a function with the name of the element.
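
Here is a rough sketch of what "calling a function with the name of the element" could look like. It is not Stan's actual API, which I haven't tried; it is a small hypothetical Tags helper built on the standard library's ElementTree, where positional arguments become children or text and keyword arguments become attributes.

```python
import xml.etree.ElementTree as ET


class Tags:
    """Hypothetical helper: T.anything(...) builds an <anything> element."""

    def __getattr__(self, name):
        def make(*children, **attrs):
            element = ET.Element(name, {k: str(v) for k, v in attrs.items()})
            for child in children:
                if isinstance(child, ET.Element):
                    element.append(child)
                elif len(element):
                    # Text after a child element belongs in that child's tail.
                    last = element[-1]
                    last.tail = (last.tail or "") + str(child)
                else:
                    element.text = (element.text or "") + str(child)
            return element
        return make


T = Tags()
doc = T.post(
    T.title("Components"),
    T.author(href="http://example.org/authors/1"),
    id=3,
)
print(ET.tostring(doc, encoding="unicode"))
```

The nesting of the calls mirrors the nesting of the document, which is most of the appeal compared with building the same tree through DOM calls.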

I’ve tried other XML tools, but these are the ones that currently look best to me. I’ve actually begun using schematron and amara in some projects, and so far I’ve been quite pleased with the results. What I’m interested in doing is a kind of ORM-over-the-web, where models and their APIs are generated for me with minimal effort (good programmers are lazy, right?). I believe RDF is probably better suited as a format, but I simply haven’t gotten around to all the learning that needs to happen in order to use it. In addition, these tools are readily available, and a lot of web API publishers are already publishing POX.

This is the fourth in a five-part series discussing the web, tools to develop it, and why it’s important, in the context of Python, code, web servers, templating, and XML technologies.

intro (0/4)
The first post, in which I explain what I’m doing.
templates (1/4)
The second post, in which I discuss templating technologies.
servers (2/4)
The third post, in which I discuss server technology, methods I like for exposing code to the web, and some performance implications.
components (3/4)
In which I discuss XML technologies for Python.
conclusion (4/4)
In which I briefly mention why I’m passionate about good technology.
