LinkRewriterTransformer
Summary
Rewrites URIs in links to a value determined by an InputModule. The URI scheme identifies the InputModule to use, and the rest of the URI is used as the attribute name.
Basic info
| Component type | Transformer |
| Cocoon block | linkrewriter |
| Java class | org.apache.cocoon.transformation.LinkRewriterTransformer |
| Cacheable | No |
Documentation
The LinkRewriterTransformer can help in situations where one resource needs to link to another resource, but you want to avoid hard-coding the actual location of the other resource.
The transformer was originally developed as part of the Apache Forrest project. A way was needed to allow documents to refer to other documents such that changing the location or rendering a document differently didn't break the link. The solution was to give those references a separate namespace and transform them as necessary when the target's final address became known. So for example, instead of one document referring to another like this: <a href="aPage.html"> it could use <a href="site:aPage"> where:
- site - is a URI "scheme"; a namespace that restricts the syntax and semantics of the rest of the URI. The semantics of 'site' are "this identifier locates something in the site's XML sources".
- aPage - identifies the content by reference. We call this indirect, or semantic linking because instead of linking to a physical representation (e.g. aPage.html), it refers to the concept of the aPage resource. It doesn't matter where it physically lives (although it is assumed you can retrieve it based on this identifier if needed).
If that were all this transformer could do, it wouldn't be anything that you couldn't already do with an XSLT transformer and a suitable stylesheet. The real power of the LinkRewriterTransformer comes when it is configured to work in combination with one or more InputModules. This lets the LinkRewriterTransformer look up the new value of a link attribute at request time. Depending on the InputModule used, the new value may be retrieved from a number of data sources, including files, databases or some other form of repository.
Transformer Configuration
The following configuration entries in the <map:transformer> block are recognised:
- link-attrs - Space-separated list of attributes to consider links (to be transformed). The whole value of the attribute is considered a link and transformed.
- link-attr - 0..n of these elements each specify an attribute containing link(s) (to be transformed) and optionally a regular expression to locate substring(s) of the attribute value considered link(s). Has two attributes:
  - name - (required) name of the attribute whose value contains link(s).
  - pattern - (optional) regular expression such that when matched against the attribute value, all parenthesized expressions (except number 0) will be considered links that should be transformed. If absent, the whole value of the attribute is considered to be a link, as if the attribute was included in 'link-attrs'.
- schemes - Space-separated list of URI schemes to explicitly include. If specified, all URIs with unlisted schemes will not be converted.
- exclude-schemes - Space-separated list of URI schemes to explicitly exclude. Defaults to 'http https ftp news mailto'.
- bad-link-str - String to use for links with a correct InputModule prefix, but no value therein. Defaults to the original URI.
- namespace-uri - The namespace URI of elements whose attributes are considered for transformation. Defaults to the empty namespace ("").
- input-module - 0..n (possibly nested) inputModule configuration information. Associates an input module with this transformer.
The attributes considered to contain links are the union of the attributes specified in the 'link-attrs' element and in all 'link-attr' elements. Each attribute should be specified only once, either in 'link-attrs' or in 'link-attr'; i.e. an attribute can have at most one regular expression associated with it. If neither 'link-attrs' nor 'link-attr' configuration is present, the transformer defaults to 'href'.
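The entries listed above can be combined in a single declaration. The following is a minimal sketch, not a recommended setup: the choice of attributes, the 'action' attribute name and the '#broken-link' value are assumptions made purely for illustration, and the child-element form of 'bad-link-str' is inferred from how 'link-attrs' and 'schemes' are written elsewhere on this page.

```xml
<map:transformer name="linkrewriter"
    src="org.apache.cocoon.transformation.LinkRewriterTransformer">
  <!-- Attributes whose whole value is treated as a link. -->
  <link-attrs>href src</link-attrs>
  <!-- No pattern given, so the whole value of 'action' is also a link.
       'action' is a hypothetical attribute name; note that it is not
       repeated in link-attrs. -->
  <link-attr name="action"/>
  <!-- Only URIs with the 'site' scheme are rewritten; all others are left alone. -->
  <schemes>site</schemes>
  <!-- Hypothetical replacement for links whose InputModule lookup yields no value. -->
  <bad-link-str>#broken-link</bad-link-str>
</map:transformer>
```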
Below is an example of regular expression usage that will transform links x1 and x2 in <action target="foo url(x1) bar url(x2)"/>:
<map:transformer name="linkrewriter"
    src="org.apache.cocoon.transformation.LinkRewriterTransformer">
  <link-attr name="target" pattern="(?:url\((.*?)\).*?){1,2}$"/>
  <!-- additional configuration ... -->
</map:transformer>
When matched against the value of the target attribute above, the parenthesized expressions are:
$0 = url(x1) bar url(x2)
$1 = x1
$2 = x2
Expression number 0 is always discarded by the transformer and the rest are considered links and re-written.
If present, map:parameter elements in the map:transform block override the corresponding configuration entries from map:transformer. As an exception, 'link-attr' parameters are not recognised; a 'link-attrs' parameter overrides both the 'link-attrs' and the 'link-attr' configuration.
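As a sketch of such an override, a pipeline step might look like the fragment below. The parameter names are the configuration entries documented above; the values are assumptions chosen for illustration.

```xml
<map:transform type="linkrewriter">
  <!-- Replaces any link-attrs and link-attr entries configured on map:transformer. -->
  <map:parameter name="link-attrs" value="href"/>
  <!-- Restricts rewriting to the 'site' scheme for this pipeline only. -->
  <map:parameter name="schemes" value="site"/>
</map:transform>
```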
This transformer may also be used to convert URIs that use the servlet: protocol (used to access blocks) into browser-recognisable URIs.
InputModule Configuration
Example
Suppose we had an XMLFileModule, configured to read values from an XML file containing the following fragment:
<site>
  <faq>
    <how_to_boil_eggs href="faq/eggs.html"/>
  </faq>
</site>
With this module mapped to the prefix 'site:', <link href="site:/site/faq/how_to_boil_eggs/@href"> would be replaced with <link href="faq/eggs.html">.
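To make the effect concrete, the sketch below shows a hypothetical input document and the output one would expect. It assumes the transformer's defaults ('href' as the link attribute and 'http' among the excluded schemes) and the XMLFileModule mapped to 'site:' as described above. The <links> wrapper and the second link are invented for illustration.

```xml
<!-- Input (hypothetical source document) -->
<links>
  <link href="site:/site/faq/how_to_boil_eggs/@href">Boiling eggs</link>
  <link href="http://cocoon.apache.org/">Cocoon</link>
</links>

<!-- Expected output: the site: link is resolved through the InputModule,
     while the http link is left untouched because its scheme is excluded
     by default. -->
<links>
  <link href="faq/eggs.html">Boiling eggs</link>
  <link href="http://cocoon.apache.org/">Cocoon</link>
</links>
```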
InputModules are configured twice: first statically in cocoon.xconf, and then dynamically at runtime, with the dynamic configuration (if any) taking precedence. The transformer lets you pass a dynamic configuration to the InputModules it uses, as follows.
First, a template Configuration is specified in the static <map:components> block of the sitemap within <input-module> tags:
<map:transformer name="linkrewriter"
    src="org.apache.cocoon.transformation.LinkRewriterTransformer">
  <link-attrs>href src</link-attrs>
  <schemes>site ext</schemes>
  <input-module name="site">
    <file src="cocoon://samples/link/linkmap" reloadable="true"/>
  </input-module>
  <input-module name="mapper">
    <input-module name="site">
      <file src="{src}" reloadable="true"/>
    </input-module>
    <prefix>/site/</prefix>
    <suffix>/@href</suffix>
  </input-module>
</map:transformer>
Here, we have first configured which attributes to examine and which URI schemes to consider rewriting. In this example, <a href="site:index"> would be processed. See Transformer Configuration above for more configuration options.
Then, we have established dynamic configuration templates for two modules: 'site' (an XMLFileModule) and 'mapper' (a SimpleMappingMetaModule). All other InputModules will use their static configs. Note that when configuring a meta InputModule like 'mapper', we also need to configure the 'inner' module (here, 'site') with a nested <input-module>.
There is one further twist; to have really dynamic configuration, we need information available only when the transformer actually runs. This is why the above config was called a "template" configuration; it needs to be 'instantiated' and provided extra info, namely:
- The {src} string will be replaced with the map:transform @src attribute value.
- Any other {variables} will be replaced with map:parameter values.
With the above config template, we can have a matcher like:
<map:match pattern="**welcome">
  <map:generate src="index.xml"/>
  <map:transform type="linkrewriter" src="cocoon:/{1}linkmap"/>
  <map:serialize type="xml"/>
</map:match>
This would cause the XMLFileModule nested inside 'mapper' to be configured with a different XML file, depending on the request.
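As an illustration of how the template is instantiated, assume a request for docs/welcome (a hypothetical URI), so that {1} expands to docs/ and the map:transform src becomes cocoon:/docs/linkmap. Under that assumption, the 'mapper' template would effectively be resolved to something like:

```xml
<input-module name="mapper">
  <input-module name="site">
    <!-- {src} has been replaced with the map:transform src value. -->
    <file src="cocoon:/docs/linkmap" reloadable="true"/>
  </input-module>
  <prefix>/site/</prefix>
  <suffix>/@href</suffix>
</input-module>
```

A request for a different directory would yield a different src value, and therefore a different linkmap file.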
Similarly, we could use a dynamic prefix:
<prefix>{prefix}</prefix>
in the template config, and:
<map:parameter name="prefix" value="/site/"/>
in the map:transform.
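Put together, the two fragments might sit in the sitemap as sketched below. This simply combines the pieces already shown on this page; only the pairing of the template with this particular pipeline step is an assumption.

```xml
<!-- In map:components, inside the linkrewriter map:transformer declaration -->
<input-module name="mapper">
  <input-module name="site">
    <file src="{src}" reloadable="true"/>
  </input-module>
  <!-- Filled in at runtime from the map:parameter named 'prefix'. -->
  <prefix>{prefix}</prefix>
  <suffix>/@href</suffix>
</input-module>

<!-- In the pipeline -->
<map:transform type="linkrewriter" src="cocoon:/{1}linkmap">
  <map:parameter name="prefix" value="/site/"/>
</map:transform>
```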
A live example of LinkRewriterTransformer can be found in the Apache Forrest sitemap.
