RuDi project Index Page

RuDi stands for Ruby-based Utilities for DITA processing. The idea is to create a build system for DITA projects that is easy to use and, at the same time, very powerful--something that is accomplished by using the Ruby language and Ruby-based tools. (The project was originally hosted at Kenai. Until it finds a new home, the project artifacts will be hosted here.)

Eric Armstrong


DITA is an impressive design feat. It defined a way to link many files together in flexible ways, while ensuring type-safety in the process. In other words, it guarantees that a text segment that is "transcluded" into a procedural topic will conform to the structural restrictions defined for that kind of topic. The ability to divide material into reusable chunks, and to guarantee that structural restrictions are honored, is a compelling feature of DITA.

At the same time, DITA adoption suffers from the need for expensive tools. The DITA processing tool (the DITA Open Toolkit) generates passable sample code, but falls short of production-quality output. Meanwhile, the need to include stylistic information renders the transformations more complex. Tools that solve the processing problem are expensive, because of the time required to create proprietary fixes for issues in Open Toolkit processing.

Another area of enormous expense is the need for a Content Management System. A collection of DITA documents is nothing so much as a mass of interconnected links. But when a file name changes, every file that links to it needs to change. And when a file changes locations, relative links within the file need to change, along with all of the files that link to it. And if, heaven forbid, a file is split into two, all of the links that refer to the original file need to be inspected, to see whether it is appropriate to link to the first file, the second, or both.
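The simplest of those cases, a file that changes names, can be sketched in a few lines of Ruby. This is only an illustration of the idea, not RuDi's own code: the helper name `retarget_links` and the sample file names are hypothetical, and the sketch uses a regular expression on a string rather than a real XML parser.

```ruby
# Hypothetical helper: rewrite href/conref attribute values that point at
# old_name so that they point at new_name. Any "#topic/element" fragment
# after the file name is preserved unchanged.
def retarget_links(xml, old_name, new_name)
  pattern = /\b(href|conref)="#{Regexp.escape(old_name)}(\#[^"]*)?"/
  xml.gsub(pattern) { %(#{$1}="#{new_name}#{$2}") }
end

# Sample content (made up for this sketch): one link to the renamed file,
# and one link to a different file that must be left alone.
topic = '<xref href="install.dita#install/steps"/> <p conref="intro.dita#intro/p1"/>'
puts retarget_links(topic, "install.dita", "setup.dita")
```

A move to a different directory is harder, because the relative paths inside the moved file also have to be recomputed, and a split cannot be automated at all without a human decision about each link.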

The RuDi project was created to address such problems. (It has been dormant for quite a while. But the problems it was designed to address are still present, and it does represent significant strides towards a solution, hence its continued availability.) To help achieve that goal, it is primarily built using the Ruby programming language.

Ruby's claim to fame is its ability to create domain-specific languages--mini-languages that are designed and customized for a specific purpose. Some of the better-known examples are:

Those tools make it easy to express the solution for the problem you are trying to solve. The ease of expression translates directly into rapid construction of new solutions, and ready comprehension of existing ones. (For more, see Ruby Rocks.)
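As a rough illustration of what a domain-specific language looks like in Ruby, here is a minimal sketch of a made-up mini-language for describing a documentation build. Everything in it (`BuildSpec`, `build`, `transform`, `check_links`, and the file names) is hypothetical and is not part of RuDi or of any existing tool; it only records the steps rather than running them.

```ruby
# Hypothetical mini-language for describing a documentation build.
class BuildSpec
  attr_reader :name, :steps

  def initialize(name)
    @name = name
    @steps = []
  end

  # Each "keyword" of the mini-language is an ordinary method that
  # records a step for later execution.
  def transform(source, to:)
    @steps << [:transform, source, to]
  end

  def check_links(in_dir)
    @steps << [:check_links, in_dir]
  end
end

# Evaluate the block with the spec as self, so that the bare keywords
# inside the block resolve to BuildSpec's methods.
def build(name, &block)
  spec = BuildSpec.new(name)
  spec.instance_eval(&block)
  spec
end

guide = build "user-guide" do
  transform "guide.ditamap", to: :html
  check_links "out/html"
end

guide.steps.each { |step| p step }
```

The block passed to `build` reads almost like a configuration file, yet it is plain Ruby, so loops, conditionals, and variables remain available when a build description needs them.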

Project Goals

The overarching goal is to create a DITA-processing system that produces professional-quality results, affordably. To do that, it addresses:

  1. Document Generation
    The first goal is to make it easy to transform DITA files into HTML-based output, by separating stylistic design from code processing. Using DreamWeaver templates lets a professional designer work with visual tools. It also separates the design task from code transformations, which makes the transformations simpler. And after the transformations are complete, DreamWeaver will automatically apply template-changes to existing files. So the system achieves both automation and a desirable separation of concerns.
  2. Link Management
    The second goal is to automate link processing, to ensure that links remain accurate when files change names or locations--and to do so without requiring a mega-expensive content management system. The idea is to automatically generate and run a link-processing script when such changes have been made in a change-management system like Subversion. (To prevent links from being automatically adjusted, the changes can be made outside the system.)
  3. File System Storage
    Using a file system for file storage has significant advantages over a database, primarily in the ability to create and run automated scripts on the collection of files. With the problems of document generation and link management solved, the need for an expensive Content Management system dissipates, leaving the file system as a far less expensive choice.

    A Unix-based system is ideal for this purpose. In particular, it allows the construction of symlinks that act as a local stand-in for a remote file. That capability is needed to share common topics across DITA maps. Apparently the DITA Open Toolkit has a bug that surfaces if you try to use a conref to link in a file that resides outside the root of the map.
  4. Proofreading
    The proofreading task can be made much simpler and easier with a list-based search-and-replace tool. That way, a list of common problems can be inspected, and changes can be made selectively. (Ideally, it will integrate with the authoring tool, so that surrounding text can be modified at the same time.)
  5. Improved Automation for Software-Documentation Systems
    For software documentation, there is tremendous scope for automation that has gone largely untapped. For one thing, it should be possible to run tests on sample programs, to be sure they still work as the product changes. Then, when the sample program is revised, it should be possible to automatically replace any sections of that code that were included in a document (so that the sample will be guaranteed correct) and, at the same time, alert the writer to places where changes have been made, so the surrounding text can be inspected for accuracy.

    Another automatable task arises for tutorials. Tutorial programs are generally built up in stages, working from a simple starting version and building out to a more complex final version, introducing the reader to new concepts at each stage. It is helpful to maintain a single source copy for such a program, and to generate each version from it.

    A third area that is ripe for automation is UI integration. At a minimum, the documentation should be referencing files used in the UI, to ensure that labels are correct. Ideally, those files should also define structural paths, so that instructions like "Click {menu} > {choice} > {tab} > {button}" are guaranteed to be accurate.
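The UI-integration idea in the last goal can be sketched as a simple placeholder expansion. This is a minimal illustration under stated assumptions: the label table and its values are invented for the example, and `expand_ui_path` is a hypothetical helper name. In a real system the labels would be read from the resource files the product itself uses.

```ruby
# Hypothetical UI label table, standing in for the product's own
# resource files.
labels = {
  "menu"   => "Tools",
  "choice" => "Options",
  "tab"    => "Advanced",
  "button" => "Reset"
}

# Replace each {key} placeholder with its label. Hash#fetch raises
# KeyError for an unknown key, so a renamed or removed UI element stops
# the build instead of leaving a stale instruction in the document.
def expand_ui_path(template, labels)
  template.gsub(/\{(\w+)\}/) { labels.fetch($1) }
end

puts expand_ui_path("Click {menu} > {choice} > {tab} > {button}", labels)
```

Failing loudly on a missing label is a deliberate choice here: the point of referencing the UI files is to find out at build time, not after publication, that an instruction no longer matches the product.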

Existing Contents

Future Contents

Copyright © date by Eric Armstrong. All rights reserved.