You are reading, a personal blog about software development.

Document like a programmer

Software documentation is serious business. I am not sure exactly why software documentation tends to take a backseat to implementation, apart from the fact that many software developers simply prefer to write code rather than prose. On the other hand I don't think all teams need to have the same level of documentation, so don't take this post as a general recommendation to write a lot of documentation. But when you do need to write documentation, it should definitely be as painless as possible.

On many teams it isn't. As a de facto standard, documents are typically born in Word. They start out small, but soon grow larger. Ten years ago people would say that Word could not handle large documents, but I think Word has improved in this respect. However, there are other reasons why I think Word is not always the best tool for large, collaboratively edited documents, for example:

Occasionally more than one person will need to edit the same document at the same time. To my knowledge this is just not possible. One workaround is (if you are the second user to open the document) to make a copy of the document. Then make your changes in the copy and merge them back into the main document later (and pray that the part you changed was left untouched by the other user). Another workaround is to simply split the document into multiple smaller documents, which can then be referenced from a main document.

At some point someone will want to baseline the document, e.g. to remember how the document looked when the product was initially released. This is not supported by Word directly. The typical workaround is to make a read-only copy of the document (or of all the documents, if it was split up as a workaround for concurrent editing) and rename it using some kind of naming convention. While moving files around on the network drive, be very careful not to drag a random folder into another random folder by accident - this is a classic.

On my team we struggled with these issues and it occurred to us that we never faced the same problems with our source code, even though our source code is really a much larger document shared by a much larger group. As programmers we work mostly with code and less with documentation. This means that our coding workflow is highly optimized and supported by a wide range of tools, whereas our documenting workflow is something that we rarely pay much attention to. In an attempt to align our documenting workflow with our coding workflow, we decided to convert our documents to a lightweight markup language (we went with AsciiDoc) and check them into Git. The key benefit of documenting this way is that we can use the tools of our most familiar workflow to produce documentation:

Text Editors

One example is the text editor. The choice of text editor is highly individual, as it relies on both taste and past experience. A developer is typically highly efficient with the editor they use most often. Because the document is plain-text markup, each developer can use exactly that editor.

Build Tools

Another example is build tools, such as make. We use build tools all the time to automate compilation of source code, especially when we have many source files and several build parameters specifying the particular build we want to make. We can use this for generating documentation output as well. In our setup we have different build targets for specifying the output format (PDF or HTML). It is easy to imagine that we could use build parameters to generate different parts (or different views) of the source document.
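To make the idea of build targets a bit more concrete, here is a minimal sketch in Python. It is not our actual setup: it assumes the Asciidoctor toolchain (the `asciidoctor` and `asciidoctor-pdf` commands) is installed, and the file name `manual.adoc` and the `audience` attribute are made up for illustration. The attributes play the role of build parameters, which the markup could test to include or exclude parts of the document.

```python
import subprocess
from pathlib import Path

# Assumption: the Asciidoctor toolchain is installed. The post only says
# "AsciiDoc", so treat these converter commands as one possible choice.
BACKENDS = {
    "html": ["asciidoctor", "-b", "html5"],
    "pdf": ["asciidoctor-pdf"],
}


def build_command(source, target, attributes=None):
    """Return the command line that renders `source` in the `target` format."""
    if target not in BACKENDS:
        raise ValueError(f"unknown target: {target!r}")
    # manual.adoc -> manual.html or manual.pdf, next to the source file.
    output = Path(source).with_suffix("." + target)
    command = list(BACKENDS[target])
    # Document attributes act as build parameters (hypothetical names).
    for name, value in (attributes or {}).items():
        command += ["-a", f"{name}={value}"]
    return command + ["-o", str(output), str(source)]


def build(source, target, attributes=None):
    """Run the converter and raise if it exits with an error."""
    subprocess.run(build_command(source, target, attributes), check=True)


if __name__ == "__main__":
    # Only print the commands here; call build() to actually run them.
    print(build_command("manual.adoc", "html"))
    print(build_command("manual.adoc", "pdf", {"audience": "internal"}))
```

A real project would more likely express the same thing as make targets; the point is only that one source file and a couple of parameters decide which output gets produced.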

Review Tools

Many teams use text-based tools for reviewing source code changes, e.g. Review Board. Those tools have the advantage that reviewers can see the original and changed lines side by side and comment on the changes line by line. When your document is expressed as markup you can review document changes in exactly the same way. I should mention that this does imply reading the markup code, rather than the rendered output, but for smaller incremental changes I think this is preferable. For an initial document version or larger rewrites I would still recommend rendering the document output and organizing a face-to-face meeting.
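To show what a reviewer actually looks at, here is a small sketch that produces a line-by-line diff of two versions of a markup fragment using Python's standard `difflib` module. The fragment and the file name are invented for the example; a review tool shows the same information in a friendlier side-by-side layout.

```python
import difflib


def markup_diff(old_text, new_text, name="manual.adoc"):
    """Return a unified diff between two versions of a markup document."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile="a/" + name, tofile="b/" + name
    )
    return "".join(diff)


# Hypothetical before and after versions of one section.
OLD = "== Installation\n\nRun the installer.\n"
NEW = "== Installation\n\nRun the installer as administrator.\n"

if __name__ == "__main__":
    print(markup_diff(OLD, NEW), end="")
```

Because only the changed line is marked with `-` and `+`, a reviewer can comment on exactly that sentence without rereading the whole document.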

Version Control System (VCS)

We would often ask questions like: "Who changed this paragraph?", "Was this section changed since the last release, or did we not update it yet?" or "Why was this section removed?". Before, we weren't able to answer those kinds of questions. They are exactly the same questions we were asking about our code 10 years ago, before we first checked it in to Git. Doing the same with our documents provides us with a full history of all changes and even allows us to revert them individually after days or weeks. I can't begin to describe how liberating that feels. Another useful feature of any VCS is that it allows content to be tagged, conveniently replacing the need for making read-only copies of all our previous Word documents.
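The questions above map fairly directly onto standard Git commands. The sketch below only builds the command lines, so it can be read without a repository at hand. The file name, line range, and tag name in the usage part are hypothetical, and the helper names are mine, not part of Git.

```python
import subprocess


def who_changed(path, first_line, last_line):
    """Who last changed these lines? (git blame on a line range)"""
    return ["git", "blame", "-L", f"{first_line},{last_line}", "--", path]


def changed_since(tag, path):
    """Was this file changed since the tagged release?"""
    return ["git", "log", "--oneline", f"{tag}..HEAD", "--", path]


def when_removed(text, path):
    """Which commits added or removed this text? (pickaxe search)

    The commit messages of the hits should explain why.
    """
    return ["git", "log", f"-S{text}", "--", path]


def baseline(tag):
    """Tag the current state instead of making a read-only copy."""
    return ["git", "tag", "-a", tag, "-m", f"Documentation baseline {tag}"]


def run(command):
    """Run a command inside a Git working copy and return its output."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file, line range, and tag; printed rather than executed.
    print(who_changed("manual.adoc", 40, 55))
    print(changed_since("v1.0", "manual.adoc"))
    print(when_removed("Legacy import", "manual.adoc"))
    print(baseline("v1.1"))
```

Individual changes can then be undone with `git revert`, which is what makes it safe to rewrite a section and change your mind weeks later.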

I should mention that there is one reason why all documents are not already written in plain text: not all documents are maintained only by programmers. I would be wary of using a lightweight markup language for a document that is also maintained by non-technical collaborators (even though it could be possible in some cases), and I would downright discourage the use of Git for the same group of people due to its learning curve.