Web Generator

[email protected]

1 Description

This document describes my ideas for an ultimate website generator. I haven’t decided on a name yet.

2 Goals

The whole goal of this system is to make creating correctly-formatted websites and pages as painless as possible, applying the same transformation techniques used by software developers to generate complex pieces of software.

2.1 Automated Features

Computers are all about automation. If you wanted to do things manually, you wouldn’t need a computer.

2.1.1 Program Support

2.2 Source Formats

2.3 Output Formats

It should be capable of generating any kind of file format, really.
It should be capable of creating structured web hierarchies such as Debian/Ubuntu repositories.
It should not put “junk” files (temporary files, etc.) in the staging area; everything it puts there is something that you want to publish. Littering the staging area with things you don’t want published is a privacy hazard, and attempting to filter it (such as rsync’s “-C” cvs-exclude option) simply makes the sync tool more complicated, limits your choice of synchronization tools, and prohibits you from generating the files on your web server itself.
It should allow for arbitrary post-processing of the output files; this is an idiosyncrasy of mine but can be useful if, for example, you want to change certain pieces of data such as embedded email addresses and URLs without going back and altering all the input files.
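As a rough illustration of the post-processing idea, here is a minimal Python sketch that rewrites email addresses and URLs in already-generated files. The rewrite rules, the example.org addresses, the file suffixes, and the function names are all placeholders I made up for the example; nothing here is a decided design.

```python
import re
from pathlib import Path

# Hypothetical rewrite rules: (pattern, replacement) pairs applied to every
# text file in the staging area. The addresses are placeholders.
REWRITES = [
    (re.compile(r"mailto:old@example\.org"), "mailto:new@example.org"),
    (re.compile(r"http://old\.example\.org/"), "https://www.example.org/"),
]


def postprocess_text(text, rewrites=REWRITES):
    """Apply each rewrite rule in order and return the new text."""
    for pattern, replacement in rewrites:
        text = pattern.sub(replacement, text)
    return text


def postprocess_tree(staging_dir, suffixes=(".html", ".txt"), rewrites=REWRITES):
    """Rewrite matching files in place; return the paths that changed."""
    changed = []
    for path in sorted(Path(staging_dir).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            original = path.read_text(encoding="utf-8")
            updated = postprocess_text(original, rewrites)
            if updated != original:
                path.write_text(updated, encoding="utf-8")
                changed.append(path)
    return changed
```

The input files are never touched; only the generated output changes, so the rules can be edited or dropped without altering any source document.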

2.3.1 Hierarchies

It should be capable of generating web pages, directory hierarchies, web sites, or hierarchies of web sites. In developing something, it is often the case that it starts out as a note or list item, becomes a web page, becomes multiple web pages, becomes a group of related things (often in a directory hierarchy), becomes a homepage, becomes a web site, becomes a hierarchy of virtual domains, etc.
You will note when editing a web page that you will often want to bring a section “up” in the hierarchy, group it with related items (bringing it “down”), or move it around; with HTML, this can be annoying since the heading tags are numbered, and so you must tediously change all the heading tags. Similarly, when editing text documents or source code with nested indentation, you often want to change the level of indentation. A sensible system will make such a common change easy; text editors such as vi and Emacs have commands for changing indentation levels. This web generation system should make restructuring the data in this manner as simple as possible.
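To make the heading problem concrete, here is a small Python sketch that moves a fragment of HTML up or down the hierarchy by renumbering its heading tags. It assumes a regular expression is good enough for the job (a real tool would probably want a proper HTML parser), and the function name is my own invention.

```python
import re

# Matches the start of an opening or closing heading tag, e.g. "<h2" or "</h2".
HEADING_TAG = re.compile(r"<(/?)h([1-6])\b", re.IGNORECASE)


def shift_headings(html, delta):
    """Return html with every heading level changed by delta, clamped to 1..6."""
    def renumber(match):
        level = int(match.group(2)) + delta
        level = max(1, min(6, level))
        return f"<{match.group(1)}h{level}"
    return HEADING_TAG.sub(renumber, html)
```

For example, `shift_headings('<h2 id="a">T</h2>', 1)` gives `'<h3 id="a">T</h3>'`, so a section can be pushed one level down without editing each tag by hand.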
In many cases, Makefiles are a good counter-example; typically, a Makefile will list subdirectories into which it should recurse. If you move a subdirectory to another location, you have to (remember to) edit two Makefiles, or else it doesn’t get built. We should avoid that kind of design.
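One way to avoid the recursive-Makefile trap is to discover sources by walking the tree instead of listing subdirectories anywhere. The Python sketch below assumes sources can be recognized by file suffix; the suffixes and the function name are placeholders for illustration.

```python
import os


def discover_sources(root, suffixes=(".txt", ".md")):
    """Return source paths relative to root, found by walking the tree.

    No directory is ever named in a config file, so moving a subdirectory
    requires no other edits. Hidden directories (such as .git) are skipped.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories in place and fix the traversal order.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            if name.endswith(suffixes):
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return found
```

If a directory is moved elsewhere under the root, the next run simply finds its files at the new location.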

2.4 Transfer Tools

It should emit everything into a staging area where it can be automatically transferred to the server using tools such as rsync or unison.
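A minimal sketch of the transfer step, assuming rsync is installed and the destination is reachable. The destination shown in the usage note is a placeholder, and the function names are hypothetical.

```python
import subprocess


def rsync_command(staging_dir, destination, dry_run=False):
    """Build the rsync argument list for mirroring the staging area."""
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    src = staging_dir.rstrip("/") + "/"
    cmd = ["rsync", "--archive", "--compress", "--delete"]
    if dry_run:
        cmd.append("--dry-run")
    cmd += [src, destination]
    return cmd


def publish(staging_dir, destination, dry_run=False):
    """Run rsync and raise CalledProcessError if it fails."""
    subprocess.run(rsync_command(staging_dir, destination, dry_run), check=True)
```

Something like `publish("staging", "user@example.org:/var/www/site/", dry_run=True)` would show what would be transferred without changing the server. Using `--delete` to mirror removals is only reasonable because the staging area is supposed to contain nothing but files meant for publication.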
I have a couple of problems with rsync:

2.5 Free/Open Source

The tool should be free/open-source software, although it should be flexible enough to support any command-line tool, the way “make” can run any Unix program.
It should run on open-source operating systems, especially (but not exclusively) Linux.

2.6 Re-use

This is a huge undertaking, and therefore it should be capable of using existing tools which do part of the job.
It should leverage tools which are already familiar to software developers, such as make, shell scripts, and scripting languages such as Python or Ruby.

2.7 Flexibility

Since different users will have different desires in terms of input formats and tools, it should be an eminently modular system that can be customized, extended, and used by a wide variety of authors. As such, it should be easy to modify - no more difficult than a Makefile or script.
The one constant is change - particularly in software, and particularly in web technologies. Therefore, it should be designed for the future, to adapt to new formats and transformation programs. There is no way to know exactly what the needs of users will be in, say, five years. There is a reason why nobody writes in WordStar format any more, but Makefiles and shell scripts are still around.

2.8 Self-Promotion

The marketing term for this might be “branding”, but that word makes me gag. Most people decide what programs to use based on what other people are using, especially people whom they respect and admire, or who are doing similar work. By (optionally) emitting a small, tactful link to this program, people reading web pages/sites generated by this system will become aware of it, saving them from re-inventing the wheel, or using inferior systems for authoring and generation. You’ll note that the tool I use for creating this page does a similar thing in the footer, and I think it’s great.

3 Anti-Goals

It should not be oriented towards GUIs; it is a website generator for software developers, not graphic designers.
It should work as automatically as possible, perhaps from a cron job, although some things (like PGP/GPG signing and rsync) may require minimal interaction for inputting passphrases - in that case, it should be a “fire and forget” system.
It should be designed to emit static content: HTML and other files designed to be stored in a file system, not a database. Writing secure server-side software is hard (really really hard), and I am a security weenie.
It should not be a huge, monolithic, hard-to-modify, “closed” system.
It should not involve any kind of domain-specific language or syntax unless the gains in efficiency outweigh the learning curve required.
It should not require writing excessive amounts of code; for example, it should not require mostly-similar Makefiles for every document you want to create. Efficiency is a primary goal, and doing a lot of repetitive work is stupid.
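One way to avoid per-document boilerplate is a single table of suffix rules, in the spirit of make's pattern rules. The Python sketch below only plans the commands; it does not run them. The rule table, the staging directory name, and the choice of pandoc and Graphviz's dot as converters are assumptions for illustration, not tools this document has settled on.

```python
from pathlib import PurePosixPath

# Hypothetical rule table: source suffix -> (output suffix, command template).
# One rule covers every document of that type; no per-document file is needed.
RULES = {
    ".md": (".html", ["pandoc", "{src}", "-o", "{out}"]),
    ".dot": (".svg", ["dot", "-Tsvg", "{src}", "-o", "{out}"]),
}


def plan(sources, staging="staging", rules=RULES):
    """Return one command (as an argument list) per source with a matching rule."""
    commands = []
    for src in sources:
        path = PurePosixPath(src)
        rule = rules.get(path.suffix)
        if rule is None:
            continue  # no rule for this suffix; this sketch just skips the file
        out_suffix, template = rule
        out = str(PurePosixPath(staging) / path.with_suffix(out_suffix))
        commands.append([arg.format(src=src, out=out) for arg in template])
    return commands
```

Adding a new source format then means adding one entry to the table, not touching every document.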
This software shouldn’t assume that you do everything its way. If you want to write raw HTML for part of your site, and the rest of it in something else, fine. People tend to start writing in one source format or with one system and migrate to others over time. This system shouldn’t make you convert all your previous stuff to its way of doing things, and it shouldn’t prevent you from easily migrating to another system in the future. In other words, it shouldn’t be a dick.