CAllTom

In one of my less inspired moments, I started off on the doomed project CAllTom. Frustrated by the enormity of Rails, Merb, and the like, I decided that I wanted to get back to the web's roots and write this site as a CGI script… in C… without any CGI libraries. Oh geez, what a bad idea I knew that to be, but darn it, I couldn't help myself. The idea of owning every single line of code, of being able to keep all of it in my head at once, was intoxicating once I realized that most of my web development in Ruby consisted of searching blogs for whichever corner of the Rails framework could scratch a particular itch.

The basic structure of CAllTom was simple because I knew what I wanted and I wanted it small. Instead of smartly routing between controllers as in Rails, I parsed the URL into a stack of /-separated fields which I popped off as I descended the tree of handlers, each of which expected a dictionary of the GET variables, the HTTP method of the request, and what was left of the path on the stack. Each handler was responsible for delegating to a child (subdirectory) when it could, or creating output itself. For example:

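/* Declared up front because process_root calls it before its definition. */
static void process_pages(Dict *get, char *method, char **path);

static void
process_root(Dict *get, char *method, char **path) {
  /* Path fields remain: delegate to the matching child handler. */
  if(path[0] != NULL) {
    if(strcmp(path[0], "pages") == 0)
      process_pages(get, method, path+1);
    else
      show_404();
    return;
  }

  /* Nothing left on the path: this handler writes the response itself. */
  printf("%s: %s\r\n", "Content-type", "text/html;charset=utf-8");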
  printf("\r\n");

  printf("<title>AllTom.com</title>\n");
  printf("<h1>AllTom.com</h1>\n");
  printf("<p>Hullo!</p>\n");
}

static void
process_pages(Dict *get, char *method, char **path) {
  /* path[0] names the requested page; pass it along with the rest of the path. */
  if(path[0] != NULL) {
    process_page(get, method, path+1, path[0]);
    return;
  }

  printf("%s: %s\r\n", "Content-type", "text/html;charset=utf-8");
  printf("\r\n");

  printf("<title>AllTom.com Pages</title>\n");
  printf("<h1>My Pages</h1>\n");
  printf("<ul>\n");
  printf("<li><a href=\"/pages/a\">Page A</a></li>\n");
  printf("<li><a href=\"/pages/b\">Page B</a></li>\n");
  printf("</ul>\n");
}

…
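
For the curious, here is roughly how the NULL-terminated path array those handlers consume could be built. This is a minimal sketch and not the code from CAllTom itself: the name split_path is made up for illustration, and I am assuming the path string comes from somewhere like the PATH_INFO environment variable. It splits in place with strtok, so empty fields (as in "//") are silently dropped.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original source): split a path such as
 * "/pages/a" into the NULL-terminated array {"pages", "a", NULL}.
 * The input string is modified in place and the returned pointers point
 * into it, so the caller frees only the array, never the fields. */
static char **
split_path(char *path) {
  /* A path has at most one more field than it has slashes,
   * plus one slot for the terminating NULL. */
  size_t cap = 2;
  for (const char *p = path; *p != '\0'; p++)
    if (*p == '/')
      cap++;

  char **fields = malloc(cap * sizeof *fields);
  if (fields == NULL)
    return NULL;

  size_t n = 0;
  for (char *tok = strtok(path, "/"); tok != NULL; tok = strtok(NULL, "/"))
    fields[n++] = tok;
  fields[n] = NULL;
  return fields;
}
```

Because every handler only ever looks at path[0] and passes path+1 down the tree, "popping" the stack is just pointer arithmetic, and the whole array can be released with a single free once the top-level handler returns.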

The simplicity of CGI is wonderful. Nearly everything about the request is encoded in environment variables (the body of a POST arrives on standard input), though the formats were sometimes tedious to parse. QUERY_STRING, for example, from which GET variables are extracted, turned out to be a pain thanks to the way special characters are percent-encoded. I stole a very naïve parser from a CGI tutorial, but it seems to contain some memory leaks that I have yet to track down. The danger in putting a C program with memory errors on the web has not escaped me, but it was in the name of science!
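
To give a sense of what that encoding involves, here is a sketch of a decoder for a single key or value. It is not the tutorial parser mentioned above; the names url_decode and hex_value are mine. It assumes the usual form encoding, where "+" stands for a space and "%XX" stands for the byte with hex value XX, and it decodes in place so there is nothing to allocate and nothing to leak.

```c
#include <string.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int
hex_value(int c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: decode a form-encoded string in place.
 * "+" becomes a space and "%XX" becomes the byte 0xXX; a malformed
 * escape is copied through unchanged. The result is never longer than
 * the input, so no extra memory is needed. Note that "%00" decodes to
 * a NUL byte, which ends the C string early. */
static void
url_decode(char *s) {
  char *out = s;
  while (*s != '\0') {
    if (*s == '+') {
      *out++ = ' ';
      s++;
    } else if (*s == '%' && hex_value((unsigned char)s[1]) >= 0
                         && hex_value((unsigned char)s[2]) >= 0) {
      /* The && chain stops at the first failure, so s[2] is only
       * read when s[1] is a real hex digit (and thus not the NUL). */
      *out++ = (char)(hex_value((unsigned char)s[1]) * 16
                      + hex_value((unsigned char)s[2]));
      s += 3;
    } else {
      *out++ = *s++;
    }
  }
  *out = '\0';
}
```

The order matters: the query string should be split on "&" and "=" first and each piece decoded afterward, since an encoded "%26" inside a value would otherwise be mistaken for a separator.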

Strangely, the problem of not being able to reliably parse out the data clients were trying to give me wasn't what got me to stop the insanity. What finally killed the project was memory management in my attempt at an interface with MySQL. Put plainly, I could not tell from MySQL's documentation who owned the memory occupied by rows returned from the database. After playing with their incomplete C examples and messing with my own code for a few hours, I decided that it was not worth my time and moved to the current iteration of the site, TAllTom.

The source code for CAllTom [1] is still available on GitHub.