CGI scripting in awk?!

Awk is a frequently underestimated scripting language.

Although it's been around since the 1970s, it never really caught on except as an occasional adjunct to Bourne shell scripts. This is a shame, because it has what I consider to be the right attributes for a good scripting language: code requires very little boilerplate, and there are just a few concepts to learn. The original awk paper describes all of its concepts in just 7 pages. Of course, new features have been added to awk since then, but even with those changes included, the POSIX awk definition is still refreshingly short.

When I recently had a small CGI script to write, I decided to try awk. I was inspired by some ambitious wikis written in awk. It turned out to be pleasantly simple. The resulting code demonstrates some of awk's strengths, like its string manipulation and associative arrays. For example, here's the code that parses the CGI query string:


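# Parse a CGI query string into out[]. Only parameters named in defaults[]
# are accepted; out[] starts as a copy of defaults[]. Returns the number of
# parameters taken from the query. Everything after "out" is a local variable.
function cgi_parse(query, defaults, out,    params, i, j, count, param, val) {
        count = 0
        for (i in defaults) {
                out[i] = defaults[i]
        }

        split(query, params, "&")
        for (i in params) {
                j = index(params[i], "=")
                if (j <= 0) continue    # skip fields with no "="

                param = url_decode(substr(params[i], 1, j - 1))
                val = url_decode(substr(params[i], j + 1))
                if (param in defaults) {
                        out[param] = val
                        count++
                }
        }
        return count
}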
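
cgi_parse() calls a url_decode() helper that lives in the attached cgi.awk and isn't reproduced in this post. As a rough idea of what such a helper might look like, here is a minimal sketch in portable awk. It is an assumption for illustration, not the attached version: it turns "+" into a space and "%XX" escapes into the corresponding character, and passes malformed escapes through unchanged.

```awk
# Hypothetical sketch of url_decode(); the version in cgi.awk may differ.
# hex, out, c, hi and lo are local variables.
function url_decode(s,    hex, out, c, hi, lo) {
        hex = "0123456789abcdef"
        gsub(/\+/, " ", s)              # "+" encodes a space in query strings
        out = ""
        while (length(s) > 0) {
                c = substr(s, 1, 1)
                if (c == "%" && length(s) >= 3) {
                        hi = index(hex, tolower(substr(s, 2, 1)))
                        lo = index(hex, tolower(substr(s, 3, 1)))
                        if (hi > 0 && lo > 0) {
                                # index() is 1-based, so subtract 1 per digit
                                out = out sprintf("%c", (hi - 1) * 16 + (lo - 1))
                                s = substr(s, 4)
                                continue
                        }
                }
                out = out c             # ordinary character or malformed escape
                s = substr(s, 2)
        }
        return out
}
```

The hex digits are converted by hand because POSIX awk has no built-in hex parser. One caveat: some awk implementations running under a multibyte locale may treat sprintf("%c", n) for n above 127 as a code point rather than a single byte, so percent-encoded UTF-8 may need LC_ALL=C to round-trip.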

Awk's limitations are also apparent here. For example, the declaration of local variables as parameters to the function is ugly. Also, arrays aren't first-class values: they can't be returned from a function, and an entire array can't be copied with a single assignment. That's why cgi_parse() copies the defaults array into the out array one entry at a time.
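
Both limitations can at least be tucked away in a helper. The sketch below is not part of cgi.awk; copy_array() is a hypothetical name. It shows the usual workaround: arrays are passed by reference, so the function fills in an array supplied by the caller, and the loop variable k is made local by listing it as an extra parameter.

```awk
# Copy every entry of src into dst. k is a local variable, declared as an
# extra parameter and set off by extra spaces, a common awk convention.
function copy_array(src, dst,    k) {
        for (k in src) {
                dst[k] = src[k]
        }
}

BEGIN {
        defaults["page"] = "1"
        defaults["sort"] = "name"
        split("", out)                  # make sure out is an empty array
        copy_array(defaults, out)       # arrays are passed by reference
        print out["page"], out["sort"]  # prints "1 name"
}
```

This doesn't make arrays first-class, but it keeps the element-by-element copy in one place.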

The attached cgi.awk contains a couple of small routines that can be useful in CGI scripts. cgi_parse() is the function above, which parses the CGI query string into an array. html_encode() escapes strings so that they can be safely included inside HTML. An example CGI script using these routines is in the attachment cgi_example.awk. You'll need a tiny shell wrapper to run them from a web server:


#! /bin/sh -
awk -f cgi.awk -f cgi_example.awk
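
For readers without the attachment at hand, here is a minimal sketch of what an html_encode() routine could look like. It is an assumption for illustration and may differ from the version in cgi.awk. It escapes the five characters that are special in HTML text and attribute values.

```awk
# Hypothetical sketch of html_encode(); the version in cgi.awk may differ.
# In a gsub() replacement a bare "&" means "the matched text", so a literal
# ampersand has to be written as "\\&".
function html_encode(s) {
        gsub(/&/, "\\&amp;", s)         # first, so later entities aren't re-escaped
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        gsub(/"/, "\\&quot;", s)
        gsub(/'/, "\\&#39;", s)
        return s
}
```

A script would typically run every value taken from the query string through such a function before printing it, for example html_encode(out["name"]), where "name" is a made-up parameter.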

Attachments: cgi.awk, cgi_example.awk