To reward those of you who have struggled through my most recent posts regarding DSLs in general and awk in particular, I give you a short post.
Here is how I'm using awk (in conjunction with other tools) to block the botnet attacking us. I found a regular expression that matches the botnet's requests, so I told my computer to watch the logfile forever, look for matching lines, and block the IP address that made each request. Here's what the command line looks like (I've left out the actual grep pattern so that spammers don't pick up on it).
tail -f /var/log/apache2/sites/thehomestarmy.com | grep '' | awk '{print $1}' | xargs iptables-add-rule.sh
Here, awk is being used for its simplest, but perhaps most popular, feature. It's very good at breaking a piece of text into chunks and making those chunks available for printing. What I've done is ask awk, "Please read your input, break it into fields, and print the first field." In my Apache logfile format, that first field happens to be the IP address of the user requesting the page.
That's it.
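If you want to see the field-splitting step on its own, here is a small sketch. The two log lines are made up (the addresses come from the ranges reserved for documentation), and the function names sample_log and first_field are just mine for illustration; I'm assuming a log format where the client address comes first, as it does in mine.

```shell
#!/bin/sh
# Hypothetical sample input: two made-up lines in a common Apache log layout.
sample_log() {
  cat <<'EOF'
192.0.2.10 - - [10/Oct/2008:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326
198.51.100.7 - - [10/Oct/2008:13:55:40 -0700] "POST /comment HTTP/1.1" 403 199
EOF
}

# By default awk splits each input line on runs of whitespace;
# $1 is the first field, which here is the client IP address.
first_field() {
  awk '{ print $1 }'
}

sample_log | first_field
```

Run as-is, this prints the two addresses, one per line. One caveat if you try this against a live tail -f: tools in the middle of a pipe may buffer their output, so matches can show up late. GNU grep has --line-buffered for this, and xargs -n 1 runs the command once per address instead of waiting to collect a batch. Whether you need either depends on your versions of the tools.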
To demonstrate how effective these "little languages" are at solving this kind of task, I went ahead and wrote the same tool in Perl and C.
Using DSLs (awk, grep, bash): 1 line, 20 words, 156 bytes
Perl solution: 20 lines, 62 words, 566 bytes
C solution: 367 lines, 926 words, 9718 bytes
DSLs clearly won the day here.