Smarter Site Monitor in Bash

For the last few months I’ve been using my own site monitor. It’s been great and terrible all at the same time. It’s great because it has uncovered many issues that Jetpack Monitor missed, thanks to my strict requirement that sites end in a proper </html> tag and a higher sensitivity to SSL errors. It’s terrible because the notifications are really dumb. I literally receive an email for every failed check 😂. That means any site with a prolonged issue sends me an email at every check. As a result I was only running the site monitor once per hour, which isn’t as often as I would like.

Building smarter notifications with JSON and PHP

In order to add some smarts, the site monitor needs memory. I could have added a database, but I really wanted something super simple. I decided to stick with a single text-based file, monitor.json, which, combined with a PHP script, operates as a crude database.

PHP can read in any JSON data using json_decode(), work with it as PHP objects and arrays, make changes to the data, then save it back to JSON using json_encode(). I’m using this workflow to read in my monitor.json, track and count the sites which are offline, then save back to monitor.json. Here is an example of what monitor.json looks like when there are ongoing issues.

[
    {
        "url": "https:\/\/my-site-1.tld\/",
        "http_code": "404",
        "html_valid": "true",
        "check_count": 18,
        "notify_count": 2,
        "created_at": "1552750236",
        "updated_at": "1552762984"
    },
    {
        "url": "https:\/\/my-site-2.tld\/",
        "http_code": "500",
        "html_valid": "true",
        "check_count": 1,
        "notify_count": 1,
        "created_at": "1552762984",
        "updated_at": "1552762984"
    }
]
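
A minimal sketch of that read, update and save cycle might look like the following. The helper name record_failures(), the shape of the $failed array and the choice to drop recovered sites from the file are assumptions made for illustration and may differ from the actual script.

```php
<?php
// Hypothetical helper: merge this run's failed checks into the stored records.
// $previous is the decoded monitor.json, $failed holds the checks that failed just now.
function record_failures(array $previous, array $failed, int $now): array
{
    // Index the stored records by URL so ongoing issues can be found quickly.
    $by_url = [];
    foreach ($previous as $record) {
        $by_url[$record['url']] = $record;
    }

    $current = [];
    foreach ($failed as $check) {
        $old = $by_url[$check['url']] ?? null;
        $current[] = [
            'url'          => $check['url'],
            'http_code'    => $check['http_code'],
            'html_valid'   => $check['html_valid'],
            // Ongoing issues keep counting up, new issues start at 1.
            'check_count'  => ($old['check_count'] ?? 0) + 1,
            'notify_count' => $old['notify_count'] ?? 0,
            // created_at is only set the first time a site fails.
            'created_at'   => $old['created_at'] ?? (string) $now,
            'updated_at'   => (string) $now,
        ];
    }

    // Assumption: sites missing from $failed have recovered and drop out of the file.
    return $current;
}

// Read the previous state (an empty list if the file does not exist yet).
$file     = __DIR__ . '/monitor.json';
$previous = is_file($file) ? json_decode(file_get_contents($file), true) : [];

// Example input: one failed check from the current run.
$failed = [
    ['url' => 'https://my-site-1.tld/', 'http_code' => '404', 'html_valid' => 'true'],
];

// Update the records and save them back as JSON.
$current = record_failures($previous ?? [], $failed, time());
file_put_contents($file, json_encode($current, JSON_PRETTY_PRINT));
```

Passing true as the second argument to json_decode() returns associative arrays rather than objects, which keeps the merge step short. By default json_encode() escapes forward slashes, which is why the URLs in monitor.json appear as https:\/\/.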

The created_at and notify_count fields allow for more selective notifications. I have tuned my code to send a notification at the following marks: initially, at 1 hour, at 4 hours, at 24 hours and finally once the site has been restored. Failed checks between those marks are ignored.
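
One way to express that schedule, as a rough sketch: compare how many marks have passed since created_at with how many notifications have already gone out. The names NOTIFY_MARKS, marks_reached() and should_notify() are hypothetical, and the restored notification (sent when a site disappears from the list of failures) is left out to keep it short.

```php
<?php
// Notification marks in seconds since the issue began: initially, 1 hour, 4 hours, 24 hours.
const NOTIFY_MARKS = [0, 3600, 14400, 86400];

// Count how many marks have been passed after $elapsed seconds.
function marks_reached(int $elapsed): int
{
    $reached = 0;
    foreach (NOTIFY_MARKS as $mark) {
        if ($elapsed >= $mark) {
            $reached++;
        }
    }
    return $reached;
}

// Notify only when a mark has been passed that has not been notified about yet.
function should_notify(array $record, int $now): bool
{
    $elapsed = $now - (int) $record['created_at'];
    return marks_reached($elapsed) > $record['notify_count'];
}

// The first record from the monitor.json example: two notifications already sent.
$record = ['created_at' => '1552750236', 'notify_count' => 2];

var_dump(should_notify($record, 1552762984)); // about 3.5 hours in: bool(false)
var_dump(should_notify($record, 1552764636)); // exactly 4 hours in: bool(true)
```

After sending, the script would bump notify_count (for example to the value returned by marks_reached()) so the same mark does not trigger a second email.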

With these improvements I am now safely checking my customers’ sites every 10 minutes and receiving more relevant notifications 👍. I may eventually bump it to even shorter intervals.

If you’re interested in the code then check out these files.

CaptainCore is still under heavy development 👨‍💻. You can follow my monthly updates over at https://captaincore.io.