Extracting Stats from a Fathom Instance

Back in October of 2018 Fathom, the open source web analytics project, added support to track multiple websites per instance. Since then I’ve been collecting site stats for all of my WordPress customers with intent to reduce my reliance on Jetpack Stats. Rather than wait for an official Fathom API I decided to see how far I could get with some crude SQL and SSH scripts in extracting stats. Here is what I came up with.

My goal was to extract yearly pageviews per site.

Currently I’m pulling this data is from Jetpack via WordPress.com’s API. Having this information is necessary considering I charge web hosting plans based on pageviews. After some trial and error with Fathom I was able to get similar stat info using the following SQL statement. I’m not entirely sure why the time zone conversion was necessary but without it the pageviews were slightly off.

SELECT sum(pageviews) as pageviews,
	   sum(visitors) as visitors,
	   CONVERT_TZ(ts,'+00:00','-05:00') as tsOffset,
	   month(ts),
	   year(ts)
FROM site_stats
WHERE ts <= "2019-01-01 04:00:00" AND site_id = 1
GROUP BY month(tsOffset), year(tsOffset)
ORDER BY tsOffset DESC
LIMIT 12
Manual SQL query grouped by month for individual site
Same stats shown in Fathom

Armed with a useable SQL query I began working on an extraction method.

I currently run my Fathom instance on a cheap Digital Ocean droplet. Rather then put extra complicated code and load on that server I settle on a daily database export/import locally. This would be sufficient for now but I’ll most likely replace once a proper API is available.

The following steps are performed locally in order to sync down a copy of the live Fathom instance. To begin create a .env file with the following info.

FATHOM_DATABASE_USER="root"
FATHOM_DATABASE_PASSWORD="local_root_mysql_password"
FATHOM_DATABASE_NAME="fathom"

Next create an empty database for Fathom.

source ~/.env
mysql -u $FATHOM_DATABASE_USER --password=$FATHOM_DATABASE_PASSWORD
create database fathom;

The entire database backup can be quite large. In order to keep the export lightweight I decided to only export 2 tables site_stats and sites. Create the following fathom-db-backup.sh.

#!/bin/bash

# Pull in Fathom db credentials
source ~/.env

cd ~/backup

# Database backup
mysqldump -u $FATHOM_DATABASE_USER --password=$FATHOM_DATABASE_PASSWORD $FATHOM_DATABASE_NAME site_stats sites > fathom_mysql.sql

The script sync-fathom-db.sh runs the remote backup and import locally is shown here. Create the following sync-fathom-db.sh and change variables as needed.

#!/bin/bash

# Variables
username=username
address=fathom-server-url.tld
port=22
remote_directory=backup
local_directory=~/Tmp

# Generate fresh fathom database backup on remote
ssh -p $port $username@$address "bash -s" < fathom-db-backup.sh

# Pull down backup
scp -o StrictHostKeyChecking=no -P $port $username@$address:${remote_directory}/fathom_mysql.sql ${local_directory}/fathom_mysql.sql

# Pull in local Fathom db credentials
source ~/.env

# Import database locally
mysql -u $FATHOM_DATABASE_USER --password=$FATHOM_DATABASE_PASSWORD $FATHOM_DATABASE_NAME < ${local_directory}/fathom_mysql.sql

Finally, schedule this to repeat daily by adding the following to crontab.

30 00 * * * /path/to/scripts/sync-fathom-db.sh

PHP script to talk to local copy of Fathom database

Up to now, all that’s been shown is how to sync a production Fathom database to a local copy via SSH scripts. Now that that’s configured and running automatically we can fetch stats from the local database using a PHP script.

Before even building the script I imagined that it could work something like php fetch-fathom-stats.php <domain> from the command line. What I actually end up writing is quite crude however it’s gets the job down.

#! /usr/bin/env php
<?php

#
#   Fetch yearly average stats from Fathom database
#
#   `php fetch-fathom-stats.php <domain>`
#

parse_str( implode( '&', array_slice( $argv, 1 ) ), $_GET );

# Defaults

if (isset($argv[1])) { $domain = $argv[1]; }

if ( $domain ) {

    // Extracts db info from local .env file
    $file = $_SERVER['HOME'] . '/.env';
    if ( file_exists( $file ) ) {

        $file = file_get_contents( $file );
        // Matches config keys and values
        $pattern = '/(.+)=\"(.+)\"/';
        preg_match_all( $pattern, $file, $matches );

    }
    $FATHOM_DATABASE_NAME     = $matches[2][ array_search( 'FATHOM_DATABASE_NAME', $matches[1] ) ];
    $FATHOM_DATABASE_USER     = $matches[2][ array_search( 'FATHOM_DATABASE_USER', $matches[1] ) ];
    $FATHOM_DATABASE_PASSWORD = $matches[2][ array_search( 'FATHOM_DATABASE_PASSWORD', $matches[1] ) ];

    try {
        $pdo = new PDO("mysql:host=127.0.0.1;dbname=${FATHOM_DATABASE_NAME}", $FATHOM_DATABASE_USER, $FATHOM_DATABASE_PASSWORD );
    } catch ( PDOException $e ) {
        die( $e->getMessage() );
    }

    $statement = $pdo->prepare("select id from sites where name = ?");
    $statement->execute([$domain]);
    $site = $statement->fetch();

    // Bail if site not found
    if ( ! $site ) {
        return true;
    }
    
    $site_id = $site["id"];

    // Prepare timestamp for query ending at the beginning of current month
    $timestamp = date('Y') . "-" . date('m') . '-01 04:00:00';

    $sql = "SELECT sum(pageviews) as pageviews,
    sum(visitors) as visitors,
    CONVERT_TZ(ts,'+00:00','-05:00') as tsOffset,
    month(ts),
    year(ts)
    FROM site_stats
    WHERE ts <= '$timestamp' AND site_id = $site_id
    GROUP BY month(tsOffset), year(tsOffset)
    ORDER BY tsOffset DESC
    LIMIT 12";

    $statement = $pdo->prepare($sql);

    $statement->execute();

    $stats = $statement->fetchAll();

    // Bail if stats not found
    if ( ! $stats ) {
        return true;
    }

    $month_count = count($stats);
    $total_pageviews = array_sum(array_column($stats,'pageviews'));
    $monthly_average_pageviews = round($total_pageviews / $month_count);
    $yearly_average_pageviews = $monthly_average_pageviews * 12;


    // Return yearly average pageviews
    if ($yearly_average_pageviews > 0 ) {
        echo $yearly_average_pageviews;
    } else {
        echo '0';
    }

}