From Google Analytics to Matomo Part 3

In Part 1, we talked about why we switched from Google Analytics to Matomo. In Part 2, we discussed how we designed the architecture. Finally, here in Part 3, we will look at the Matomo-specific changes necessary to support our architecture.

First, we modified the Dockerfile so that we could run our own commands at container startup. classdojo_entrypoint.sh runs first, but the process that the container ultimately runs is still the long-running apache2-foreground:

# The matomo version here must exactly match the version in the matomo_plugin_download.sh script
FROM matomo:4.2.1
ADD classdojo_entrypoint.sh /classdojo_entrypoint.sh
ADD ./tmp/SecurityInfo /var/www/html/plugins/SecurityInfo
ADD ./tmp/QueuedTracking /var/www/html/plugins/QueuedTracking
ADD ./tmp/dbip-city-lite-2021-03.mmdb /var/www/html/misc/DBIP-City.mmdb
RUN chmod +x /classdojo_entrypoint.sh
ENTRYPOINT ["/classdojo_entrypoint.sh"]
CMD ["apache2-foreground"]

Next, we wrote a script to download plugins and geolocation data, to bake into the Docker image:

#!/bin/sh
set -e

MATOMO_VERSION="4.0.2"

rm -rf ./tmp
mkdir ./tmp/
cd ./tmp/

# This script downloads and unarchives plugins. These plugins must be activated in the running docker container
# to function, which happens in matomo_plugin_activate.sh
curl -f https://plugins.matomo.org/api/2.0/plugins/QueuedTracking/download/${MATOMO_VERSION} --output QueuedTracking.zip
unzip QueuedTracking.zip -d .
rm QueuedTracking.zip
curl -f https://plugins.matomo.org/api/2.0/plugins/SecurityInfo/download/${MATOMO_VERSION} --output SecurityInfo.zip
unzip SecurityInfo.zip -d .
rm SecurityInfo.zip

curl -f https://download.db-ip.com/free/dbip-city-lite-2021-03.mmdb.gz --output dbip-city-lite-2021-03.mmdb.gz
gunzip dbip-city-lite-2021-03.mmdb.gz

cd ..
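
The Dockerfile comment says the image version and the version in the download script must match exactly. A check along the following lines could catch drift before an image is built. This is only a sketch: the function names are hypothetical, and it assumes the version appears on a `FROM matomo:<tag>` line and a `MATOMO_VERSION="<value>"` line, as in the listings above.

```shell
#!/bin/sh
# Sketch: verify that the Matomo version in the Dockerfile matches the download script.
# Function names are hypothetical; the file names follow the post.

# Print the tag from the first "FROM matomo:<tag>" line of a Dockerfile.
dockerfile_version() {
    sed -n 's/^FROM matomo:\([^[:space:]]*\).*$/\1/p' "$1" | head -n 1
}

# Print the value of the first MATOMO_VERSION="<value>" assignment in a script.
script_version() {
    sed -n 's/^MATOMO_VERSION="\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

# Succeed only when both versions are found and identical.
versions_match() {
    image_version=$(dockerfile_version "$1")
    plugin_version=$(script_version "$2")
    [ -n "$image_version" ] && [ "$image_version" = "$plugin_version" ]
}

# Example use, only when both files are present in the current directory.
if [ -f Dockerfile ] && [ -f matomo_plugin_download.sh ]; then
    if versions_match Dockerfile matomo_plugin_download.sh; then
        echo "Matomo versions match"
    else
        echo "Matomo version mismatch between Dockerfile and download script" >&2
    fi
fi
```

Because `versions_match` returns a non-zero status on a mismatch, it could also be called at the top of a build step to stop the build early.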

Then we wrote the entrypoint file itself. Because it replaces the image's original entrypoint, it first has to copy Matomo out of /usr/src/matomo and fix some permissions. After that, it activates the plugins that we want to include:

#!/bin/sh
set -e

if [ ! -e matomo.php ]; then
    tar cf - --one-file-system -C /usr/src/matomo . | tar xf -
    chown -R www-data:www-data .
fi

mkdir -p /var/www/html/tmp/cache/tracker/
mkdir -p /var/www/html/tmp/assets
mkdir -p /var/www/html/tmp/templates_c
chown -R www-data:www-data /var/www/html
find /var/www/html/tmp/assets -type f -exec chmod 644 {} \;
find /var/www/html/tmp/assets -type d -exec chmod 755 {} \;
find /var/www/html/tmp/cache -type f -exec chmod 644 {} \;
find /var/www/html/tmp/cache -type d -exec chmod 755 {} \;
find /var/www/html/tmp/templates_c -type f -exec chmod 644 {} \;
find /var/www/html/tmp/templates_c -type d -exec chmod 755 {} \;

# activate matomo plugins that were downloaded and added to the image
/var/www/html/console plugin:activate SecurityInfo
/var/www/html/console plugin:activate QueuedTracking

exec "$@"
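
The final `exec "$@"` is what hands control to the Docker CMD: Docker passes the CMD (here `apache2-foreground`) to the entrypoint as arguments, and `exec` replaces the shell with that command. The following is a small stand-alone illustration of the pattern rather than part of our image. `entrypoint_demo` is a hypothetical name, and `echo` stands in for the real command.

```shell
#!/bin/sh
# Sketch of the "setup, then exec the command" entrypoint pattern.
# entrypoint_demo is a hypothetical stand-in for an entrypoint script.
entrypoint_demo() {
    # Run in a subshell so that exec replaces the subshell, not the calling shell.
    (
        echo "setup done" >&2   # placeholder for the real setup steps
        exec "$@"               # replace this process with the given command
    )
}

# The arguments play the role of the Docker CMD.
entrypoint_demo echo apache2-foreground
```

Running this prints `setup done` on standard error and `apache2-foreground` on standard output. The exit status is that of the command that was exec'd, which is why the container's lifetime follows Apache rather than the setup script.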

We tie it together with a Makefile to build and publish these Docker images:

build-img:
	sh ./matomo_plugin_download.sh
	docker build . -t classdojo/matomo
	rm -rf ./tmp

push-img:
	docker tag classdojo/matomo:latest xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:latest
	docker tag classdojo/matomo:latest xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:${BUILD_STRING}
	aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin xxx.dkr.ecr.us-east-1.amazonaws.com
	docker push xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:latest
	docker push xxx.dkr.ecr.us-east-1.amazonaws.com/classdojo/matomo:${BUILD_STRING}

Inside our Nomad job specifications, we inject a customized config.ini.php file for Matomo. It is a copy of the original Matomo config.ini.php file, but with some important changes:

[General]
proxy_client_headers[] = "HTTP_X_FORWARDED_FOR"
force_ssl = 1
enable_auto_update = 0
multi_server_environment = 1
browser_archiving_disabled_enforce = 1

The proxy_client_headers and force_ssl settings are part of our SSL setup. Setting enable_auto_update to 0 prevents containers from updating separately, so that we can coordinate updates across all containers. The multi_server_environment setting prevents plugin installation from the UI and disables UI changes that write to the config.ini.php file. Finally, browser_archiving_disabled_enforce ensures that the archiving job is the only thing that can run archiving, and that archiving won't happen on demand.
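
Since these settings matter for every container, it can be useful to check a rendered config file before deploying it. The following sketch does that with plain `sed` and `grep`. Both function names are hypothetical, and the check is deliberately simple: it ignores spaces around `=`, expects keys to start at the beginning of the line, and does not verify which `[section]` a key belongs to.

```shell
#!/bin/sh
# Sketch: confirm that a Matomo config file contains expected "key = value" lines.
# Both helpers are hypothetical and ignore [section] headers.

# Succeed if the file has a line equal to key=value, ignoring spaces around "=".
ini_has_setting() {
    ini_file=$1
    ini_key=$2
    ini_value=$3
    # Normalize "key = value" to "key=value", then require an exact whole-line match.
    sed 's/[[:space:]]*=[[:space:]]*/=/' "$ini_file" | grep -Fqx -e "$ini_key=$ini_value"
}

# Check the scalar settings discussed in the post; report each one that is missing.
check_matomo_config() {
    config_file=$1
    missing=0
    while IFS='=' read -r key value; do
        if ! ini_has_setting "$config_file" "$key" "$value"; then
            echo "missing: $key = $value" >&2
            missing=1
        fi
    done <<EOF
force_ssl=1
enable_auto_update=0
multi_server_environment=1
browser_archiving_disabled_enforce=1
EOF
    [ "$missing" -eq 0 ]
}
```

For example, `check_matomo_config config.ini.php` returns a non-zero status and lists the missing settings if any of the four values is absent or different.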

For our non-frontend ingestion containers, we also set:

; Maintenance mode disables the admin interface, but still allows tracking
maintenance_mode = 1

Another major change is that the Docker command for the queue processor is changed to:

command = "/bin/sh"
args = ["-c", "while true; do /var/www/html/console queuedtracking:process; done"]

This allows the job to run in a loop, continuously processing the items in the queue.
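
One possible variation on this loop, which is not what we run, is to pause briefly after a failed run so that a persistent error does not spin the CPU. The sketch below shows the idea. `run_repeatedly`, its run limit, and the pause length are all illustrative assumptions; the run limit exists mainly so the function can be exercised outside a container.

```shell
#!/bin/sh
# Sketch: rerun a command repeatedly, pausing only after a failed run.
# run_repeatedly and its parameters are hypothetical; the job in the post
# loops forever with no pause.
run_repeatedly() {
    max_runs=$1        # 0 means loop forever, like the original `while true`
    pause_seconds=$2   # how long to sleep after a non-zero exit status
    shift 2
    runs=0
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        if ! "$@"; then
            echo "command failed; retrying in ${pause_seconds}s" >&2
            sleep "$pause_seconds"
        fi
        runs=$((runs + 1))
    done
}
```

With this helper, the equivalent of the job above would be `run_repeatedly 0 5 /var/www/html/console queuedtracking:process`, where the 5-second pause is an arbitrary choice.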

Similarly, the archive job is changed to:

command = "/var/www/html/console"
args = ["core:archive"]

This runs the archiving job directly. The admin and ingestion containers all use the default Docker command and arguments.

That’s the end of our current journey from Google Analytics to Matomo. There’s more work we have to do around production monitoring and making upgrades easier, but we’re very happy with the performance of Matomo at our scale, and its ability to grow with ClassDojo.

Dominick Bellizzi

ClassDojo's CTO, Dom enjoys refactoring, continuous delivery, and long walks on the beach with potential engineering candidates. He previously co-founded Wikispaces.
