Google Analytics is the de facto standard for web analytics. It provides many advanced features such as conversion tracking and real-time user counts. However, it's designed to track user activity across the web and can be considered an invasion of privacy. Furthermore, its use requires adding a blob of proprietary JavaScript to your website.
I would argue that for most website owners, the only metrics that are truly valuable are the number of users, the pages they visited and where they came from. It just so happens that the information in the NGINX access log can be harnessed to generate up-to-date activity dashboards. GoAccess is a log file parser that can accomplish this task. With a little configuration, it can provide great analytics with no user tracking and no added JavaScript.
This article will explore how GoAccess can be used with NGINX on Ubuntu. The gist of it should be the same no matter the web server or Linux distribution used.
Installing GoAccess
On Ubuntu and Debian, GoAccess can be installed through the default repositories. Simply install the tool using apt.
sudo apt install goaccess
Once installed, the goaccess command should be available.
jamie@jmh:~$ goaccess --version
GoAccess - 1.5.5.
For more details visit: https://goaccess.io/
Copyright (C) 2009-2022 by Gerardo Orellana
Build configure arguments:
--enable-utf8
--enable-geoip=mmdb
--with-openssl
Alternatively, you can download the latest version from a tar archive. Simply follow the installation instructions.
Generating a dashboard
By default, NGINX writes the access log for all sites to /var/log/nginx/access.log. If you've changed this path, you can simply use that instead.
To generate an HTML dashboard using the log file, use the following command:
goaccess -c /var/log/nginx/access.log -o dashboard.html --log-format=COMBINED
This creates a nice-looking dashboard, dashboard.html in this case, that can be viewed in a web browser, just like the GoAccess demo.
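To get a feel for the raw data GoAccess works from, here is a minimal sketch that counts requests per page in a COMBINED-format log using awk. The helper name top_pages and the sample log lines are made up for illustration; GoAccess does far more than this, but the principle is the same.

```shell
#!/bin/sh
# Count requests per path in a COMBINED-format access log.
# When a line is split on whitespace, field 7 is the request path.
top_pages() {
    awk '{ count[$7]++ } END { for (p in count) print count[p], p }' "$1" | sort -rn
}

# Sample data (made up for illustration), written to a temporary file.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
203.0.113.5 - - [10/Oct/2022:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"
203.0.113.9 - - [10/Oct/2022:13:56:01 +0000] "GET /about HTTP/1.1" 200 1180 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2022:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/7.81.0"
EOF

# Print the pages, most requested first.
top_pages "$sample_log"
rm -f "$sample_log"
```

With the sample data, /index.html is listed first with two requests, followed by /about with one. The referrer and user agent sit in the quoted fields at the end of each line, which is where GoAccess gets its "where they came from" panels.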
You can tweak the generated panels using the GoAccess configuration file located at /etc/goaccess/goaccess.conf by changing the enable-panel and ignore-panel directives.
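As a rough illustration, the fragment below shows what such directives might look like in goaccess.conf. The choice of panels is only an assumption about what a small site may want; check the comments in your own configuration file for the panel names it supports and for which ones are already enabled or ignored.

```
# Show where visitors came from.
enable-panel REFERRERS

# Hide panels that are rarely useful on a small site (an assumption).
ignore-panel KEYPHRASES
ignore-panel REMOTE_USER
```

Each directive takes a single panel name, so it is repeated once per panel.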
Creating a password-protected page
Generating a file locally on the server isn't particularly useful by itself, so let's create a password-protected subdirectory on our site to view it.
Creating the web root
Let's create a root directory for our dashboard and change the owner to the web server user and group.
sudo mkdir /var/www/goaccess
sudo chown www-data:www-data /var/www/goaccess
Generating a .htpasswd file
In order to add a password to our directory, we'll need to generate an Apache .htpasswd file with our credentials.
You will need the apache2-utils package. It can be installed using the following command:
sudo apt install apache2-utils
Let's create the file in the NGINX directory under /etc to keep it safe. Replace [username] with the name you want to log in with. You will be prompted for a password when running this command.
sudo htpasswd -c /etc/nginx/.htpasswd-goaccess [username]
sudo chown www-data:www-data /etc/nginx/.htpasswd-goaccess
This has created a password file at /etc/nginx/.htpasswd-goaccess.
Updating our host configuration
Update your site configuration file stored in /etc/nginx/sites-available with the following location block.
location ^~ /goaccess {
    alias /var/www/goaccess;
    index index.html;
    auth_basic "Login";
    auth_basic_user_file /etc/nginx/.htpasswd-goaccess;
}
This will allow us to navigate to https://website/goaccess to view the contents of the /var/www/goaccess/ directory. It's protected by basic HTTP authentication using the .htpasswd file we generated.
Automating the generation
At the moment, there is no dashboard at this location. Let's automate the generation with a cron job to get up-to-date analytics.
We should generate the file as the www-data user, the default web server user on Ubuntu, since it can read the logs and write to the web root.
sudo crontab -e -u www-data
Add the following line to generate the file every 10 minutes. Change the cron expression as required.
*/10 * * * * goaccess -c /var/log/nginx/access.log -o /var/www/goaccess/index.html --log-format=COMBINED
Now, we can access our analytics from the goaccess path on our web root. Keep in mind that the larger the log file, the longer it will take to process. For most sites, this should not be an issue as GoAccess is very fast.
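If you prefer the cron entry to call a script rather than goaccess directly, the following is a minimal sketch of such a wrapper. The function name generate_dashboard and the temporary file naming are assumptions made for illustration. The sketch checks that the log is readable and writes the report to a temporary file first, so a visitor never loads a half-written page.

```shell
#!/bin/sh
# Hypothetical wrapper around the goaccess command used in the cron entry.
# Usage: generate_dashboard LOG_FILE OUTPUT_HTML
generate_dashboard() {
    log_file=$1
    output=$2
    # GoAccess picks the report format from the file extension,
    # so the temporary file must also end in .html.
    tmp_output="${output%.html}.tmp.html"

    # Fail early if the current user cannot read the log.
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi

    # Generate into the temporary file, then rename it into place.
    if goaccess "$log_file" -o "$tmp_output" --log-format=COMBINED; then
        mv "$tmp_output" "$output"
    else
        rm -f "$tmp_output"
        return 2
    fi
}
```

Saved as a script that ends with a call such as generate_dashboard /var/log/nginx/access.log /var/www/goaccess/index.html, it can replace the goaccess command in the crontab line. Here the log file is passed as a positional argument, which is the form shown in the GoAccess documentation.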
Increasing the depth
On Ubuntu, the access logs are rotated daily. This doesn't provide a lot of depth for analytics. By configuring logrotate, we can generate dashboards with a longer history.
By default, the logrotate configuration at /etc/logrotate.d/nginx looks like this:
/var/log/nginx/*.log {
    daily # replace with weekly or monthly
    missingok
    rotate 2
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
    sharedscripts
    prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
            run-parts /etc/logrotate.d/httpd-prerotate; \
        fi \
    endscript
    postrotate
        invoke-rc.d nginx rotate >/dev/null 2>&1
    endscript
}
We can replace daily with monthly to generate a month's worth of history. Keep in mind that, in this case, the change affects all the NGINX log files. Depending on the size of these files, you may want to add another block just for the access logs.
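A separate block for the access log could look something like the sketch below. The retention of 12 rotations is an assumption, and the prerotate hook is left out for brevity. Note that logrotate rejects a file that matches two blocks as a duplicate log entry, so the original wildcard has to be narrowed to the remaining logs.

```
# Access log: rotate monthly (keeping 12 rotations is an assumption).
/var/log/nginx/access.log {
    monthly
    missingok
    rotate 12
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
    postrotate
        invoke-rc.d nginx rotate >/dev/null 2>&1
    endscript
}

# Error log: keeps the daily schedule.
/var/log/nginx/error.log {
    daily
    missingok
    rotate 2
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
    postrotate
        invoke-rc.d nginx rotate >/dev/null 2>&1
    endscript
}
```

If your sites write to additional log files in /var/log/nginx, list them explicitly in whichever block fits them best.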
Keeping previous months of analytics
With logrotate, we can archive previous months' analytics by making a copy of the latest HTML file. First, enable indexing on our GoAccess directory via the NGINX site config so that the files can be listed from the root.
location ^~ /goaccess {
    ...
    autoindex on;
    ...
}
Next, modify the command called by cron to generate a file called latest.html instead of index.html. This will allow us to list the files at the root instead of serving the index file.
*/10 * * * * goaccess -c /var/log/nginx/access.log -o /var/www/goaccess/latest.html --log-format=COMBINED
In our logrotate config, we can simply invoke a rotation of our GoAccess dashboard in the postrotate script. The resulting file will contain the month and the year in the name.
/var/log/nginx/*.log {
    ...
    postrotate
        invoke-rc.d nginx rotate >/dev/null 2>&1
        cp /var/www/goaccess/latest.html /var/www/goaccess/$(date +"%m-%Y" -d "1 day ago").html
    endscript
}
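The "1 day ago" in the date expression matters: a monthly rotation normally runs at the start of a new month, so yesterday's date gives the month that has just ended. The sketch below shows how the file name resolves for a given rotation day. The helper archive_name is hypothetical and assumes GNU date, which is what Ubuntu ships.

```shell
#!/bin/sh
# Print the archive file name the postrotate script would produce
# if logrotate ran on the given day (format: YYYY-MM-DD).
# Assumes GNU date, which supports the -d option.
archive_name() {
    rotation_day=$1
    echo "$(date -d "$rotation_day 1 day ago" +"%m-%Y").html"
}

archive_name 2022-11-01   # rotation on 1 November archives October 2022
archive_name 2023-01-01   # rotation on 1 January archives December 2022
```

If logrotate happens to run later in the month, for example after the server was switched off on the first, the archive is named after the current month instead.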
Conclusion
GoAccess is a great tool to harness the analytics data we already have. It requires no invasive JavaScript on the client side and does not slow the website down. It cannot compete with the feature richness of, let's say, Google Analytics, but it provides more than enough information for most websites without any user tracking.