Wednesday, December 15, 2021

Install DHIS2 on Ubuntu 20.04 LTS

Install a DHIS2 production instance on Ubuntu 20.04 LTS

This document is based on the DHIS2 installation guide on dhis2.org, adapted to my own style and resources. You can send your questions to me or post them in the DHIS2 Community.

Server specifications

DHIS2 is a database-intensive application and requires an appropriate amount of RAM, number of CPU cores and a fast disk. These recommendations should be considered rules of thumb and not exact measures. DHIS2 scales linearly with the amount of RAM and the number of CPU cores, so the more you can afford, the better the application will perform.

RAM: At least 1 GB memory per 1 million captured data records per month or per 1000 concurrent users. At least 4 GB for a small instance, 12 GB for a medium instance.

CPU cores: 4 CPU cores for a small instance, 8 CPU cores for a medium or large instance.

Disk: Ideally use an SSD. Otherwise use a 7200 rpm disk. Minimum read speed is 150 Mb/s, 200 Mb/s is good, 350 Mb/s or better is ideal. In terms of disk space, at least 60 GB is recommended, but will depend entirely on the amount of data which is contained in the data value tables. Analytics tables require a significant amount of disk space. Plan ahead and ensure that your server can be upgraded with more disk space as it becomes needed.

In this writeup I use 4 CPU cores, 16 GB RAM and 250 GB of disk space for a small instance.
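As a rough illustration of the RAM rule of thumb above, the sketch below takes the larger of the two estimates (records per month and concurrent users) and never goes below the 4 GB floor for a small instance. The function name estimate_ram_gb and the decision to take the maximum of the two estimates are my own assumptions, not something the DHIS2 guide prescribes.

```shell
#!/bin/sh
# Rough RAM estimate in GB (hypothetical helper, not part of DHIS2).
# $1 = captured data records per month, $2 = concurrent users
estimate_ram_gb() {
  records=$1
  users=$2
  # 1 GB per started million records, 1 GB per started thousand users
  by_records=$(( (records + 999999) / 1000000 ))
  by_users=$(( (users + 999) / 1000 ))
  ram=4   # floor for a small instance
  if [ "$by_records" -gt "$ram" ]; then ram=$by_records; fi
  if [ "$by_users" -gt "$ram" ]; then ram=$by_users; fi
  echo "$ram"
}

# Example: 6 million records per month and 2000 concurrent users
estimate_ram_gb 6000000 2000
```

Treat the result as a starting point only; analytics load and PostgreSQL tuning affect the real requirement.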

Software requirements

An operating system for which a Java JDK or JRE version 8 or 11 exists. Linux is recommended.

Java JDK. OpenJDK is recommended.

For DHIS 2 version 2.35 and later, JDK 11 is recommended and JDK 8 or later is required.

For DHIS 2 versions older than 2.35, JDK 8 is required.

PostgreSQL database version 9.6 or later. A later PostgreSQL version such as version 13 is recommended.

PostGIS database extension version 2.2 or later.

Tomcat servlet container version 8.5.50 or later, or other Servlet API 3.1 compliant servlet containers.

Server setup

Creating a user to run DHIS2

You should create a dedicated user for running DHIS2. You should not run the DHIS2 server as a privileged user such as root.

Create a new user called dhis by invoking:

sudo useradd -d /home/dhis -m dhis -s /bin/false

Then to set the password for your account invoke:

sudo passwd dhis

Make sure you set a strong password with at least 15 random characters.
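If you do not have a password manager at hand, a random password can be generated on the server itself. This is a minimal sketch; the function name gen_password and the default length of 20 are my own choices.

```shell
#!/bin/sh
# Print a random alphanumeric password (hypothetical helper).
# $1 = desired length, default 20. Intended for lengths up to about 64.
gen_password() {
  length=${1:-20}
  # 1024 random bytes leave roughly 250 alphanumeric characters after
  # filtering, which is far more than the lengths this helper is meant for.
  head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-$length"
}

gen_password 20
```

Copy the printed value and enter it at the passwd prompt.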

Creating the configuration directory

Start by creating a suitable directory for the DHIS2 configuration files. This directory will also be used for apps, files and log files. An example directory could be:

sudo mkdir /home/dhis/config

sudo chown dhis:dhis /home/dhis/config

DHIS2 will look for an environment variable called DHIS2_HOME to locate the DHIS2 configuration directory. This directory will be referred to as DHIS2_HOME in this installation guide. We will define the environment variable in a later step in the installation process.

Setting server time zone and locale

It may be necessary to reconfigure the time zone of the server to match the time zone of the location which the DHIS2 server will be covering. If you are using a virtual private server, the default time zone may not correspond to the time zone of your DHIS2 location. This matters a great deal for PostgreSQL. You can reconfigure the time zone by invoking the command below and following the instructions.

sudo dpkg-reconfigure tzdata

Select your region and city, for example 'Asia' and then 'Dhaka'.



PostgreSQL is sensitive to locales so you might have to install your locale first. To check existing locales and install new ones (e.g. English, US):

locale -a

If the locale you need is missing, generate it (here English, US) with:

sudo locale-gen en_US.UTF-8
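To avoid generating a locale that is already installed, the check can be scripted. The sketch below reads a locale -a style list on standard input and compares names case-insensitively, ignoring the hyphen, because locale -a prints en_US.utf8 rather than en_US.UTF-8. The function name has_locale is my own.

```shell
#!/bin/sh
# Succeeds if the locale named in $1 appears in the list read from stdin.
# (Hypothetical helper; feed it the output of `locale -a`.)
has_locale() {
  wanted=$(printf '%s\n' "$1" | tr 'A-Z' 'a-z' | sed 's/-//g')
  tr 'A-Z' 'a-z' | sed 's/-//g' | grep -xF "$wanted" > /dev/null
}

# Demonstration on a fixed list
printf 'C\nC.UTF-8\nen_US.utf8\nPOSIX\n' | has_locale en_US.UTF-8 && echo "en_US.UTF-8 is available"

# On the server:
#   locale -a | has_locale en_US.UTF-8 || sudo locale-gen en_US.UTF-8
```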


PostgreSQL installation

Ubuntu 20.04 comes with Postgres 12 in its default repositories. Since we want version 13, we can directly use the PostgreSQL project’s official APT repository. This repository contains binaries for Ubuntu 20.04, and also includes packages for the extensions that you need.

Let’s set up the repository like this (note that “focal” is the code name for Ubuntu 20.04):

sudo nano /etc/apt/sources.list.d/pgdg.list

Add the following line:

deb http://apt.postgresql.org/pub/repos/apt/ focal-pgdg main
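If you prefer not to edit the file by hand, the same line can be produced from the release code name. This is only a sketch; the function name pgdg_repo_line is my own, and the tee command in the comment writes to the same file as the nano step above.

```shell
#!/bin/sh
# Print the PGDG repository line for a given Ubuntu code name.
# (Hypothetical helper.)
pgdg_repo_line() {
  printf 'deb http://apt.postgresql.org/pub/repos/apt/ %s-pgdg main\n' "$1"
}

pgdg_repo_line focal

# On the server, using the code name reported by the system:
#   pgdg_repo_line "$(lsb_release -cs)" | sudo tee /etc/apt/sources.list.d/pgdg.list
```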

Get the signing key and import it


wget https://www.postgresql.org/media/keys/ACCC4CF8.asc

sudo apt-key add ACCC4CF8.asc


Fetch the metadata from the new repo

sudo apt-get update

We can now install the PostgreSQL server and other command-line tools using:

sudo apt-get install -y postgresql-13 postgresql-13-postgis-3

Create a non-privileged user called dhis by invoking:

sudo -u postgres createuser -SDRP dhis

Enter a secure password at the prompt. Create a database by invoking:

sudo -u postgres createdb -O dhis dhis2

You now have a PostgreSQL user called dhis and a database called dhis2. Because both commands were run through sudo -u postgres, there is no separate postgres session to exit.

The PostGIS extension is needed for several GIS/mapping features to work. DHIS 2 will attempt to install the PostGIS extension during startup. If the DHIS 2 database user does not have permission to create extensions you can create it from the console using the postgres user with the following commands:

sudo -u postgres psql -c "create extension postgis;" dhis2

Because the statement is passed with -c, psql runs it and exits on its own, so there is no console to leave.

PostgreSQL performance tuning

Tuning PostgreSQL is necessary to achieve a high-performing system but is optional in terms of getting DHIS2 to run. PostgreSQL is configured and tuned through the postgresql.conf file which can be edited like this:

sudo nano /etc/postgresql/13/main/postgresql.conf

and set the following properties:

max_connections = 100

Determines maximum number of connections which PostgreSQL will allow.

shared_buffers = 3200MB

Determines how much memory should be allocated exclusively for PostgreSQL caching. This setting controls the size of the kernel shared memory which should be reserved for PostgreSQL. Should be set to around 40% of total memory dedicated for PostgreSQL.

work_mem = 20MB

Determines the amount of memory used for internal sort and hash operations. This setting is per connection, per query so a lot of memory may be consumed if raising this too high. Setting this value correctly is essential for DHIS2 aggregation performance.

maintenance_work_mem = 512MB

Determines the amount of memory PostgreSQL can use for maintenance operations such as creating indexes, running vacuum, adding foreign keys. Increasing this value might improve performance of index creation during the analytics generation processes.

effective_cache_size = 8000MB

An estimate of how much memory is available for disk caching by the operating system (not an allocation) and is used by PostgreSQL to determine whether a query plan will fit into memory or not. Setting it to a higher value than what is really available will result in poor performance. This value should be inclusive of the shared_buffers setting. PostgreSQL has two layers of caching: The first layer uses the kernel shared memory and is controlled by the shared_buffers setting. PostgreSQL delegates the second layer to the operating system disk cache and the size of available memory can be given with the effective_cache_size setting.

checkpoint_completion_target = 0.8

Sets the target for checkpoint completion as a fraction of the total time between checkpoints. A higher value spreads checkpoint writes over a longer period, which smooths out disk I/O on write-heavy systems.

synchronous_commit = off

Specifies whether transaction commits will wait for WAL records to be written to the disk before returning to the client or not. Setting this to off will improve performance considerably. It also implies that there is a slight delay between the transaction being reported successful to the client and it actually being safe, but the database state cannot be corrupted, and this is a good alternative for performance-intensive and write-heavy systems like DHIS2.

wal_writer_delay = 10000ms

Specifies the delay between WAL write operations. Setting this to a high value will improve performance on write-heavy systems since potentially many write operations can be executed within a single flush to disk.

random_page_cost = 1.1

SSD only. Sets the query planner's estimate of the cost of a non-sequentially-fetched disk page. A low value will cause the system to prefer index scans over sequential scans. A low value makes sense for databases running on SSDs or being heavily cached in memory. The default value is 4.0 which is reasonable for traditional disks.

max_locks_per_transaction = 96

Specifies the average number of object locks allocated for each transaction. This is set mainly to allow upgrade routines which touch a large number of tables to complete.

Restart PostgreSQL by invoking the following command:

sudo /etc/init.d/postgresql restart
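The arithmetic behind two of the settings above can be sketched as follows. The first function applies the 40% guideline for shared_buffers; the second multiplies max_connections by work_mem to show how much memory is consumed if every connection runs one sort or hash operation at the same time. The function names and the 8000 MB figure for memory dedicated to PostgreSQL are my assumptions; 8000 MB was chosen because it reproduces the 3200MB value used above.

```shell
#!/bin/sh
# shared_buffers guideline: about 40% of the memory dedicated to PostgreSQL.
# $1 = memory dedicated to PostgreSQL in MB
shared_buffers_mb() {
  echo $(( $1 * 40 / 100 ))
}

# Memory used if every connection runs one work_mem-sized operation at once.
# $1 = max_connections, $2 = work_mem in MB
work_mem_total_mb() {
  echo $(( $1 * $2 ))
}

shared_buffers_mb 8000     # prints 3200
work_mem_total_mb 100 20   # prints 2000
```

Since work_mem applies per operation rather than per connection, a complex query can use several multiples of it, so the second figure is not a hard upper limit.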

Java installation

The recommended Java JDK for DHIS 2 is OpenJDK 11. OpenJDK is licensed under the GPL license and can be run free of charge. You can install it and then verify the installed version with the following commands:

sudo apt-get install openjdk-11-jdk

java -version
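When scripting the installation it can help to check the major Java version automatically. The sketch below extracts the major version from a version string, handling both the old 1.8 style and the newer 11 style numbering. The function name java_major is my own; the awk command in the comment pulls the quoted version string out of the java -version output.

```shell
#!/bin/sh
# Print the major Java version for a version string such as
# "1.8.0_312" (prints 8) or "11.0.13" (prints 11). Hypothetical helper.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

java_major 11.0.13
java_major 1.8.0_312

# On the server:
#   version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
#   java_major "$version"
```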

DHIS2 configuration

The database connection information is provided to DHIS2 through a configuration file called dhis.conf. Create this file and save it in the DHIS2_HOME directory. As an example this location could be:

/home/dhis/config/dhis.conf

A configuration file for PostgreSQL corresponding to the above setup has these properties:

# JDBC driver class

connection.driver_class = org.postgresql.Driver

# Database connection URL

connection.url = jdbc:postgresql://localhost:5432/dhis2

# Database username

connection.username = dhis

# Database password

connection.password = xxxx

# ----------------------------------------------------------------------

# Server

# Enable secure settings if deployed on HTTPS, default 'off', can be 'on'

server.https = on

# Server base URL

server.base.url = https://myserver.com/dhis2

It is strongly recommended to enable the server.https setting and deploy DHIS 2 over an encrypted HTTPS connection. This setting will enable e.g. secure cookies. HTTPS deployment is required when this setting is enabled.

The server.base.url setting refers to the URL at which the system is accessed by end users over the network.

Note that the configuration file supports environment variables. This means that you can set certain properties as environment variables and have them resolved, e.g. like this where DB_PASSWD is the name of the environment variable:

connection.password = ${DB_PASSWD}

Note that this file contains the password for your DHIS2 database in clear text so it needs to be protected from unauthorized access. To do this, invoke the following command which ensures only the dhis user is allowed to read it:

chmod 600 dhis.conf
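The permissions can be verified with a small check like the one below. It is only a sketch: the function name conf_is_private is my own, and stat -c is the GNU coreutils form found on Ubuntu. Note that mode 600 only protects the file if it is also owned by the dhis user.

```shell
#!/bin/sh
# Succeeds if the file in $1 has mode 600 (read/write for the owner only).
# (Hypothetical helper; relies on GNU stat.)
conf_is_private() {
  [ "$(stat -c '%a' "$1")" = "600" ]
}

# Demonstration on a temporary file
tmp=$(mktemp)
chmod 600 "$tmp"
conf_is_private "$tmp" && echo "permissions ok"
rm -f "$tmp"

# On the server:
#   conf_is_private /home/dhis/config/dhis.conf || echo "dhis.conf is too open"
```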

Tomcat and DHIS2 installation

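To install the Tomcat servlet container we will use the tomcat9 package by invoking:

sudo apt-get install tomcat9

This package installs Tomcat 9 as a system service. The remaining steps refer to a Tomcat instance directory called tomcat-dhis, located at /home/dhis/tomcat-dhis (the CATALINA_BASE used in the startup script later in this guide).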

Next edit the file tomcat-dhis/bin/setenv.sh and add the lines below. The first line sets the location of your Java Runtime Environment, the second dedicates memory to Tomcat, the third sets the location where DHIS2 will search for the dhis.conf configuration file, and the fourth appends a flag that disables Log4j 2 message lookups. The fourth line must extend JAVA_OPTS rather than assign it again, otherwise the memory settings from the second line are lost. Please check that the path to the Java binaries is correct, as it might vary from system to system, e.g. with Java 8 on AMD64 systems you might see /java-8-openjdk-amd64. Adjust this to your environment:

export JAVA_HOME='/usr/lib/jvm/java-11-openjdk-amd64/'

export JAVA_OPTS='-Xmx7500m -Xms4000m'

export DHIS2_HOME='/home/dhis/config'

export JAVA_OPTS="$JAVA_OPTS -Dlog4j2.formatMsgNoLookups=true"
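If you are unsure which value to use for JAVA_HOME, it can be derived from the location of the java binary: JAVA_HOME is the directory two levels above bin/java. The function name java_home_from_bin is my own, and the example path is the one assumed in the setenv.sh lines above.

```shell
#!/bin/sh
# Print the JAVA_HOME directory for a given path to the java binary.
# (Hypothetical helper: strips the trailing /bin/java.)
java_home_from_bin() {
  dirname "$(dirname "$1")"
}

java_home_from_bin /usr/lib/jvm/java-11-openjdk-amd64/bin/java

# On the server, resolving the symlinks behind the java command:
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
```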

The Tomcat configuration file is located in tomcat-dhis/conf/server.xml. The element which defines the connection to DHIS is the Connector element with port 8080. You can change the port number in the Connector element to a desired port if necessary. The relaxedQueryChars attribute is necessary to allow certain characters in URLs used by the DHIS2 front-end.


<Connector port="8080" protocol="HTTP/1.1"

  connectionTimeout="20000"

  redirectPort="8443"

  relaxedQueryChars="[]" />

The next step is to download the DHIS2 WAR file and place it into the webapps directory of Tomcat. You can download DHIS2 WAR files from the following location:


https://releases.dhis2.org/

Alternatively, for patch releases, the folder structure is based on the patch release ID in a subfolder under the main release. E.g. you can download the DHIS2 version 2.33.1 WAR release like this (replace 2.33 with your preferred version, and 2.33.1 with your preferred patch, if necessary):


wget https://releases.dhis2.org/2.33/2.33.1/dhis.war
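The URL pattern described above can be sketched as a small helper that builds the download address from a patch release ID. The function name dhis2_war_url is my own, and the pattern is inferred only from the example above, so check that the URL exists on releases.dhis2.org before relying on it for other versions.

```shell
#!/bin/sh
# Build the WAR download URL for a patch release ID such as 2.33.1.
# (Hypothetical helper; the main release is taken as the first two parts.)
dhis2_war_url() {
  patch=$1
  release=$(echo "$patch" | cut -d. -f1,2)
  echo "https://releases.dhis2.org/$release/$patch/dhis.war"
}

dhis2_war_url 2.33.1

# On the server:
#   wget "$(dhis2_war_url 2.33.1)"
```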

Move the WAR file into the Tomcat webapps directory. We want to call the WAR file ROOT.war in order to make it available at localhost directly without a context path:

mv dhis.war tomcat-dhis/webapps/ROOT.war

DHIS2 should never be run as a privileged user. After you have modified the setenv.sh file, modify the startup script to check and verify that the script has not been invoked as root.

#!/bin/sh

set -e


if [ "$(id -u)" -eq "0" ]; then

  echo "This script must NOT be run as root" 1>&2

  exit 1

fi

export CATALINA_BASE="/home/dhis/tomcat-dhis"

/usr/share/tomcat9/bin/startup.sh

echo "Tomcat started"

Running DHIS2

DHIS2 can now be started by invoking:

sudo -u dhis tomcat-dhis/bin/startup.sh

Important

The DHIS2 server should never be run as root or other privileged user.

DHIS2 can be stopped by invoking:

sudo -u dhis tomcat-dhis/bin/shutdown.sh

To monitor the behavior of Tomcat the log is the primary source of information. The log can be viewed with the following command:

tail -f tomcat-dhis/logs/catalina.out

Assuming that the WAR file is called ROOT.war, you can now access your DHIS2 instance at the following URL:

http://localhost:8080

File store configuration

DHIS2 is capable of capturing and storing files. By default, files will be stored on the local file system of the server which runs DHIS2 in a files directory under the DHIS2_HOME external directory location.

You can also configure DHIS2 to store files on cloud-based storage providers. AWS S3 is the only supported provider currently. To enable cloud-based storage you must define the following additional properties in your dhis.conf file:

# File store provider. Currently 'filesystem' and 'aws-s3' are supported.

filestore.provider = aws-s3

# Directory in external directory on local file system and bucket on AWS S3

filestore.container = files

# The following configuration is applicable to cloud storage only (AWS S3)

# Datacenter location. Optional but recommended for performance reasons.

filestore.location = eu-west-1


# Username / Access key on AWS S3

filestore.identity = xxxx

# Password / Secret key on AWS S3 (sensitive)

filestore.secret = xxxx

This configuration is an example and should be changed to fit your needs. If you plan to use the default local file system storage, you can omit it entirely. If you want to use an external provider, the last block of properties needs to be defined and the provider property must be set to a supported provider (currently only AWS S3).


Note

If you’ve configured cloud storage in dhis.conf, all files you upload or the files the system generates will use cloud storage.

For a production system the initial setup of the file store should be carefully considered as moving files across storage providers while keeping the integrity of the database references could be complex. Keep in mind that the contents of the file store might contain both sensitive and integral information and protecting access to the folder as well as making sure a backup plan is in place is recommended on a production implementation.

Note

AWS S3 is the only supported provider but more providers are likely to be added in the future, such as Google Cloud Store and Azure Blob Storage. Let us know if you have a use case for additional providers.

Google service account configuration

DHIS2 can connect to various Google service APIs. For instance, the DHIS2 GIS component can utilize the Google Earth Engine API to load map layers. In order to provide API access tokens you must set up a Google service account and create a private key:


Create a Google service account. Please consult the Google identity platform documentation.


Visit the Google cloud console and go to API Manager > Credentials > Create credentials > Service account key. Select your service account and JSON as key type and click Create.


Rename the JSON key to dhis-google-auth.json.


After downloading the key file, put the dhis-google-auth.json file in the DHIS2_HOME directory (the same location as the dhis.conf file). As an example this location could be:


/home/dhis/config/dhis-google-auth.json