Production Use - Good Practice
(1Q20)
Based on our experience of developing and using eXist-db in production environments, we have learned a number of lessons. This Good Practice guide attempts to cover some of the considerations that should be taken into account when deploying eXist-db for use in a production environment.
The concepts laid out within this document should not be considered absolute or accepted wholesale - they should rather be used as suggestions to guide users in their eXist-db deployments.
The Server
-
Ensure that your server is up-to-date and patched with any necessary security fixes.
-
eXist-db is written in Java, so for performance and security reasons, please ensure that you run the latest Java JDK release that is compatible with your version of eXist-db. The latest version can always be found at: http://java.sun.com and the recommended major version for a given eXist-db release can be found at: https://bintray.com/existdb/releases/exist#read
-
For dockerized production systems we strongly recommend using the semantic-version tag of an official release, e.g. 5.3.0, instead of release or latest.
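As a minimal sketch of the tag recommendation above, a deployment script could refuse floating tags before it pulls an image. The function names and the repository name existdb/existdb are assumptions of this example, not something this guide prescribes.

```shell
#!/bin/sh
# Succeeds only for tags of the form MAJOR.MINOR.PATCH (e.g. 5.3.0).
# Suffixed tags such as 6.0.0-RC1 are deliberately rejected by this sketch.
is_pinned_tag() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Print an image reference for a pinned tag; fail for floating tags
# such as "release" or "latest".
# The repository name "existdb/existdb" is an assumption of this example.
image_ref() {
    tag=$1
    if is_pinned_tag "$tag"; then
        printf 'existdb/existdb:%s\n' "$tag"
    else
        printf 'refusing floating tag: %s\n' "$tag" >&2
        return 1
    fi
}
```

It could then be used as: docker pull "$(image_ref 5.3.0)"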
Install from Source or Release?
Most users will install an officially released version of eXist-db on their production systems and usually this is perfectly fine. However, for production systems there can be advantages to installing eXist-db from source code.
eXist-db may be built from source code (see Building eXist-db) and installed on a production system in one of two ways:
- Via Local Build Machine (preferred)
-
You check out the eXist-db code for a release branch (or trunk) from our GitHub repository to a local machine. From there you build a distribution, which you test and then deploy to your live server.
- Directly from GitHub
-
In this case you don't use a local machine for building an eXist-db distribution. Instead, you check out the code from a release branch (or the develop branch) directly from our GitHub repository on your server and build it in situ.
Some advantages of installing eXist-db from source code are:
- Patches
-
If patches or fixes are developed that are relevant to your specific needs, you can update your code and re-build eXist.
- Features
-
If you are following trunk and new features are developed which you are interested in, you can update your code and re-build to take advantage of these.
Warning:
eXist's code trunk is generally not recommended for production use! Although it should always compile and be relatively stable, it may also contain as yet unrecognised regressions or result in unexpected behaviour.
Upgrading
If you are upgrading the version of eXist-db that you use in your production system, please always follow these two points:
- Backup
-
Always make sure you have a full database backup before you upgrade.
- Test
-
Always test your application in the new version of eXist-db in a development environment to ensure expected behaviour before you upgrade your production system.
Configuring eXist
There are four main things to consider here:
- Security - Permissions
-
Ensure that eXist-db is installed in a secure manner.
- Security - Attack Surface
-
Configure eXist-db so it provides only what you need for your application.
- Resources
-
Configure your system and eXist-db so that eXist-db has access to enough resources and the system starts and stops eXist-db in a clean manner.
- Performance
-
Configure your system and eXist-db so that you get the maximum performance possible.
Security - Permissions
The permission requirements for development and deployment servers are rather different. Here we explain what to look out for starting with the default configuration.
eXist-db Permissions
eXist-db ships with fairly relaxed permissions to facilitate rapid application development. However, for production systems these should be constrained:
- admin account
-
The password of the admin account is blank by default! Ensure that you set a strong password.
- default permissions
-
The default permissions for creating resources and collections in eXist-db are 0666 for resources and 0777 for collections. The bits set in the user's umask are removed from these defaults to give the permissions assigned to new resources and collections. By default each new user has the umask 022, which leads to new resources having the mode 0644 and collections 0755. You may wish to modify the umask of some of your users to further restrict the default permissions when they create resources and collections.
- /db permissions
-
The default permissions for /db are 0755, which should be sufficient in most cases. If you need to change this, you can do so as follows (here for 0775):
sm:chmod(xs:anyURI("/db"), "rwxrwxr-x")
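The umask arithmetic described above can be checked with a few lines of shell. This is only an illustration of the calculation, not an eXist-db tool, and the function name is made up for the example. Note that the umask is applied as a bit mask, which is not the same as numeric subtraction.

```shell
#!/bin/sh
# Apply a umask to a default mode: every bit set in the umask is cleared.
# Both arguments are octal strings with a leading 0, e.g. 0666 and 0022.
effective_mode() {
    printf '%04o\n' $(( $1 & ~$2 ))
}

effective_mode 0666 0022   # new resource:   prints 0644
effective_mode 0777 0022   # new collection: prints 0755
effective_mode 0666 0027   # stricter umask: prints 0640
```

With a umask of 0077, for instance, a new resource gets 0600, whereas naive subtraction of 077 from 0666 would give a different and meaningless value.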
Operating System Permissions
eXist-db should be deployed and configured to run whilst following the security best practices of the operating system on which it is deployed.
Typically we would recommend creating an exist user account and an exist user group with no login privileges (no shell and no password), and changing the ownership of the eXist-db installation to that user and group. Then run eXist-db using those credentials.
An example of this on OpenSolaris might be:
$ pfexec groupadd exist
$ pfexec useradd -c "eXist Native XML Database" -d /home/exist -g exist -m exist
$ pfexec chown -R exist:exist /opt/eXist
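On a typical Linux system the equivalent steps might look like the sketch below. It assumes the shadow-utils versions of groupadd and useradd and a nologin shell at /usr/sbin/nologin, which varies between distributions. The function names and the APPLY switch are made up for this example. By default the commands are only printed so that they can be reviewed first.

```shell
#!/bin/sh
# Print each command instead of running it, unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Create a system group and user without a login shell, then hand the
# installation directory (first argument) over to them.
# /usr/sbin/nologin is an assumption; some systems use /sbin/nologin.
create_exist_account() {
    install_dir=$1
    run groupadd -r exist
    run useradd -r -g exist -d /home/exist -m -s /usr/sbin/nologin \
        -c "eXist Native XML Database" exist
    run chown -R exist:exist "$install_dir"
}
```

Calling create_exist_account /opt/eXist prints the three commands. Running it as root with APPLY=1 in the environment executes them.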
Security - Attack Surface
For any live application it is best practice to keep the attack surface of the application as small as possible. There are three aspects to this:
-
Limiting means of arbitrary code execution.
-
Reducing the application itself to the absolute essentials.
-
Limiting access routes to the application.
eXist-db is no exception and should be configured for your production systems so that it provides only what you need and no more. For example, the majority of applications will be unlikely to require the WebDAV or SOAP Admin features for operation in a live environment. These and other services can be disabled easily.
Means for anonymous users to execute arbitrary code require special attention. There are three means of code execution in eXist, which make sense during development, but should be reconsidered for production systems:
- Java binding
-
The ability to execute Java code from inside the XQuery processor is disabled by default in conf.xml:
<xquery enable-java-binding="no"/>
It is strongly recommended to keep it disabled on production systems.
- XML external entities
-
In order to ensure a secure environment, the external-general-entities, external-parameter-entities, and secure-processing feature flags should be set in conf.xml:
<parser>
    <xml>
        <features>
            <feature name="http://xml.org/sax/features/external-general-entities" value="false"/>
            <feature name="http://xml.org/sax/features/external-parameter-entities" value="false"/>
            <feature name="http://javax.xml.XMLConstants/feature/secure-processing" value="true"/>
        </features>
    </xml>
</parser>
- REST server
-
We recommend preventing eXist's REST server from directly receiving web requests, and using URL Rewriting only to control code execution. The REST server feature is enabled by default in $EXIST_HOME/etc/webapp/WEB-INF/web.xml. Setting the param-value of the hidden parameter to true allows you to filter requests via your own XQuery controller.
<init-param>
    <param-name>hidden</param-name>
    <param-value>true</param-value>
</init-param>
The following options allow a more fine-grained control over the REST server's functionality:
- XQuery submissions
-
We recommend restricting the REST server's ability to execute XQuery code to authenticated users, by modifying $EXIST_HOME/etc/webapp/WEB-INF/web.xml:
<init-param>
    <param-name>xquery-submission</param-name>
    <param-value>authenticated</param-value>
</init-param>
- XUpdate statements
-
In addition, we recommend restricting the REST server's ability to execute XUpdate statements. Simply modify $EXIST_HOME/etc/webapp/WEB-INF/web.xml by changing the param-value from enabled to disabled:
<init-param>
    <param-name>xupdate-submission</param-name>
    <param-value>disabled</param-value>
</init-param>
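The conf.xml settings from the list above can be spot-checked before a deployment with a small script. The following is a rough sketch rather than a real XML validation: it assumes each setting is written on a single line with its attributes in the order shown above, and it cannot tell whether a matching line is commented out. The function name is made up for this example.

```shell
#!/bin/sh
# Report whether a conf.xml (first argument) contains the Java binding and
# XML parser settings recommended above. Plain substring matching only.
check_conf_hardening() {
    conf=$1
    status=0
    for pattern in \
        'enable-java-binding="no"' \
        'external-general-entities" value="false"' \
        'external-parameter-entities" value="false"' \
        'secure-processing" value="true"'
    do
        if ! grep -Fq -- "$pattern" "$conf"; then
            printf 'missing or different: %s\n' "$pattern" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, check_conf_hardening "$EXIST_HOME/etc/conf.xml" returns a non-zero status and lists each setting it could not find.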
Further considerations for a live environment:
- Services
-
eXist-db provides services for accessing the database. You should reduce these to the absolute minimum you need for your production application. This is done via etc/webapp/WEB-INF/web.xml. You should look at each configured service, servlet or filter and ask yourself: do we use this? Most production environments are unlikely to need WebDAV.
- Extension Modules
-
eXist-db loads several XQuery and Index extension modules by default. You should modify the <builtin-modules> section of etc/conf.xml and only load what you need for your application. If you make use of the Cache Module, you should make sure that it has either a maximumSize or expireAfterAccess bound configured. This ensures that the Cache cannot consume all memory.
Resources
You should ensure that you have enough memory and disk space in your system so that eXist-db can cope with peak demands.
- -Xmx
-
However you decide to deploy and start eXist-db, please ensure that you allocate enough maximum memory to eXist-db using the Java -Xmx setting. See bin/backup.sh and bin/startup.sh.
- cacheSize and collectionCache
-
These two settings in <db-connection> of etc/conf.xml should be adjusted appropriately based on your -Xmx setting (see above). See the tuning guide for advice on sensible values.
- Disk space
-
Please ensure that you have plenty of space for your database to grow. Unsurprisingly, running out of disk space can result in database corruption or in having to roll back the database to a known state.
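To illustrate the disk space point, a monitoring job could check the free space on the volume that holds the database. This sketch relies only on the POSIX output format of df. The function names are made up, and the path and threshold in the usage note are placeholders to adapt.

```shell
#!/bin/sh
# Print the free space, in kilobytes, of the filesystem holding the given path.
# Relies on the POSIX output format of df (column 4 is "Available").
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if at least the given number of kilobytes (second argument)
# is free on the filesystem holding the path (first argument).
has_free_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

A scheduled job might then run something like: has_free_space /opt/eXist 10485760 || echo "less than 10 GB free" >&2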
Performance
Keeping the eXist-db application, data and journal on separate disks, connected to different I/O channels, can have a positive impact on performance. The location of the data files and journals can be changed in etc/conf.xml.
In addition, to gain the absolute best performance with eXist-db 5.0.0 or newer, it may be beneficial to disable Lock Event Tracking in the Lock Table. The Lock Table can be disabled in the etc/conf.xml configuration file.
Backups
This is fundamental: Make sure you have them, that they are up-to-date and that a restore is possible!
eXist-db provides three different mechanisms for performing backups:
-
Full database backup.
-
Differential database backup.
-
Snapshot of the database data files.
Each of these backup mechanisms can be scheduled, either with eXist-db or with your operating system scheduler. See the backup article and conf.xml for further details.
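Whichever mechanism is scheduled, it is worth checking automatically that backups are actually being produced. The sketch below only tests whether a backup directory contains at least one file modified within a given number of days. It says nothing about whether a restore would succeed, which still has to be tested separately. The function name is made up, and the directory in the usage note is a hypothetical location.

```shell
#!/bin/sh
# Succeed if the directory (first argument) contains at least one regular
# file modified less than the given number of days ago (second argument).
backup_is_fresh() {
    dir=$1
    max_age_days=$2
    find "$dir" -type f -mtime "-$max_age_days" -print 2>/dev/null | grep -q .
}
```

For example: backup_is_fresh /var/backups/exist 1 || echo "no backup written in the last day" >&2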
Web Deployments
eXist-db, like any Web Application Server, should not be directly exposed to the Web. Instead, we strongly recommend proxying eXist-db through a Web Server such as Nginx or Apache HTTPD. See here for further details.
If you proxy eXist-db through a Web Server, you can also configure your firewall to allow external access to the Web Server only. If done correctly, this means that web users will not be able to access any eXist-db services directly, except your application, which is proxied into the Web Server's namespace.
Enable GZip Compression
eXist-db by default operates inside the Jetty Application Server. Jetty (and most other Java Application Servers) provides a mechanism for enabling dynamic GZip compression of resources. In other words: Jetty can be configured to dynamically GZip compress any resource received from the server by HTTP. Enabling dynamic GZip compression can reduce the size of transfers, and as such reduce the transfer time of resources from the server to the client, hopefully resulting in a faster experience for the end-user.
GZip Compression can be enabled in web.xml, which can be found in either $EXIST_HOME/etc/webapp/WEB-INF/web.xml for default deployments or $EXIST_HOME/etc/jetty/standalone/WEB-INF/web.xml for standalone deployments.
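Once compression has been enabled, a quick way to confirm that it is active is to request a resource with an Accept-Encoding header and inspect the response headers. The helper below is a small sketch that reads response headers on standard input. Its name and the URL in the usage note are assumptions to adjust for your deployment.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and succeed if they declare a
# gzip-encoded body. Carriage returns are stripped before matching.
is_gzip_encoded() {
    tr -d '\r' | grep -Eiq '^content-encoding:[[:space:]]*gzip[[:space:]]*$'
}
```

With curl this might be used as: curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://localhost:8080/exist/ | is_gzip_encoded && echo "gzip is active"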