Understanding Upstream vs. Downstream PostgreSQL

When you install PostgreSQL on a Raspberry Pi or on your favorite Linux distribution, the very first surprise often comes right after the installation finishes: the configuration files are not where the official documentation says they should be. Instead of finding postgresql.conf inside the data directory, you discover that Debian places it under /etc/postgresql/<version>/<cluster>/. To make sense of this, we need to step back and explore the relationship between upstream PostgreSQL and the downstream packages provided by distributions.

What “Upstream PostgreSQL” Really Means

The PostgreSQL Global Development Group – usually referred to simply as “PostgreSQL upstream” – is responsible for writing the actual database server. Upstream publishes the official source code tarballs (e.g. postgresql-17.0.tar.gz) whenever a new release comes out. These tarballs are the canonical PostgreSQL distribution. If you compile PostgreSQL yourself from these sources, you get what the developers consider the “default layout”: binaries installed under /usr/local/pgsql/bin/, the data directory initialized wherever you choose, and all configuration files (postgresql.conf, pg_hba.conf, pg_ident.conf) living inside that data directory. This is the model you see described in most PostgreSQL books and tutorials.
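As a small illustration of that default layout, the following Python sketch derives where the three configuration files would sit for a given data directory. The function name upstream_config_paths and the example data directory are assumptions made for illustration; they are not part of PostgreSQL itself.

```python
from pathlib import PurePosixPath

# Configuration file names used by a default upstream installation.
CONFIG_FILES = ("postgresql.conf", "pg_hba.conf", "pg_ident.conf")


def upstream_config_paths(data_dir: str) -> dict[str, str]:
    """Map each config file name to its path inside the data directory."""
    base = PurePosixPath(data_dir)
    return {name: str(base / name) for name in CONFIG_FILES}


if __name__ == "__main__":
    # Hypothetical data directory, chosen by the administrator at initdb time.
    for name, path in upstream_config_paths("/usr/local/pgsql/data").items():
        print(f"{name}: {path}")
```

The point of the sketch is that the upstream layout has a single anchor: once you know the data directory, you know where every configuration file is.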

Docker as an Upstream Experience

The official Docker images on Docker Hub offer a similar experience. Strictly speaking, they are maintained by the PostgreSQL Docker Community rather than by the PostgreSQL project itself, but they stay close to the upstream defaults. The postgres:17 image, for example, is a multi-architecture build that runs on both x86_64 servers and ARM devices such as the Raspberry Pi. Inside the container, configuration lives in the data directory (/var/lib/postgresql/data), and the image's entrypoint script initializes that directory on first start and then launches the postgres server process. If you are running PostgreSQL in Docker, you are very close to an upstream experience, only wrapped into a container.

What Happens When Distributions Package PostgreSQL

Things look different when you install PostgreSQL using the package manager of your Linux distribution. Debian and Ubuntu, for instance, do not ship PostgreSQL in its raw upstream form. Instead, their package maintainers take the official source, compile it for the distribution’s architecture, and integrate it into the distribution’s conventions. This downstream packaging introduces several changes that can be surprising if you only know the upstream defaults.

Debian uses the concept of “clusters,” where each PostgreSQL instance is identified by its major version plus a cluster name (main by default) and is treated as a separate unit. The configuration files for each cluster live under /etc/postgresql/<version>/<cluster>/, while the actual data directory is placed under /var/lib/postgresql/<version>/<cluster>/. Logs usually go to /var/log/postgresql/. Systemd services follow the same idea: alongside an umbrella postgresql.service, Debian provides a templated unit that is instantiated once per cluster, such as postgresql@17-main.service. This naming convention makes it easy to run multiple PostgreSQL versions or clusters side by side, but it can be bewildering for users expecting the simpler upstream layout.
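The Debian naming scheme can be written down as a small Python sketch. The class DebianCluster and its property names are hypothetical helpers invented for this illustration; only the path and unit patterns come from the description above.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class DebianCluster:
    """A Debian-style cluster, identified by major version and cluster name."""

    version: str
    name: str = "main"  # name of the cluster Debian creates by default

    @property
    def config_dir(self) -> str:
        # Holds postgresql.conf, pg_hba.conf and pg_ident.conf.
        return str(PurePosixPath("/etc/postgresql") / self.version / self.name)

    @property
    def data_dir(self) -> str:
        return str(PurePosixPath("/var/lib/postgresql") / self.version / self.name)

    @property
    def systemd_unit(self) -> str:
        # Instance of the templated unit for this particular cluster.
        return f"postgresql@{self.version}-{self.name}.service"


if __name__ == "__main__":
    cluster = DebianCluster("17")
    print(cluster.config_dir)
    print(cluster.data_dir)
    print(cluster.systemd_unit)
```

With two values, version and cluster name, every location and the service name follow mechanically, which is what makes running several clusters side by side manageable.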

The Bigger Picture: Upstream and Downstream Roles

So the landscape looks like this: upstream delivers the source code, and the official Docker images stay close to its defaults; distributions like Debian, Ubuntu, or Red Hat build binary packages and impose their own directory structures and service management approaches. When you install from source or run the official Docker image, configuration files reside in the data directory. When you install via apt on Debian or Ubuntu, you need to look under /etc/postgresql for configuration, /var/lib/postgresql for data, and /var/log/postgresql for logs.

Why It Matters in Practice

Understanding this distinction between upstream and downstream PostgreSQL is essential if you move between environments. On a Raspberry Pi where you install with apt-get, you must edit /etc/postgresql/17/main/postgresql.conf to change the listen_addresses setting. In a Docker container, the same change is made inside /var/lib/postgresql/data/postgresql.conf. Neither approach is “wrong.” They simply reflect two different ways the PostgreSQL ecosystem distributes its software: the upstream way, which centers on the source code and is mirrored closely by the official container image, and the downstream way, where Linux distributions adapt PostgreSQL to fit into their operating system conventions.
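If you are unsure which world a machine belongs to, you can simply probe the two locations mentioned above. The following Python sketch does that; the function names, the default version "17", and the cluster name "main" are assumptions for illustration, and the candidate list covers only the two layouts discussed here.

```python
from pathlib import Path
from typing import Iterable, Optional


def candidate_conf_paths(version: str = "17", cluster: str = "main") -> list[Path]:
    """Return the postgresql.conf locations discussed in this article."""
    return [
        # Debian/Ubuntu package layout
        Path(f"/etc/postgresql/{version}/{cluster}/postgresql.conf"),
        # Data directory used inside the postgres:17 Docker image
        Path("/var/lib/postgresql/data/postgresql.conf"),
    ]


def first_existing(paths: Iterable[Path]) -> Optional[Path]:
    """Return the first path that is a readable regular file, or None."""
    for path in paths:
        try:
            if path.is_file():
                return path
        except PermissionError:
            # Data directories are often readable only by the postgres user.
            continue
    return None


if __name__ == "__main__":
    found = first_existing(candidate_conf_paths())
    print(found if found else "postgresql.conf not found in the known locations")
```

On a running server you do not need to guess at all: the SQL command SHOW config_file; reports the file the server actually loaded.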

Summary

Once you see PostgreSQL through this lens, the pieces fall into place. Upstream is the clean, canonical PostgreSQL as the developers envision it. Downstream is how distributions shape it to feel like a natural citizen of their OS. As a user, the key is to know which world you are in – so that when you go hunting for postgresql.conf, you know exactly where to look.

Thanks for your time,

-Klaus
