TL;DR

This post exists because I forgot how to start locally installed Postgres before a talk.

I don’t normally have a section like this, but since I’ll probably be the one referencing this post again in the future:

apt install postgres # or whatever for your distro
initdb --no-locale -D <data-dir>
pg_ctl -D <data-dir> -l <log-file> start

on MacOS:

brew install postgresql
brew services start postgres
psql postgres

Context

Every once in a while you forget how to use a tool and are humbled by how much of a craftsman you aren’t.

I can’t tell if it’s hard or if I just don’t like Postgres as much as I thought I did, but before a talk I had to give on some code I wrote to interact with Postgres (in WebAssembl – a discussion for another time), but before the taslk I couldn’t start a local Postgres installation.

The talk went well, without the demo I wanted to give, but I have no one to blame (not even MacOS) – even if Docker localhost networking (and the contortions that MacOS has to go through – I use lima) wasn’t working properly/as I expected, I should have known how to start postgres with the right binaries installed (thanks to brew).

My first thought after not being able to connect to the correctly-configured, running, MacOS postgres docker container was:

“Oh no problem I’ll just spin up Postgres myself right, how hard could it be? I remember I need to initdb then actually run it and the command to run the db is different than the control command right…”

I couldn’t do it in what felt like ~15 minutes of trying.

What is Postgres?

Postgres is a popular F/OSS database. It’s mascot is an elephant called Slonik, and it supports querying data with Structured Query Language (SQL), and it’s one of the most successful and useful open source projects out there.

If somehow you’ve gotten here because someone sent you a link and you don’t know what Postgres is, I’ll sum it up in a bunch of bullet points – from the basic premise of why databases are needed to what Postgres is.

If you don’t know what a computer program is, this probably isn’t the post for you.

Programs need to store data (writing to files on disk works OK)
Programs need to retrieve data (reading files on disk works OK)
Programs usually want to retrieve data in very specific pattern, and often only partial bits of data
Some data is related to other data, and it’s useful to take some chunks of data with other chunks of data.

Postgres solves the pain points above – rather than cooking up your own flat file schema with JSON for content or whatever you might write, Postgres provides a “database system” – you give Postgres your data, Postgres writes it to disk and makes it easy to get little bits of data in and out.

Yes, `docker` is the easiest way to do this

Remember when people thought Docker was bad/more complex than most alternative or a HackerNews new-and-shiny fad? Those were the days.

Assuming you have Docker installed (caveat is bigger on some systems than others), then it’s absolutely the easiest way:

docker run \
    --rm \
    -p 5432:5432 \
    -e POSTGRES_PASSWORD=postgres \
    --name pg \
    postgres:16.4-alpine

Easy peasy, you have (and are watching the output of) a Postgres database that:

Has a root username and password of postgres:postgres
Is ephemeral (i.e. it will lose all data it was storing after exiting – i.e. you hit Ctrl-C)
Will remove itself as a docker container after it stops
Is bound to port localhost:5432 (and all addresses 0.0.0.0:5432, so be careful!)
Has the container name pg

If you have psql or an application that uses a database, that’s all you need to do. Heck even if you don’t have psql installed, you can docker exec it:

docker exec -it pg psql --user postgres

That’s all it takes, to get yourself a console into the (a password isn’t required because the connection is “local” inside the container):

psql (16.4)
Type "help" for help.

postgres=#

Anyway – 99% of the time if your system has Docker, this is what you should be doing locally, in tests, etc to save yourself time and also isolate a little bit. This post isn’t about Postgres on Docker though so I’ll keep moving.

Installing Postgres locally

OK, obviously before you can use a locally installed postgres, you need to locally install Postgres.

On MacOS, this was easy thanks to brew:

brew install postgresql

Since I don’t daily-drive MacOS, let’s go with the easiest way to install on Debian-based distros – apt:

apt upgrade -y # optional
apt install postgres

This will install a bunch of binaries that you’ll need to do things (which you might not have known otherwise):

postgres
pg_* (importantly, pg_ctl, pg_dump, and some others)

Let’s start the system Postgres instance

Alright, it’s installed, time to just let ’er rip right? If you were in a container and tried to run postgres you’d see the following:

"root" execution of the PostgreSQL server is not permitted.
The server must be started under an unprivileged user ID to prevent
possible system security compromise.  See the documentation for
more information on how to properly start the server.

OK, so this is mostly becuase I’m running in a container which by default has be running as root, let’s use a regular user:

root@521dad69d81a:/# su postgres
postgres@521dad69d81a:/$

Where did the postgres user come from, you ask? Well, it got created when apt install postgres was run. If you’re worried by this, welcome to the world – make sure you trust your distro & maintainers.

Go back and read the output of initdb!

The point is, running postgres isn’t enough to start a Postgres server. So let’s try again:

postgres@521dad69d81a:/$
postgres: could not access the server configuration file "/var/lib/postgresql/data/postgresql.conf": No such file or directory

Right, so here’s a real problem – there are some pre-existing places that should exist, for the most standard system based installation.

The real problem is that ~~we haven’t read the fucking manual on how to use this technology~~ we haven’t done the basic amount of configuration, which… is required.

I won’t get into why it’s looking in /var/lib/postgresql/data or what’s supposed to be in postgresql.conf but you should read the manual for that.

Not so fast: Configuring Postgres so we can start it

OK, so what’s the minimum amount of config you need to start a local Postgres instance?

Well, what ou need to do is initialize a DB:

initdb

Yup, not pg_initdb or anything like that – apt install postgres also installed this binary, which initializes a Postgres database. It was a different time.

So here’s the output you’d see from that:

The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/postgresql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /var/lib/postgresql/data -l logfile star

So, there are a couple things that are/must be true here:

initdb initialized a database, by default at /var/lib/postgresql/data
postgres has permissions to access that system directory (it’s not all just root!)

Starting the database

Now can we run postgres ?

postgres@521dad69d81a:/$ postgres
2024-08-31 12:29:56.674 UTC [2925] LOG:  starting PostgreSQL 16.4 (Debian 16.4-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-08-31 12:29:56.674 UTC [2925] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-08-31 12:29:56.674 UTC [2925] LOG:  listening on IPv6 address "::", port 5432
2024-08-31 12:29:56.681 UTC [2925] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-31 12:29:56.690 UTC [2928] LOG:  database system was shut down at 2024-08-31 12:27:29 UTC
2024-08-31 12:29:56.697 UTC [2925] LOG:  database system is ready to accept connection

Alright, we’re in.

But wait, there’s more – the really keen readers will have noticed that this actually isn’t the suggested way to start postgres. The instructions at the end of initdb output told you another way (that doesn’t hog your shell and run forever):

pg_ctl -D /var/lib/postgresql/data -l logfile start

What this does is starts your postgres instance in the background and writes the logs to logfile (which you should change).

Attempting to connect to our instance

Alright, so how do we actually connect to this running instance? First we’ll need another terminal, and psql:

➜ docker exec -it pg16 bash
root@521dad69d81a:/# psql -U postgres
psql (16.4 (Debian 16.4-1.pgdg110+1))
Type "help" for help.

postgres=#

Why postgres ? Well, by default postgres creates a user for the user that started the process. We didn’t have to set a password (there isn’t one set) – but we can connect because same-machine connections are trusted by default:

And here’s what we get when we list the users:

postgres=# \du
                             List of roles
 Role name |                         Attributes
-----------+------------------------------------------------------------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RL

But what if we don’t want just one instance

What do you do if you want to just start an instance of Postgres but you want an ephemeral one – i.e. one that isn’t shared by everyone on your system (@ /var/lib/postgresql/data)?

Well, we’ll need to actually pass arguments instead of hopes and prayers to initdb and postgres:

mkdir /tmp/pg-data-12345
initdb --pgdata=/tmp/pg-data-12345

How did I know this? Well initdb --help is a great place to start, and the Postgres manual is an even better one.

You can imagine, we get all the same output, but with the data directory somewhere else.

This time, let’s use pg_ctl like ~~God~~ initdb intended:

pg_ctl -D /tmp/pg-data-12345 -l /tmp/pg-data-12345.log start

What actually went wrong at the talk?

Thinking back, honestly I have no idea.

MacOS is a little bit different (tm) – brew install postgres is a little different. It boils down to:

brew install postgresql
brew services start postgres
psql postgres

I remember trying to connect locally yet not being able to, but it’s all gone now. Fog of war, I guess – demos should be practiced on the machine they’re going to be running on long before a talk.

Writing it all now, it’s all so simple – either way I guess now I have a blog post and a reference.

VADOSWARE

Living in a yak shaver's paradise.

Starting locally installed Postgres instances in 2024

Categories

Table of Contents

TL;DR

Context

What is Postgres?

Yes, `docker` is the easiest way to do this

Installing Postgres locally

Let’s start the system Postgres instance

Not so fast: Configuring Postgres so we can start it

Starting the database

Attempting to connect to our instance

But what if we don’t want just one instance

What actually went wrong at the talk?

Categories

Table of Contents

TL;DR

Context

What is Postgres?

Yes, docker is the easiest way to do this

Installing Postgres locally

Let’s start the system Postgres instance

Not so fast: Configuring Postgres so we can start it

Starting the database

Attempting to connect to our instance

But what if we don’t want just one instance

What actually went wrong at the talk?

Yes, `docker` is the easiest way to do this