PostgreSQL initialization fails on Kubernetes

Hi,

I’m having trouble initializing a PostgreSQL cluster with Stolon on Kubernetes. initdb fails with the following error (full keeper log below):

initdb: error: could not change permissions of "/stolon-data/postgres/postgresql.conf": Operation not permitted

The keeper, proxy and sentinel pods (v.0.16.0-pg12) are running and ready, and stolonctl init completed successfully:

# kubectl exec -n postgres -ti pod/stolon-keeper-0 -- stolonctl --kube-namespace=postgres --cluster-name=prod --store-backend=etcdv3 --store-endpoints="http://etcd-appl-0.etcd-appl.basic-services.svc.cluster.local:2379,http://etcd-appl-1.etcd-appl.basic-services.svc.cluster.local:2379,http://etcd-appl-2.etcd-appl.basic-services.svc.cluster.local:2379" init

Here’s the DB initialization part from the keeper logs:

# kubectl logs -n postgres stolon-keeper-0 --follow
2020-11-30T13:56:33.545Z        ERROR   cmd/keeper.go:1063      db failed to initialize or resync
2020-11-30T13:56:33.585Z        INFO    cmd/keeper.go:1094      current db UID different than cluster data db UID       {"db": "", "cdDB": "2b49551d"}
2020-11-30T13:56:33.585Z        INFO    cmd/keeper.go:1101      initializing the database cluster
creating configuration files ... The files belonging to this database system will be owned by user "stolon".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /stolon-data/postgres ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
initdb: error: could not change permissions of "/stolon-data/postgres/postgresql.conf": Operation not permitted
initdb: removing data directory "/stolon-data/postgres"
2020-11-30T13:56:34.005Z        ERROR   cmd/keeper.go:1135      failed to initialize postgres database cluster  {"error": "error: exit status 1"}

Contents of /stolon-data, which is a host path mount (backed by GlusterFS at the OS level):

# ls -la /stolon-data
drwxr-xr-x.  2 stolon stolon   52 Nov 30 13:56 .
-rw-------.  1 stolon stolon   77 Nov 30 13:56 dbstate
-rw-------.  1 stolon stolon   41 Nov 30 12:18 keeperstate
-rw-r--r--.  1 stolon stolon    0 Nov 30 12:18 lock
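
Since the failing step in the log is a chmod on a freshly created file, one way to narrow this down is to try the same operation by hand on the mount. The following is a minimal sketch, not part of Stolon or PostgreSQL: check_chmod is a hypothetical helper name, and it assumes a POSIX shell is available inside the keeper container and that it is run as the same user as the keeper (stolon).

```shell
# Hypothetical helper (not part of Stolon): checks whether the current user
# can create a file in DIR and then change its permissions, which is the
# operation initdb reported as "Operation not permitted".
check_chmod() {
    dir="$1"
    probe="$dir/.chmod-probe.$$"

    # Step 1: create a probe file; return 2 if the directory is not writable.
    if ! touch "$probe" 2>/dev/null; then
        echo "cannot create file in $dir"
        return 2
    fi

    # Step 2: try to change its mode, as initdb does for postgresql.conf.
    if chmod 600 "$probe" 2>/dev/null; then
        echo "chmod ok in $dir"
        rc=0
    else
        echo "chmod failed in $dir"
        rc=1
    fi

    # Clean up the probe file in either case.
    rm -f "$probe"
    return "$rc"
}

# Example (inside the keeper pod): check_chmod /stolon-data
```

If this reports "chmod failed" for /stolon-data but "chmod ok" for a local directory such as /tmp, the problem would appear to lie with the mount rather than with initdb itself.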

I hope you have some idea of what’s happening here. I’m happy to provide more information if needed.

Best regards,
Andreas

Hi,

I was about to debug this issue further (my next steps would have been 1. disabling SELinux, 2. using the master image instead of v.0.16.0, and 3. building my own images with debug output) when I realized:

I can’t reproduce this issue anymore.
The only thing I changed since my previous post was deleting the statefulset and applying it again. The cluster now starts without any problems.
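
For anyone who wants to try the same thing, the delete-and-reapply step can be sketched roughly as follows. This is an illustration under assumptions, not the exact commands that were run: recreate_statefulset is a hypothetical helper, the statefulset name stolon-keeper is inferred from the pod name stolon-keeper-0, and the manifest path is a placeholder for whatever file the statefulset was originally applied from.

```shell
# Hypothetical helper: delete a statefulset and apply its manifest again.
# Arguments: namespace, statefulset name, path to the manifest file.
recreate_statefulset() {
    ns="$1"
    name="$2"
    manifest="$3"

    # Remove the existing statefulset (and with it, its pods).
    kubectl delete statefulset "$name" -n "$ns" || return 1

    # Re-create it from the same manifest.
    kubectl apply -n "$ns" -f "$manifest" || return 1

    # List the pods so the restart can be observed.
    kubectl get pods -n "$ns"
}

# Example (manifest path is a placeholder):
#   recreate_statefulset postgres stolon-keeper ./stolon-keeper.yaml
```

The files on the host path mount are left in place by this; in the listing below, keeperstate and lock still carry their Nov 30 timestamps.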

# ls -la /stolon-data/
drwxr-xr-x.  3 stolon stolon   68 Dec  2 14:02 .
-rw-------.  1 stolon stolon   78 Dec  2 14:02 dbstate
-rw-------.  1 stolon stolon   41 Nov 30 12:18 keeperstate
-rw-r--r--.  1 stolon stolon    0 Nov 30 12:18 lock
drwx------. 19 stolon stolon 4096 Dec  2 13:53 postgres

# ps -ef | grep [p]ostgres
stolon      35     1  0 13:53 ?        00:00:00 postgres -D /stolon-data/postgres -c unix_socket_directories=/tmp
stolon      40    35  0 13:53 ?        00:00:00 postgres: checkpointer
stolon      41    35  0 13:53 ?        00:00:00 postgres: background writer
stolon      42    35  0 13:53 ?        00:00:00 postgres: stats collector
stolon      74    35  0 13:53 ?        00:00:00 postgres: walwriter
stolon      75    35  0 13:53 ?        00:00:00 postgres: autovacuum launcher
stolon      76    35  0 13:53 ?        00:00:00 postgres: logical replication launcher
stolon     130    35  0 13:53 ?        00:00:00 postgres: walsender repluser 10.34.0.18(40610) streaming 0/6000148
stolon     444    35  0 13:54 ?        00:00:00 postgres: walsender repluser 10.32.0.33(55322) streaming 0/6000148

Best regards,
Andreas