Architecture:
- node1 -> Consul Server [192.168.100.104]
- sql1 -> Consul Agent + Postgres14 [192.168.100.101]
- sql2 -> Consul Agent + Postgres14 [192.168.100.102]
Consul v1.6.9
Debian 11
Stolon v0.17.0
My steps:
-
I installed Postgres14 on sql2 server and I created a custom database with some data.
-
I ran following commands in psql:
ALTER USER postgres WITH PASSWORD 'password'; CREATE USER repluser WITH PASSWORD 'password'; ALTER USER repluser REPLICATION;
-
Successfully initialized cluster with:
stolonctl init --cluster-name sql --store-backend consul
-
I ran sentinel:
stolon-sentinel --cluster-name sql --store-backend consul
output - click to expand
2022-03-23T15:08:26.164+0100 INFO cmd/sentinel.go:2000 sentinel uid {"uid": "0cd099f8"} 2022-03-23T15:08:26.166+0100 INFO cmd/sentinel.go:82 Trying to acquire sentinels leadership 2022-03-23T15:08:26.190+0100 INFO cmd/sentinel.go:89 sentinel leadership acquired 2022-03-23T15:08:26.209+0100 INFO cmd/sentinel.go:837 waiting for db {"db": "d27faa20", "keeper": "sql1"} 2022-03-23T15:08:31.239+0100 WARN cmd/sentinel.go:276 no keeper info available {"db": "d27faa20", "keeper": "sql1"} 2022-03-23T15:08:31.243+0100 INFO cmd/sentinel.go:837 waiting for db {"db": "d27faa20", "keeper": "sql1"} 2022-03-23T15:08:36.315+0100 WARN cmd/sentinel.go:276 no keeper info available {"db": "d27faa20", "keeper": "sql1"}
-
And keeper (postgres service stopped):
stolon-keeper --cluster-name sql \ --store-backend=consul --uid sql1 \ --data-dir /var/lib/postgresql/14/main --pg-su-username=postgres --pg-su-password=password --pg-repl-username=repluser --pg-repl-password=password --pg-listen-address=192.168.100.101 --pg-bin-path /usr/lib/postgresql/14/bin
output errors - click to expand
2022-03-23T15:16:00.443+0100 INFO cmd/keeper.go:2091 exclusive lock on data dir taken 2022-03-23T15:16:00.444+0100 INFO cmd/keeper.go:569 keeper uid {"uid": "sql1"} 2022-03-23T15:16:00.447+0100 DEBUG cmd/keeper.go:822 cd dump: (*cluster.ClusterData)(0xc000318a40)({ FormatVersion: (uint64) 1, ChangeTime: (time.Time) 2022-03-23 15:15:55.977848705 +0100 CET, Cluster: (*cluster.Cluster)(0xc000366240)({ UID: (string) (len=8) "645cc58f", Generation: (int64) 1, ChangeTime: (time.Time) 2022-03-23 15:13:34.622224551 +0100 CET, Spec: (*cluster.ClusterSpec)(0xc00038f440)({ SleepInterval: (*cluster.Duration)(<nil>), RequestTimeout: (*cluster.Duration)(<nil>), ConvergenceTimeout: (*cluster.Duration)(<nil>), InitTimeout: (*cluster.Duration)(<nil>), SyncTimeout: (*cluster.Duration)(<nil>), DBWaitReadyTimeout: (*cluster.Duration)(<nil>), FailInterval: (*cluster.Duration)(<nil>), DeadKeeperRemovalInterval: (*cluster.Duration)(<nil>), ProxyCheckInterval: (*cluster.Duration)(<nil>), ProxyTimeout: (*cluster.Duration)(<nil>), MaxStandbys: (*uint16)(<nil>), MaxStandbysPerSender: (*uint16)(<nil>), MaxStandbyLag: (*uint32)(<nil>), SynchronousReplication: (*bool)(<nil>), MinSynchronousStandbys: (*uint16)(<nil>), MaxSynchronousStandbys: (*uint16)(<nil>), AdditionalWalSenders: (*uint16)(<nil>), AdditionalMasterReplicationSlots: ([]string) <nil>, UsePgrewind: (*bool)(<nil>), InitMode: (*cluster.ClusterInitMode)(0xc00033f830)((len=3) "new"), MergePgParameters: (*bool)(<nil>), Role: (*cluster.ClusterRole)(<nil>), NewConfig: (*cluster.NewConfig)(<nil>), PITRConfig: (*cluster.PITRConfig)(<nil>), ExistingConfig: (*cluster.ExistingConfig)(<nil>), StandbyConfig: (*cluster.StandbyConfig)(<nil>), DefaultSUReplAccessMode: (*cluster.SUReplAccessMode)(<nil>), PGParameters: (cluster.PGParameters) <nil>, PGHBA: ([]string) <nil>, AutomaticPgRestart: (*bool)(<nil>) }), Status: (cluster.ClusterStatus) { CurrentGeneration: (int64) 0, Phase: (cluster.ClusterPhase) (len=12) "initializing", Master: (string) (len=8) "49c40051" } }), Keepers: (cluster.Keepers) (len=1) { (string) (len=4) "sql1": (*cluster.Keeper)(0xc0003a07e0)({ UID: (string) (len=4) "sql1", Generation: (int64) 1, ChangeTime: (time.Time) 2022-03-23 15:15:55.977864513 +0100 CET, Spec: (*cluster.KeeperSpec)(0x23eeb88)({ }), Status: (cluster.KeeperStatus) { Healthy: (bool) false, LastHealthyTime: (time.Time) 2022-03-23 15:15:50.913820346 +0100 CET, BootUUID: (string) (len=36) "2f72e54a-9402-42a5-bbdd-36dbaeb7d52b", PostgresBinaryVersion: (cluster.PostgresBinaryVersion) { Maj: (int) 14, Min: (int) 2 }, ForceFail: (bool) false, CanBeMaster: (*bool)(0xc0003a624e)(true), CanBeSynchronousReplica: (*bool)(0xc0003a624f)(true) } }) }, DBs: (cluster.DBs) (len=1) { (string) (len=8) "49c40051": (*cluster.DB)(0xc00034e1e0)({ UID: (string) (len=8) "49c40051", Generation: (int64) 1, ChangeTime: (time.Time) 2022-03-23 15:13:34.622225289 +0100 CET, Spec: (*cluster.DBSpec)(0xc000164460)({ KeeperUID: (string) (len=4) "sql1", RequestTimeout: (cluster.Duration) 10s, MaxStandbys: (uint16) 20, SynchronousReplication: (bool) false, UsePgrewind: (bool) false, AdditionalWalSenders: (uint16) 5, AdditionalReplicationSlots: ([]string) <nil>, InitMode: (cluster.DBInitMode) (len=3) "new", NewConfig: (*cluster.NewConfig)(<nil>), PITRConfig: (*cluster.PITRConfig)(<nil>), PGParameters: (cluster.PGParameters) <nil>, PGHBA: ([]string) <nil>, Role: (common.Role) (len=6) "master", FollowConfig: (*cluster.FollowConfig)(<nil>), Followers: ([]string) { }, IncludeConfig: (bool) true, SynchronousStandbys: ([]string) <nil>, ExternalSynchronousStandbys: ([]string) <nil> }), Status: (cluster.DBStatus) { Healthy: (bool) false, CurrentGeneration: (int64) 0, ListenAddress: (string) "", Port: (string) "", SystemID: (string) "", TimelineID: (uint64) 0, XLogPos: (uint64) 0, TimelinesHistory: (cluster.PostgresTimelinesHistory) <nil>, PGParameters: (cluster.PGParameters) <nil>, CurSynchronousStandbys: ([]string) <nil>, SynchronousStandbys: ([]string) <nil>, OlderWalFile: (string) "" } }) }, Proxy: (*cluster.Proxy)(0xc0003662a0)({ UID: (string) "", Generation: (int64) 0, ChangeTime: (time.Time) 0001-01-01 00:00:00 +0000 UTC, Spec: (cluster.ProxySpec) { MasterDBUID: (string) "", EnabledProxies: ([]string) <nil> }, Status: (cluster.ProxyStatus) { } }) }) 2022-03-23T15:16:00.457+0100 DEBUG postgresql/postgresql.go:625 execing cmd {"cmd": "/usr/lib/postgresql/14/bin/postgres -V"} 2022-03-23T15:16:00.459+0100 DEBUG cmd/keeper.go:1045 cd dump: (*cluster.ClusterData)(0xc00007ce40)({ FormatVersion: (uint64) 1, ChangeTime: (time.Time) 2022-03-23 15:15:55.977848705 +0100 CET, Cluster: (*cluster.Cluster)(0xc00007f6e0)({ UID: (string) (len=8) "645cc58f", Generation: (int64) 1, ChangeTime: (time.Time) 2022-03-23 15:13:34.622224551 +0100 CET, Spec: (*cluster.ClusterSpec)(0xc00003d7a0)({ SleepInterval: (*cluster.Duration)(<nil>), RequestTimeout: (*cluster.Duration)(<nil>), ConvergenceTimeout: (*cluster.Duration)(<nil>), InitTimeout: (*cluster.Duration)(<nil>), SyncTimeout: (*cluster.Duration)(<nil>), DBWaitReadyTimeout: (*cluster.Duration)(<nil>), FailInterval: (*cluster.Duration)(<nil>), DeadKeeperRemovalInterval: (*cluster.Duration)(<nil>), ProxyCheckInterval: (*cluster.Duration)(<nil>), ProxyTimeout: (*cluster.Duration)(<nil>), MaxStandbys: (*uint16)(<nil>), MaxStandbysPerSender: (*uint16)(<nil>), MaxStandbyLag: (*uint32)(<nil>), SynchronousReplication: (*bool)(<nil>), MinSynchronousStandbys: (*uint16)(<nil>), MaxSynchronousStandbys: (*uint16)(<nil>), AdditionalWalSenders: (*uint16)(<nil>), AdditionalMasterReplicationSlots: ([]string) <nil>, UsePgrewind: (*bool)(<nil>), InitMode: (*cluster.ClusterInitMode)(0xc00004b5f0)((len=3) "new"), MergePgParameters: (*bool)(<nil>), Role: (*cluster.ClusterRole)(<nil>), NewConfig: (*cluster.NewConfig)(<nil>), PITRConfig: (*cluster.PITRConfig)(<nil>), ExistingConfig: (*cluster.ExistingConfig)(<nil>), StandbyConfig: (*cluster.StandbyConfig)(<nil>), DefaultSUReplAccessMode: (*cluster.SUReplAccessMode)(<nil>), PGParameters: (cluster.PGParameters) <nil>, PGHBA: ([]string) <nil>, AutomaticPgRestart: (*bool)(<nil>) }), Status: (cluster.ClusterStatus) { CurrentGeneration: (int64) 0, Phase: (cluster.ClusterPhase) (len=12) "initializing", Master: (string) (len=8) "49c40051" } }), Keepers: (cluster.Keepers) (len=1) { (string) (len=4) "sql1": (*cluster.Keeper)(0xc0001966c0)({ UID: (string) (len=4) "sql1", Generation: (int64) 1, ChangeTime: (time.Time) 2022-03-23 15:15:55.977864513 +0100 CET, Spec: (*cluster.KeeperSpec)(0x23eeb88)({ }), Status: (cluster.KeeperStatus) { Healthy: (bool) false, LastHealthyTime: (time.Time) 2022-03-23 15:15:50.913820346 +0100 CET, BootUUID: (string) (len=36) "2f72e54a-9402-42a5-bbdd-36dbaeb7d52b", PostgresBinaryVersion: (cluster.PostgresBinaryVersion) { Maj: (int) 14, Min: (int) 2 }, ForceFail: (bool) false, CanBeMaster: (*bool)(0xc000040f4e)(true), CanBeSynchronousReplica: (*bool)(0xc000040f4f)(true) } }) }, DBs: (cluster.DBs) (len=1) { (string) (len=8) "49c40051": (*cluster.DB)(0xc0001d2000)({ UID: (string) (len=8) "49c40051", Generation: (int64) 1, ChangeTime: (time.Time) 2022-03-23 15:13:34.622225289 +0100 CET, Spec: (*cluster.DBSpec)(0xc0001d4000)({ KeeperUID: (string) (len=4) "sql1", RequestTimeout: (cluster.Duration) 10s, MaxStandbys: (uint16) 20, SynchronousReplication: (bool) false, UsePgrewind: (bool) false, AdditionalWalSenders: (uint16) 5, AdditionalReplicationSlots: ([]string) <nil>, InitMode: (cluster.DBInitMode) (len=3) "new", NewConfig: (*cluster.NewConfig)(<nil>), PITRConfig: (*cluster.PITRConfig)(<nil>), PGParameters: (cluster.PGParameters) <nil>, PGHBA: ([]string) <nil>, Role: (common.Role) (len=6) "master", FollowConfig: (*cluster.FollowConfig)(<nil>), Followers: ([]string) { }, IncludeConfig: (bool) true, SynchronousStandbys: ([]string) <nil>, ExternalSynchronousStandbys: ([]string) <nil> }), Status: (cluster.DBStatus) { Healthy: (bool) false, CurrentGeneration: (int64) 0, ListenAddress: (string) "", Port: (string) "", SystemID: (string) "", TimelineID: (uint64) 0, XLogPos: (uint64) 0, TimelinesHistory: (cluster.PostgresTimelinesHistory) <nil>, PGParameters: (cluster.PGParameters) <nil>, CurSynchronousStandbys: ([]string) <nil>, SynchronousStandbys: ([]string) <nil>, OlderWalFile: (string) "" } }) }, Proxy: (*cluster.Proxy)(0xc00007f740)({ UID: (string) "", Generation: (int64) 0, ChangeTime: (time.Time) 0001-01-01 00:00:00 +0000 UTC, Spec: (cluster.ProxySpec) { MasterDBUID: (string) "", EnabledProxies: ([]string) <nil> }, Status: (cluster.ProxyStatus) { } }) }) 2022-03-23T15:16:00.459+0100 INFO cmd/keeper.go:1094 our db boot UID is different than the cluster data one, waiting for it to be updated {"bootUUID": "c8626456-eed1-429d-81a6-492fd0c6aa03", "clusterBootUUID": "2f72e54a-9402-42a5-bbdd-36dbaeb7d52b"} 2022-03-23T15:16:00.460+0100 ERROR cmd/keeper.go:1096 failed to stop pg instance {"error": "cannot get instance state: exit status 1"} 2022-03-23T15:16:00.469+0100 DEBUG cmd/keeper.go:799 pgstate dump: (*cluster.PostgresState)(0xc00023a6e0)({ UID: (string) (len=8) "a0cb6ea7", Generation: (int64) 0, ListenAddress: (string) (len=15) "192.168.100.101", Port: (string) (len=4) "5432", Healthy: (bool) false, SystemID: (string) "", TimelineID: (uint64) 0, XLogPos: (uint64) 0, TimelinesHistory: (cluster.PostgresTimelinesHistory) <nil>, PGParameters: (common.Parameters) <nil>, SynchronousStandbys: ([]string) <nil>, OlderWalFile: (string) "" }) ^C2022-03-23T15:16:04.467+0100 DEBUG cmd/keeper.go:1894 got signal {"signal": "interrupt"} 2022-03-23T15:16:04.467+0100 DEBUG cmd/keeper.go:841 stopping stolon keeper 2022-03-23T15:16:04.468+0100 ERROR cmd/keeper.go:843 failed to stop pg instance {"error": "cannot get instance state: exit status 1"}
-
Also I tried with started Postgresql service, but without luck.
-
Configs:
- postgresql.conf:
listen_addresses = '*' password_encryption = md5
- pg_hba.conf:
local all postgres peer local all all peer host all all 127.0.0.1/32 md5 host all all 192.168.100.0/24 md5 host all all ::1/128 md5 host all all 127.0.0.1/32 scram-sha-256 host all all 192.168.100.0/24 scram-sha-256 host all all 10.0.0.0/16 scram-sha-256 host all all ::1/128 scram-sha-256 host replication all 127.0.0.1/32 md5 host replication all ::1/128 md5 host replication all 127.0.0.1/32 scram-sha-256 host replication all ::1/128 scram-sha-256 `
- postgresql.conf:
Am I doing something wrong?
Thanks for your help!