I’m in a bit of trouble … I’m experiencing this startup error on one of my clusters:
```
2020-07-06 11:24:56.299812 [keeper.go:143] W | keeper: password file /etc/secrets/stolon/password permissions 01000000777 are too open. This file should only be readable to the user executing stolon! Continuing...
2020-07-06 11:24:56.300334 [keeper.go:1159] I | keeper: id: 79ffcec1
2020-07-06 11:24:56.300362 [keeper.go:1162] I | keeper: running under kubernetes.
time="2020-07-06T11:24:56Z" level=info msg="Stopping database"
2020-07-06 11:24:56.465437 [keeper.go:421] E | keeper: error getting pgstate: error getting pg state
2020-07-06 11:24:56.499052 [keeper.go:743] I | keeper: current pg state: master
2020-07-06 11:24:56.499083 [keeper.go:768] I | keeper: our cluster requested state is master
time="2020-07-06T11:24:56Z" level=info msg="Starting database"
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2020-07-06 11:24:56.671 UTC : [1-1] user=,db= LOG: database system was interrupted while in recovery at log time 2020-05-04 01:58:36 UTC
2020-07-06 11:24:56.671 UTC : [2-1] user=,db= HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2020-07-06 11:24:56.757 UTC : [3-1] user=,db= WARNING: recovery command file "recovery.conf" specified neither primary_conninfo nor restore_command
2020-07-06 11:24:56.757 UTC : [4-1] user=,db= HINT: The database server will regularly poll the pg_xlog subdirectory to check for files placed there.
2020-07-06 11:24:56.757 UTC : [5-1] user=,db= LOG: entering standby mode
2020-07-06 11:24:56.761 UTC : [6-1] user=,db= LOG: consistent recovery state reached at 4/4F34DC20
2020-07-06 11:24:56.761 UTC : [7-1] user=,db= LOG: record with zero length at 4/4F34DC20
2020-07-06 11:24:56.762 UTC : [1-1] user=,db= LOG: database system is ready to accept read only connections
```
It seems I need to get the database back in sync somehow. I’m not that familiar with Stolon, though I do have some PostgreSQL experience.
Any pointers on how to re-sync the nodes and repair the index?