Why does Pgpool show all nodes as standby even though pg_is_in_recovery() reports correctly?

Question

I have set up one Pgpool, one primary PostgreSQL server, and two standby PostgreSQL servers, and I am seeing the following issues:

Issue 1: Pgpool cannot detect the primary and reports every node as a standby.

[root@ip-172-22-3-228 data]# sudo -u postgres psql -h 172.21.3.41 -p 5432 -x -c "show pool_nodes;"
Password:
-[ RECORD 1 ]------------
node_id    | 0
hostname   | 172.21.3.229
port       | 5432
status     | 3
lb_weight  | 0.333333
role       | standby
select_cnt | 0
-[ RECORD 2 ]------------
node_id    | 1
hostname   | 172.21.2.88
port       | 5432
status     | 2
lb_weight  | 0.333333
role       | standby
select_cnt | 0
-[ RECORD 3 ]------------
node_id    | 2
hostname   | 172.22.3.228
port       | 5432
status     | 0
lb_weight  | 0.333333
role       | standby
select_cnt | 0

Here is the output of pg_is_in_recovery() on all nodes, which correctly shows which node is the primary ('f') and which are standbys ('t'):

[root@ip-172-22-3-228 data]# sudo -u postgres psql -h 172.21.3.229 -p 5432 -x -c "select pg_is_in_recovery();"
Password:
-[ RECORD 1 ]-----+--
pg_is_in_recovery | f

[root@ip-172-22-3-228 data]# sudo -u postgres psql -h 172.21.2.88 -p 5432 -x -c "select pg_is_in_recovery();"
Password:
-[ RECORD 1 ]-----+--
pg_is_in_recovery | t

[root@ip-172-22-3-228 data]# sudo -u postgres psql -h 172.22.3.228 -p 5432 -x -c "select pg_is_in_recovery();"
Password:
-[ RECORD 1 ]-----+--
pg_is_in_recovery | t
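The disagreement in issue 1 can be made explicit with a small check. This is only a minimal sketch, not part of the original post: the two dictionaries hard-code the values shown in the `show pool_nodes` and `pg_is_in_recovery()` outputs, and the helper name `find_role_mismatches` is hypothetical.

```python
# Roles reported by Pgpool, copied from the "show pool_nodes" output.
pool_roles = {
    "172.21.3.229": "standby",
    "172.21.2.88": "standby",
    "172.22.3.228": "standby",
}

# pg_is_in_recovery() per node: False ('f') means primary, True ('t') means standby.
in_recovery = {
    "172.21.3.229": False,
    "172.21.2.88": True,
    "172.22.3.228": True,
}


def find_role_mismatches(pool_roles, in_recovery):
    """Return (host, pgpool_role, expected_role) for every disagreement."""
    mismatches = []
    for host, role in pool_roles.items():
        expected = "standby" if in_recovery[host] else "primary"
        if role != expected:
            mismatches.append((host, role, expected))
    return mismatches


mismatches = find_role_mismatches(pool_roles, in_recovery)
for host, reported, expected in mismatches:
    print(f"{host}: Pgpool reports {reported}, PostgreSQL reports {expected}")
```

With the values from this post, only 172.21.3.229 (node 0) is flagged, which is the node PostgreSQL itself reports as the primary.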

Issue 2: Pgpool creates a persistent connection to only one of the standbys, the one with status 2.

Here is the log from Pgpool:

2016-12-18 17:16:41: pid 24793: DEBUG: loading hba configuration
2016-12-18 17:16:41: pid 24793: DETAIL: loading file :"/etc/pgpool-II/pool_hba.conf" for client authentication configuration file
2016-12-18 17:16:41: pid 24793: LOG: reading status file: 0 th backend is set to down status
2016-12-18 17:16:41: pid 24793: LOG: reading status file: 2 th backend is set to down status
2016-12-18 17:16:41: pid 24793: DEBUG: pool_coninfo_size: num_init_children (20) * max_pool (10) * MAX_NUM_BACKENDS (128) * sizeof(ConnectionInfo) (136) = 3481600 bytes requested for shared memory
2016-12-18 17:16:41: pid 24793: DEBUG: ProcessInfo: num_init_children (20) * sizeof(ProcessInfo) (32) = 640 bytes requested for shared memory
2016-12-18 17:16:41: pid 24793: DEBUG: Request info are: sizeof(POOL_REQUEST_INFO) 5224 bytes requested for shared memory
2016-12-18 17:16:41: pid 24793: DEBUG: Recovery management area: sizeof(int) 4 bytes requested for shared memory
2016-12-18 17:16:41: pid 24793: LOG: Setting up socket for 0.0.0.0:5432
2016-12-18 17:16:41: pid 24793: LOG: Setting up socket for :::5432
2016-12-18 17:16:41: pid 24794: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24793: LOG: pgpool-II successfully started. version 3.5.4 (ekieboshi)
2016-12-18 17:16:41: pid 24793: LOG: find_primary_node: checking backend no 0
2016-12-18 17:16:41: pid 24793: LOG: find_primary_node: checking backend no 1
2016-12-18 17:16:41: pid 24795: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24796: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24797: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24798: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24799: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24793: DEBUG: pool_read: read 13 bytes from backend 0
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate kind = 5
2016-12-18 17:16:41: pid 24793: DEBUG: pool_write: to backend: 0 kind:p
2016-12-18 17:16:41: pid 24800: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24801: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24802: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24793: DEBUG: pool_read: read 326 bytes from backend 0
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate kind = 0
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate backend: key data received
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate backend: transaction state: I
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: extended:0 query:"SELECT pg_is_in_recovery()"
2016-12-18 17:16:41: pid 24793: DEBUG: pool_write: to backend: 0 kind:Q
2016-12-18 17:16:41: pid 24803: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24793: DEBUG: pool_read: read 75 bytes from backend 0
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: kind: 'T'
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: received ROW DESCRIPTION ('T')
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: row description: num_fileds: 1
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: kind: 'D'
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: received DATA ROW ('D')
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: kind: 'C'
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: received COMMAND COMPLETE ('C')
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: kind: 'Z'
2016-12-18 17:16:41: pid 24793: DEBUG: do_query: received READY FOR QUERY ('Z')
2016-12-18 17:16:41: pid 24793: DEBUG: pool_write: to backend: 0 kind:X
2016-12-18 17:16:41: pid 24793: DEBUG: find_primary_node: 1 node is standby
2016-12-18 17:16:41: pid 24793: LOG: find_primary_node: checking backend no 2
2016-12-18 17:16:41: pid 24793: DEBUG: find_primary_node: no primary node found
2016-12-18 17:16:41: pid 24804: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24793: DEBUG: starting health check
2016-12-18 17:16:41: pid 24793: DEBUG: health check: clearing alarm
2016-12-18 17:16:41: pid 24793: DEBUG: doing health check against database:postgres user:postgres
2016-12-18 17:16:41: pid 24793: DEBUG: Backend DB node 0 status is 3
2016-12-18 17:16:41: pid 24793: DEBUG: Backend DB node 1 status is 2
2016-12-18 17:16:41: pid 24793: DEBUG: Trying to make persistent DB connection to backend node 1 having status 2
2016-12-18 17:16:41: pid 24805: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24806: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24807: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24808: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24809: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24810: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24793: DEBUG: pool_read: read 13 bytes from backend 0
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate kind = 5
2016-12-18 17:16:41: pid 24793: DEBUG: pool_write: to backend: 0 kind:p
2016-12-18 17:16:41: pid 24811: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24812: DEBUG: initializing backend status
2016-12-18 17:16:41: pid 24793: DEBUG: pool_read: read 318 bytes from backend 0
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate kind = 0
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate backend: key data received
2016-12-18 17:16:41: pid 24793: DEBUG: authenticate backend: transaction state: I
2016-12-18 17:16:41: pid 24793: DEBUG: persistent DB connection to backend node 1 having status 2 is successful
2016-12-18 17:16:41: pid 24793: DEBUG: pool_write: to backend: 0 kind:X
2016-12-18 17:16:41: pid 24793: DEBUG: Backend DB node 2 status is 3
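One way to read this log is to group the backend ids by the kind of message that mentions them. The sketch below is an illustration only, not a confirmed diagnosis: it embeds a handful of lines copied from the log, the regular expressions are written against those exact messages from version 3.5.4, and the function name `summarize` is hypothetical.

```python
import re

# A few LOG/DEBUG lines copied verbatim from the Pgpool log in this post.
log_text = """\
2016-12-18 17:16:41: pid 24793: LOG: reading status file: 0 th backend is set to down status
2016-12-18 17:16:41: pid 24793: LOG: reading status file: 2 th backend is set to down status
2016-12-18 17:16:41: pid 24793: LOG: find_primary_node: checking backend no 0
2016-12-18 17:16:41: pid 24793: LOG: find_primary_node: checking backend no 1
2016-12-18 17:16:41: pid 24793: DEBUG: find_primary_node: 1 node is standby
2016-12-18 17:16:41: pid 24793: LOG: find_primary_node: checking backend no 2
2016-12-18 17:16:41: pid 24793: DEBUG: find_primary_node: no primary node found
"""

DOWN_RE = re.compile(r"reading status file: (\d+) th backend is set to down status")
CHECK_RE = re.compile(r"find_primary_node: checking backend no (\d+)")
STANDBY_RE = re.compile(r"find_primary_node: (\d+) node is standby")


def summarize(text):
    """Collect backend ids by the kind of log message that mentions them."""

    def ids(pattern):
        return sorted({int(m.group(1)) for m in pattern.finditer(text)})

    marked_down = ids(DOWN_RE)
    checked = ids(CHECK_RE)
    reported_standby = ids(STANDBY_RE)
    # Backends that were "checked" but for which no role verdict was logged.
    no_verdict = [b for b in checked if b not in reported_standby]
    return marked_down, checked, reported_standby, no_verdict


marked_down, checked, reported_standby, no_verdict = summarize(log_text)
print("marked down by status file:", marked_down)
print("checked by find_primary_node:", checked)
print("reported as standby:", reported_standby)
print("checked without a verdict:", no_verdict)
```

On these lines, the backends marked down by the status file (0 and 2) are the same ones for which find_primary_node logs no verdict, and backend 0 is the node that PostgreSQL reports as the primary. That pattern suggests, without proving it, that the saved status file is worth examining.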

Does anyone have a clue what is going wrong here? Here is the Pgpool configuration:

# ----------------------------
# pgPool-II configuration file
# ----------------------------

#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------

# - pgpool Connection Settings -

listen_addresses = '*'
    # Host name or IP address to listen on:
    # '*' for all, '' for no TCP/IP connections
    # (change requires restart)
port = 5432
    # Port number
    # (change requires restart)
socket_dir = '/var/run/postgresql'
    # Unix domain socket path
    # The Debian package defaults to
    # /var/run/postgresql
    # (change requires restart)
listen_backlog_multiplier = 2
    # Set the backlog parameter of listen(2) to
    # num_init_children * listen_backlog_multiplier.
    # (change requires restart)
serialize_accept = on
    # whether to serialize accept() call to avoid thundering herd problem
    # (change requires restart)

# - pgpool Communication Manager Connection Settings -

pcp_listen_addresses = '*'
    # Host name or IP address for pcp process to listen on:
    # '*' for all, '' for no TCP/IP connections
    # (change requires restart)
pcp_port = 9898
    # Port number for pcp
    # (change requires restart)
pcp_socket_dir = '/var/run/postgresql'
    # Unix domain socket path for pcp
    # The Debian package defaults to
    # /var/run/postgresql
    # (change requires restart)

# - Backend Connection Settings -

backend_hostname0 = '172.21.3.229'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/9.6/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '172.21.2.88'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/9.6/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_hostname2 = '172.22.3.228'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/var/lib/pgsql/9.6/data'
backend_flag2 = 'ALLOW_TO_FAILOVER'

# - Authentication -

enable_pool_hba = on
    # Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
    # File name of pool_passwd for md5 authentication.
    # "" disables pool_passwd.
    # (change requires restart)
authentication_timeout = 60
    # Delay in seconds to complete client authentication
    # 0 means no timeout.

# - SSL Connections -

ssl = off
    # Enable SSL support
    # (change requires restart)
#ssl_key = './server.key'
    # Path to the SSL private key file
    # (change requires restart)
#ssl_cert = './server.cert'
    # Path to the SSL public certificate file
    # (change requires restart)
#ssl_ca_cert = ''
    # Path to a single PEM format file
    # containing CA root certificate(s)
    # (change requires restart)
#ssl_ca_cert_dir = ''
    # Directory containing CA root certificate(s)
    # (change requires restart)

#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Concurrent session and pool size -

num_init_children = 20
    # Number of concurrent sessions allowed
    # (change requires restart)
max_pool = 10
    # Number of connection pool caches per connection
    # (change requires restart)

# - Life time -

child_life_time = 300
    # Pool exits after being idle for this many seconds
child_max_connections = 0
    # Pool exits after receiving that many connections
    # 0 means no exit
connection_life_time = 0
    # Connection to backend closes after being idle for this many seconds
    # 0 means no close
client_idle_limit = 0
    # Client is disconnected after being idle for that many seconds
    # (even inside an explicit transactions!)
    # 0 means no disconnection

#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------

# - Where to log -

log_destination = 'stderr,syslog'
    # Where to log
    # Valid values are combinations of stderr,
    # and syslog. Default to stderr.

# - What to log -

print_timestamp = on
    # Print timestamp on each line
    # (change requires restart)
log_connections = on
    # Log connections
log_hostname = on
    # Hostname will be shown in ps status
    # and in logs if connections are logged
log_statement = on
    # Log all statements
log_per_node_statement = on
    # Log all statements
    # with node and backend informations
log_standby_delay = 'none'
    # Log standby delay
    # Valid values are combinations of always,
    # if_over_threshold, none

# - Syslog specific -

syslog_facility = 'LOCAL0'
    # Syslog local facility. Default to LOCAL0
syslog_ident = 'pgpool'
    # Syslog program identification string
    # Default to 'pgpool'

# - Debug -

debug_level = 1
    # Debug message verbosity level
    # 0 means no message, 1 or more mean verbose
#log_error_verbosity = default
    # terse, default, or verbose messages
#client_min_messages = notice
    # values in order of decreasing detail:
    # debug5
    # debug4
    # debug3
    # debug2
    # debug1
    # log
    # notice
    # warning
    # error
#log_min_messages = warning
    # values in order of decreasing detail:
    # debug5
    # debug4
    # debug3
    # debug2
    # debug1
    # info
    # notice
    # warning
    # error
    # log
    # fatal
    # panic

#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

pid_file_name = '/var/run/pgpool/pgpool.pid'
    # PID file name
    # (change requires restart)
logdir = '/var/log/pgpool'
    # Directory of pgPool status file
    # (change requires restart)

#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------

connection_cache = on
    # Activate connection pools
    # (change requires restart)

    # Semicolon separated list of queries
    # to be issued at the end of a session
    # The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
    # The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'

#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------

replication_mode = off
    # Activate replication mode
    # (change requires restart)
replicate_select = off
    # Replicate SELECT statements
    # when in replication mode
    # replicate_select is higher priority than
    # load_balance_mode.
insert_lock = on
    # Automatically locks a dummy row or a table
    # with INSERT statements to keep SERIAL data
    # consistency
    # Without SERIAL, no lock will be issued
lobj_lock_table = ''
    # When rewriting lo_creat command in
    # replication mode, specify table name to
    # lock

# - Degenerate handling -

replication_stop_on_mismatch = off
    # On disagreement with the packet kind
    # sent from backend, degenerate the node
    # which is most likely "minority"
    # If off, just force to exit this session
failover_if_affected_tuples_mismatch = off
    # On disagreement with the number of affected
    # tuples in UPDATE/DELETE queries, then
    # degenerate the node which is most likely
    # "minority".
    # If off, just abort the transaction to
    # keep the consistency

#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------

load_balance_mode = off
    # Activate load balancing mode
    # (change requires restart)
ignore_leading_white_space = on
    # Ignore leading white spaces of each query
white_function_list = ''
    # Comma separated list of function names
    # that don't write to database
    # Regexp are accepted
black_function_list = 'nextval,setval'
    # Comma separated list of function names
    # that write to database
    # Regexp are accepted
database_redirect_preference_list = ''
    # comma separated list of pairs of database and node id.
    # example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
    # valid for streaming replicaton mode only.
app_name_redirect_preference_list = ''
    # comma separated list of pairs of app name and node id.
    # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
    # valid for streaming replicaton mode only.
allow_sql_comments = off
    # if on, ignore SQL comments when judging if load balance or
    # query cache is possible.
    # If off, SQL comments effectively prevent the judgment
    # (pre 3.4 behavior).

#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------

master_slave_mode = on
    # Activate master/slave mode
    # (change requires restart)
master_slave_sub_mode = 'stream'
    # Master/slave sub mode
    # Valid values are combinations slony or
    # stream. Default is slony.
    # (change requires restart)

# - Streaming -

sr_check_period = 10
    # Streaming replication check period
    # Disabled (0) by default
sr_check_user = 'replication_user'
    # Streaming replication check user
    # This is necessary even if you disable
    # streaming replication delay check with
    # sr_check_period = 0
sr_check_password = 'replication_pass'
    # Password for streaming replication check user
sr_check_database = 'replication_db'
    # Database name for streaming replication check
delay_threshold = 0
    # Threshold before not dispatching query to standby node
    # Unit is in bytes
    # Disabled (0) by default

# - Special commands -

follow_master_command = ''
    # Executes this command after master failover
    # Special values:
    # %d = node id
    # %h = Host name
    # %p = port number
    # %D = database cluster path
    # %m = new master node id
    # %H = hostname of the new master node
    # %M = old master node id
    # %P = old primary node id
    # %r = new master port number
    # %R = new master database cluster path
    # %% = '%' character

#------------------------------------------------------------------------------
# HEALTH CHECK
#------------------------------------------------------------------------------

health_check_period = 5
    # Health check period
    # Disabled (0) by default
health_check_timeout = 20
    # Health check timeout
    # 0 means no timeout
health_check_user = 'postgres'
    # Health check user
health_check_password = 'postgres'
    # Password for health check user
health_check_database = 'postgres'
    # Database name for health check. If '', tries 'postgres' frist, then 'template1'
health_check_max_retries = 2
    # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
    # Amount of time to wait (in seconds) between retries.
connect_timeout = 10000
    # Timeout value in milliseconds before giving up to connect to backend.
    # Default is 10000 ms (10 second). Flaky network user may want to increase
    # the value. 0 means no timeout.
    # Note that this value is not only used for health check,
    # but also for ordinary conection to backend.

#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------

failover_command = ''
    # Executes this command at failover
    # Special values:
    # %d = node id
    # %h = Host name
    # %p = port number
    # %D = database cluster path
    # %m = new master node id
    # %H = hostname of the new master node
    # %M = old master node id
    # %P = old primary node id
    # %r = new master port number
    # %R = new master database cluster path
    # %% = '%' character
failback_command = ''
    # Executes this command at failback.
    # Special values:
    # %d = node id
    # %h = Host name
    # %p = port number
    # %D = database cluster path
    # %m = new master node id
    # %H = hostname of the new master node
    # %M = old master node id
    # %P = old primary node id
    # %r = new master port number
    # %R = new master database cluster path
    # %% = '%' character
fail_over_on_backend_error = on
    # Initiates failover when reading/writing to the
    # backend communication socket fails
    # If set to off, pgpool will report an
    # error and disconnect the session.
#search_primary_node_timeout = 10
    # Timeout in seconds to search for the
    # primary node when a failover occurs.
    # 0 means no timeout, keep searching
    # for a primary node forever.

#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------

recovery_user = 'postgres'
    # Online recovery user
recovery_password = 'postgres'
    # Online recovery password
recovery_1st_stage_command = ''
    # Executes a command in first stage
recovery_2nd_stage_command = ''
    # Executes a command in second stage
recovery_timeout = 90
    # Timeout in seconds to wait for the
    # recovering node's postmaster to start up
    # 0 means no wait
client_idle_limit_in_recovery = 0
    # Client is disconnected after being idle
    # for that many seconds in the second stage
    # of online recovery
    # 0 means no disconnection
    # -1 means immediate disconnection

#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------

# - Enabling -

use_watchdog = off
    # Activates watchdog
    # (change requires restart)

# -Connection to up stream servers -

trusted_servers = ''
    # trusted server list which are used
    # to confirm network connection
    # (hostA,hostB,hostC,...)
    # (change requires restart)
ping_path = '/bin'
    # ping command path
    # (change requires restart)

# - Watchdog communication Settings -

wd_hostname = ''
    # Host name or IP address of this watchdog
    # (change requires restart)
wd_port = 9000
    # port number for watchdog service
    # (change requires restart)
wd_priority = 1
    # priority of this watchdog in leader election
    # (change requires restart)
wd_authkey = ''
    # Authentication key for watchdog communication
    # (change requires restart)
wd_ipc_socket_dir = '/var/run/postgresql'
    # Unix domain socket path for watchdog IPC socket
    # The Debian package defaults to
    # /var/run/postgresql
    # (change requires restart)

# - Virtual IP control Setting -

delegate_IP = ''
    # delegate IP address
    # If this is empty, virtual IP never bring up.
    # (change requires restart)
if_cmd_path = '/sbin'
    # path to the directory where if_up/down_cmd exists
    # (change requires restart)
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
    # startup delegate IP command
    # (change requires restart)
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
    # shutdown delegate IP command
    # (change requires restart)
arping_path = '/usr/sbin'
    # arping command path
    # (change requires restart)
arping_cmd = 'arping -U $_IP_$ -w 1'
    # arping command
    # (change requires restart)

# - Behaivor on escalation Setting -

clear_memqcache_on_escalation = on
    # Clear all the query cache on shared memory
    # when standby pgpool escalate to active pgpool
    # (= virtual IP holder).
    # This should be off if client connects to pgpool
    # not using virtual IP.
    # (change requires restart)
wd_escalation_command = ''
    # Executes this command at escalation on new active pgpool.
    # (change requires restart)
wd_de_escalation_command = ''
    # Executes this command when master pgpool resigns from being master.
    # (change requires restart)

# - Lifecheck Setting -

# -- common --

wd_monitoring_interfaces_list = ''
    # Comma separated list of interfaces names to monitor.
    # if any interface from the list is active the watchdog will
    # consider the network is fine
    # 'any' to enable monitoring on all interfaces except loopback
    # '' to disable monitoring
wd_lifecheck_method = 'heartbeat'
    # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
    # (change requires restart)
wd_interval = 10
    # lifecheck interval (sec) > 0
    # (change requires restart)

# -- heartbeat mode --

wd_heartbeat_port = 9694
    # Port number for receiving heartbeat signal
    # (change requires restart)
wd_heartbeat_keepalive = 2
    # Interval time of sending heartbeat signal (sec)
    # (change requires restart)
wd_heartbeat_deadtime = 30
    # Deadtime interval for heartbeat signal (sec)
    # (change requires restart)
heartbeat_destination0 = 'Host0_ip1'
    # Host name or IP address of destination 0
    # for sending heartbeat signal.
    # (change requires restart)
heartbeat_destination_port0 = 9694
    # Port number of destination 0 for sending
    # heartbeat signal. Usually this is the
    # same as wd_heartbeat_port.
    # (change requires restart)
heartbeat_device0 = ''
    # Name of NIC device (such like 'eth0')
    # used for sending/receiving heartbeat
    # signal to/from destination 0.
    # This works only when this is not empty
    # and pgpool has root privilege.
    # (change requires restart)
#heartbeat_destination1 = 'Host0_ip2'
#heartbeat_destination_port1 = 9694
#heartbeat_device1 = ''

# -- query mode --

wd_life_point = 3
    # lifecheck retry times
    # (change requires restart)
wd_lifecheck_query = 'SELECT 1'
    # lifecheck query to pgpool from watchdog
    # (change requires restart)
wd_lifecheck_dbname = 'template1'
    # Database name connected for lifecheck
    # (change requires restart)
wd_lifecheck_user = 'postgres'
    # watchdog user monitoring pgpools in lifecheck
    # (change requires restart)
wd_lifecheck_password = 'postgres'
    # Password for watchdog user in lifecheck
    # (change requires restart)

# - Other pgpool Connection Settings -

#other_pgpool_hostname0 = 'Host0'
    # Host name or IP address to connect to for other pgpool 0
    # (change requires restart)
#other_pgpool_port0 = 5432
    # Port number for othet pgpool 0
    # (change requires restart)
#other_wd_port0 = 9000
    # Port number for othet watchdog 0
    # (change requires restart)
#other_pgpool_hostname1 = 'Host1'
#other_pgpool_port1 = 5432
#other_wd_port1 = 9000

Amit Kumar · Answer

I found the resolution for this here: http://www.pgpool.net/mantisbt/view.php?id=274