I've been using pgpool for quite a while, but I've just noticed an intermittent lag (slow requests take around 20 seconds to execute) that I can reproduce consistently. Before I explain, a quick overview of the underlying architecture:
By pointing connections directly at the master I ruled out slow queries. I can only reproduce the lag consistently by querying pgpool repeatedly from different client applications, and it usually hits on the second iteration. I noticed that with connection_cache = off the lag is less frequent and less severe, but it still happens. I tried turning up pgpool's logging to troubleshoot, but after tailing pgpool.log there was so much output that I didn't know what to look for, especially with grep, since grepping for the string ERROR turns up nothing when the lag occurs.
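To quantify the stall, it helped to time the same trivial query repeatedly and compare the median to the worst case instead of eyeballing single runs. A rough harness (a sketch only: psycopg2, the host name, and the run_query wiring below are my assumptions, not part of pgpool):

```python
import statistics
import time

def summarize(latencies):
    """Reduce a list of per-query latencies (seconds) to the two numbers
    that matter for an intermittent stall: typical vs worst case."""
    return {"median": statistics.median(latencies), "max": max(latencies)}

def time_queries(run_query, n=10):
    """Time n invocations of run_query, a zero-argument callable that
    performs one round trip (e.g. SELECT 1 through pgpool)."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - start)
    return summarize(latencies)

# Example wiring (assumed; needs psycopg2 and a reachable pgpool):
# import psycopg2
# conn = psycopg2.connect(host="pgpool3", port=5432, dbname="postgres")
# def run_query():
#     with conn.cursor() as cur:
#         cur.execute("SELECT 1")
#         cur.fetchone()
# print(time_queries(run_query))
```

A healthy setup shows median and max within the same order of magnitude; the lag described above shows up as a max tens of seconds above the median.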
Here is the configuration:
listen_addresses = '*'
port = 5432
socket_dir = '/var/run/postgresql/'
listen_backlog_multiplier = 2
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/var/run/postgresql/'
backend_hostname0 = 'database3-master'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/data'
backend_flag0 = 'DISALLOW_TO_FAILOVER'
backend_hostname1 = 'database3-replica'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/data'
backend_flag1 = 'DISALLOW_TO_FAILOVER'
backend_hostname2 = 'database3-replica2'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/var/lib/pgsql/data'
backend_flag2 = 'DISALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 60
client_idle_limit = 0
log_destination = 'stderr'
log_connections = on
log_hostname = off
log_statement = off
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 1
pid_file_name = '/var/run/pgpool/pgpool.pid'
logdir = '/var/log/pgpool/'
connection_cache = off
replication_mode = off
replicate_select = off
insert_lock = on
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = on
black_function_list = 'nextval,setval,nextval,setval'
allow_sql_comments = off
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 0
sr_check_user = 'pgpool'
sr_check_password = 'password'
delay_threshold = 0
follow_master_command = '/bin/echo %M > /tmp/postgres_master'
health_check_period = 30
health_check_timeout = 20
health_check_user = 'pg_produser'
health_check_password = '9password'
health_check_max_retries = 0
health_check_retry_delay = 1
connect_timeout = 10000
failover_command = '/etc/pgpool-II/failover.sh %d %H %P /tmp/postgresql.trigger.failover startup-pgpool4'
fail_over_on_backend_error = on
search_primary_node_timeout = 10
recovery_user = 'pgpool'
recovery_password = 'password'
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
wd_hostname = 'pgpool3'
wd_port = 9000
wd_authkey = ''
wd_escalation_command = '/bin/bash /etc/pgpool-II/pgpool-failover.sh'
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
heartbeat_destination0 = 'pgpool4'
heartbeat_destination_port0 = 9694
other_pgpool_hostname0 = 'pgpool4'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
check_unlogged_table = on
memory_cache_enabled = off

I've been seriously considering an alternative solution, but haven't been able to find much. Ideally I'd like some middleware that splits read and write requests between the master and the replicas respectively. HAProxy can't do this, since it doesn't parse queries.
Is there an ideal solution at the application level for separating read and write queries? I assume any query containing UPDATE/INSERT would go to the master, but I feel like I'm missing something.
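For what it's worth, the naive application-level version is easy to sketch: anything that isn't a plain SELECT goes to the master, and SELECTs that call sequence functions (mirroring black_function_list above) also go to the master. A minimal sketch, with the host names taken from the config above and the route helper being my own illustration:

```python
import random

MASTER = "database3-master"
REPLICAS = ["database3-replica", "database3-replica2"]  # from the config above

def route(sql: str) -> str:
    """Pick a backend host for a single statement.

    Anything that is not a plain SELECT (INSERT/UPDATE/DELETE, DDL),
    or a SELECT calling nextval()/setval(), goes to the master;
    remaining SELECTs are balanced across the replicas.
    """
    stmt = sql.lstrip().lower()
    writes_via_select = ("nextval(", "setval(")  # mirrors black_function_list
    if stmt.startswith("select") and not any(f in stmt for f in writes_via_select):
        return random.choice(REPLICAS)
    return MASTER
```

The obvious gaps are probably what I feel I'm missing: SELECTs inside a transaction that need read-your-writes consistency, writable CTEs, and functions with side effects all break this, which is exactly why pgpool parses queries rather than pattern-matching them.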
So, to recap:
Posted on 2017-02-23 02:55:22
I seem to have partially figured out the answer to the first part of my question, and I've also found a workaround that eliminates the lag entirely, though it may cause problems down the road.
First, how I was able to get some meaningful debugging information:
Passing the -d flag runs pgpool in debug mode:

/bin/sh -c '/usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -d -n'

Watching traffic between the client and pgpool:

tcpdump -i eth1 | grep '[clientname] * > [pgpoolhostname]'

Filtering the log for anything useful:

tail -f /var/log/pgpool/pgpool.log | grep "LOG\|error\|anythingelseuseful"

With these I was able to determine that the TCP connection was being established, but pgpool didn't respond to anything until several seconds later. That led me to believe there was a problem with idle connections accepting new requests.
In pgpool, the number of idle child processes (and the connections they cache) is specified with:
num_init_children = 20
# Number of pools
# (change requires restart)
max_pool = 4
# Number of connections per pool
# (change requires restart)

I simply changed num_init_children to 100 and max_pool to 2.
However, this may introduce a different problem, since the server now runs up to 100 idle child processes, each caching up to 2 backend connections (200 cached connections in total). Still, fiddling with the pooling settings does appear to have resolved the lag. More information on these variables can be found here:
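The arithmetic behind that worry, as I understand it: each child process can cache up to max_pool backend connection sets, so each PostgreSQL node must be able to accept up to num_init_children × max_pool connections from this pgpool alone (a quick sanity check on my values):

```python
def backend_connections(num_init_children: int, max_pool: int) -> int:
    # Upper bound on backend connections pgpool can hold open per
    # PostgreSQL node; max_connections on the backend must be sized
    # above this (plus any reserved superuser connections).
    return num_init_children * max_pool

old = backend_connections(32, 4)    # original config
new = backend_connections(100, 2)   # after the change
```

So the change raised the per-node connection ceiling from 128 to 200 while halving what each child caches, which is why the backend's max_connections now needs a second look.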
http://www.pgpool.net/mediawiki/index.php/Relationship_between_max_pool,_num_init_children_and_max_connections
https://dba.stackexchange.com/questions/165243