首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >KeyDB活动复制-加载Redis是在完全同步的内存中加载数据集。

KeyDB活动复制-加载Redis是在完全同步的内存中加载数据集。
EN

Server Fault用户
提问于 2021-06-25 07:25:09
回答 1查看 1.7K关注 0票数 2

我们已经安装了2个带有活动副本的keydb服务器,如下所述:https://docs.keydb.dev/docs/active-rep/

我们使用HAproxy将流量重定向到正确的服务器。所以我们现在的情况是:

代码语言:javascript
复制
keydb 001 - 10.0.0.7
keydb 002 - 10.0.0.8

我们希望更新和重新启动keydb 01。我们在HAproxy中进行了维护,所有的连接都被耗尽了。因此,服务器不再使用,所有活动连接都将转到keydb 02。

现在,当keydb 01再次恢复时,它将请求keydb 02进行完整的db同步。完成后,我们看到keydb 02也从keydb 01请求完整的db同步!这将导致keydb 02处于加载状态,而它是Haproxy中唯一的活动服务器。

因此,结果是在短时间内,没有keydb服务器处于活动状态。这会导致错误,例如:加载Redis是在内存中加载数据集。

这个活动副本设置的全部思想是,它是健壮的,故障安全的,并创建一个高可用性的设置。然而,在这种情况下,这意味着每次服务器发生故障,都会造成短时间的完全停机。这对我们的设置是不可接受的。

我们做错了吗?这是故意的吗?还是我们必须重新配置什么东西?

我们已经测试了基于磁盘和无盘同步。没有关系。我们的配置设置(来自ansible剧本,因此格式化可能看起来有点奇怪):

代码语言:javascript
复制
        bind: "127.0.0.1 {{ my_private_ips[inventory_hostname] }}"
        requirepass: "{{ keydb_auth }}"
        masterauth: "{{ keydb_auth }}"
        replicaof: "xxx 6379"
        client-output-buffer-limit:
          - normal 0 0 0
          - replica 1024mb 256mb 60
          - pubsub 32mb 8mb 60
        repl-diskless-sync: yes
        port: 6379
        maxmemory: 3000m
        active-replica: yes

我们使用keydb版本6.0.16运行Ubuntu20.04.2LTS。

这里是keydb 01中的keydb日志文件,用于维护,然后再重新启动.

代码语言:javascript
复制
3812319:1439:C 18 Jun 2021 09:08:36.048 * DB saved on disk
3812319:1439:C 18 Jun 2021 09:08:36.054 * RDB: 3 MB of memory used by copy-on-write
958:1439:S 18 Jun 2021 09:08:36.094 * Background saving terminated with success
958:signal-handler (1624000133) Received SIGTERM scheduling shutdown...
958:signal-handler (1624000133) Received SIGTERM scheduling shutdown...
958:1439:S 18 Jun 2021 09:08:53.194 # User requested shutdown...
958:1439:S 18 Jun 2021 09:08:53.194 # systemd supervision requested, but NOTIFY_SOCKET not found
958:1439:S 18 Jun 2021 09:08:53.194 * Saving the final RDB snapshot before exiting.
958:1439:S 18 Jun 2021 09:08:53.194 # systemd supervision requested, but NOTIFY_SOCKET not found
958:1439:S 18 Jun 2021 09:08:54.159 * DB saved on disk
958:1439:S 18 Jun 2021 09:08:54.159 * Removing the pid file.
958:1439:S 18 Jun 2021 09:08:54.159 # KeyDB is now ready to exit, bye bye...
874:874:C 18 Jun 2021 09:09:04.528 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
874:874:C 18 Jun 2021 09:09:04.531 * Notice: "active-replica yes" implies "replica-read-only no"
874:874:C 18 Jun 2021 09:09:04.531 # oO0OoO0OoO0Oo KeyDB is starting oO0OoO0OoO0Oo
874:874:C 18 Jun 2021 09:09:04.531 # KeyDB version=6.0.16, bits=64, commit=00000000, modified=0, pid=874, just started
874:874:C 18 Jun 2021 09:09:04.531 # Configuration loaded
874:874:C 18 Jun 2021 09:09:04.531 # WARNING supervised by systemd - you MUST set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
874:874:C 18 Jun 2021 09:09:04.531 # systemd supervision requested, but NOTIFY_SOCKET not found


                                        KeyDB 6.0.16 (00000000/0) 64 bit

                                        Running in standalone mode
                                        Port: 6379
                                        PID: 957

                     Join the KeyDB community! https://community.keydb.dev/



957:874:S 18 Jun 2021 09:09:04.781 # Server initialized
957:874:S 18 Jun 2021 09:09:04.781 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
957:874:S 18 Jun 2021 09:09:04.781 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with KeyDB. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. KeyDB must be restarted after THP is disabled.
957:874:S 18 Jun 2021 09:09:04.789 * Loading RDB produced by version 6.0.16
957:874:S 18 Jun 2021 09:09:04.789 * RDB age 11 seconds
957:874:S 18 Jun 2021 09:09:04.789 * RDB memory usage when created 100.16 Mb
957:874:S 18 Jun 2021 09:09:05.563 * DB loaded from disk: 0.778 seconds
957:874:S 18 Jun 2021 09:09:05.563 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
957:874:S 18 Jun 2021 09:09:05.563 # systemd supervision requested, but NOTIFY_SOCKET not found
957:1345:S 18 Jun 2021 09:09:05.563   Thread 1 alive.
957:1344:S 18 Jun 2021 09:09:05.564   Thread 0 alive.
957:1344:S 18 Jun 2021 09:09:05.564 * Connecting to MASTER 10.0.0.8:6379
957:1344:S 18 Jun 2021 09:09:05.564 * MASTER <-> REPLICA sync started
957:1344:S 18 Jun 2021 09:09:05.566 * Non blocking connect for SYNC fired the event.
957:1344:S 18 Jun 2021 09:09:05.567 * Master replied to PING, replication can continue...
957:1344:S 18 Jun 2021 09:09:05.571 * Replica 10.0.0.8:6379 asks for synchronization
957:1344:S 18 Jun 2021 09:09:05.571 * Full resync requested by replica 10.0.0.8:6379
957:1344:S 18 Jun 2021 09:09:05.571 * Replication backlog created, my new replication IDs are '60759691a4adca42cced2fd1895a470ab7b9eebb' and '0000000000000000000000000000000000000000'
957:1344:S 18 Jun 2021 09:09:05.571 * Delay next BGSAVE for diskless SYNC
957:1344:S 18 Jun 2021 09:09:05.573 * Partial resynchronization not possible (no cached master)
957:1344:S 18 Jun 2021 09:09:11.521 * Full resync from master: 3eb1532df27d6815eea8a420c7c72b04e03618dc:516117186158
957:1344:S 18 Jun 2021 09:09:11.521 * Discarding previously cached master state.
957:1344:S 18 Jun 2021 09:09:11.530 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
957:1344:S 18 Jun 2021 09:09:11.603 * Starting BGSAVE for SYNC with target: replicas sockets
957:1344:S 18 Jun 2021 09:09:11.606 * Background RDB transfer started by pid 2412
2412:1344:C 18 Jun 2021 09:09:12.488 * RDB: 1 MB of memory used by copy-on-write
957:1344:S 18 Jun 2021 09:09:12.488 # Diskless rdb transfer, done reading from pipe, 1 replicas still up.
957:1344:S 18 Jun 2021 09:09:12.507 * Background RDB transfer terminated with success
957:1344:S 18 Jun 2021 09:09:12.507 * Streamed RDB transfer with replica 10.0.0.8:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
957:1344:S 18 Jun 2021 09:09:12.872 * Synchronization with replica 10.0.0.8:6379 succeeded
957:1344:S 18 Jun 2021 09:10:12.799 # I/O error trying to sync with MASTER: connection lost
957:1344:S 18 Jun 2021 09:10:13.429 * Connecting to MASTER 10.0.0.8:6379
957:1344:S 18 Jun 2021 09:10:13.429 * MASTER <-> REPLICA sync started
957:1344:S 18 Jun 2021 09:10:13.429 * Non blocking connect for SYNC fired the event.
957:1344:S 18 Jun 2021 09:10:13.429 * Master replied to PING, replication can continue...
957:1344:S 18 Jun 2021 09:10:13.430 * Partial resynchronization not possible (no cached master)
957:1344:S 18 Jun 2021 09:10:19.943 * Full resync from master: 349f88c33a6dc9d67a3cb4d623727d9f1047033d:516121643991
957:1344:S 18 Jun 2021 09:10:19.952 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
957:1344:S 18 Jun 2021 09:10:20.856 * MASTER <-> REPLICA sync: Loading DB in memory
957:1344:S 18 Jun 2021 09:10:20.856 * Loading RDB produced by version 6.0.16
957:1344:S 18 Jun 2021 09:10:20.856 * RDB age 1 seconds
957:1344:S 18 Jun 2021 09:10:20.856 * RDB memory usage when created 102.08 Mb
957:1344:S 18 Jun 2021 09:10:21.431 * MASTER <-> REPLICA sync: Finished with success
957:1344:S 18 Jun 2021 09:10:21.431 # systemd supervision requested, but NOTIFY_SOCKET not found
957:1344:S 18 Jun 2021 09:10:21.431 # systemd supervision requested, but NOTIFY_SOCKET not found
957:1344:S 18 Jun 2021 09:11:03.197 * 10000 changes in 60 seconds. Saving...
957:1344:S 18 Jun 2021 09:11:03.200 * Background saving started by pid 4845
4845:1344:C 18 Jun 2021 09:11:04.015 * DB saved on disk
4845:1344:C 18 Jun 2021 09:11:04.018 * RDB: 2 MB of memory used by copy-on-write
957:1344:S 18 Jun 2021 09:11:04.104 * Background saving terminated with success
957:1344:S 18 Jun 2021 09:12:05.071 * 10000 changes in 60 seconds. Saving...
957:1344:S 18 Jun 2021 09:12:05.075 * Background saving started by pid 4870
4870:1344:C 18 Jun 2021 09:12:05.963 * DB saved on disk

这里是keydb 02日志。这台服务器一直处于“up”状态,但很快就不可用了:

代码语言:javascript
复制
960:1388:S 18 Jun 2021 09:08:29.066 * Background saving started by pid 3814844
3814844:1388:C 18 Jun 2021 09:08:29.896 * DB saved on disk
3814844:1388:C 18 Jun 2021 09:08:29.901 * RDB: 3 MB of memory used by copy-on-write
960:1388:S 18 Jun 2021 09:08:29.974 * Background saving terminated with success
960:1388:S 18 Jun 2021 09:08:54.176 # Connection with master lost.
960:1388:S 18 Jun 2021 09:08:54.176 * Caching the disconnected master state.
960:1388:S 18 Jun 2021 09:08:54.177 # Connection with replica client id #11 lost.
960:1388:S 18 Jun 2021 09:08:54.957 * Connecting to MASTER 10.0.0.7:6379
960:1388:S 18 Jun 2021 09:08:54.957 * MASTER <-> REPLICA sync started
960:1388:S 18 Jun 2021 09:09:02.073 # Error condition on socket for SYNC: Resource temporarily unavailable
960:1388:S 18 Jun 2021 09:09:02.988 * Connecting to MASTER 10.0.0.7:6379
960:1388:S 18 Jun 2021 09:09:02.988 * MASTER <-> REPLICA sync started
960:1388:S 18 Jun 2021 09:09:02.988 # Error condition on socket for SYNC: Operation now in progress
960:1388:S 18 Jun 2021 09:09:03.998 * Connecting to MASTER 10.0.0.7:6379
960:1388:S 18 Jun 2021 09:09:03.998 * MASTER <-> REPLICA sync started
960:1388:S 18 Jun 2021 09:09:03.999 * Non blocking connect for SYNC fired the event.
960:1388:S 18 Jun 2021 09:09:04.068 * Master replied to PING, replication can continue...
960:1388:S 18 Jun 2021 09:09:04.084 * Partial resynchronization not possible (no cached master)
960:1388:S 18 Jun 2021 09:09:04.087 * Replica 10.0.0.7:6379 asks for synchronization
960:1388:S 18 Jun 2021 09:09:04.087 * Full resync requested by replica 10.0.0.7:6379
960:1388:S 18 Jun 2021 09:09:04.087 * Delay next BGSAVE for diskless SYNC
960:1388:S 18 Jun 2021 09:09:10.035 * Starting BGSAVE for SYNC with target: replicas sockets
960:1388:S 18 Jun 2021 09:09:10.042 * Background RDB transfer started by pid 3814859
960:1388:S 18 Jun 2021 09:09:10.117 * Full resync from master: 640f9e48636076058722301cc52ea8a21bc8e450:453896462998
960:1388:S 18 Jun 2021 09:09:10.118 * Discarding previously cached master state.
960:1388:S 18 Jun 2021 09:09:10.122 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
960:1388:S 18 Jun 2021 09:09:10.999 * MASTER <-> REPLICA sync: Loading DB in memory
960:1388:S 18 Jun 2021 09:09:11.000 * Replica is about to load the RDB file received from the master, but there is a pending RDB child running. Killing process 3814859 and removing its temp file to avoid any race
3814859:signal-handler (1624000150) Received SIGUSR1 in child, exiting now.
960:1388:S 18 Jun 2021 09:09:11.000 * Loading RDB produced by version 6.0.16
960:1388:S 18 Jun 2021 09:09:11.000 * RDB age 0 seconds
960:1388:S 18 Jun 2021 09:09:11.000 * RDB memory usage when created 95.02 Mb
960:1388:S 18 Jun 2021 09:09:11.018 # Diskless rdb transfer, done reading from pipe, 1 replicas still up.
960:1388:S 18 Jun 2021 09:09:11.019 # Background transfer terminated by signal 10
960:1388:S 18 Jun 2021 09:09:11.019 * Streamed RDB transfer with replica 10.0.0.7:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
960:1388:S 18 Jun 2021 09:09:11.386 * MASTER <-> REPLICA sync: Finished with success
960:1388:S 18 Jun 2021 09:09:11.386 # systemd supervision requested, but NOTIFY_SOCKET not found
960:1388:S 18 Jun 2021 09:09:11.386 # systemd supervision requested, but NOTIFY_SOCKET not found
960:1388:S 18 Jun 2021 09:09:30.096 * 10000 changes in 60 seconds. Saving...
960:1388:S 18 Jun 2021 09:09:30.101 * Background saving started by pid 3814875
3814875:1388:C 18 Jun 2021 09:09:30.970 * DB saved on disk
3814875:1388:C 18 Jun 2021 09:09:30.975 * RDB: 5 MB of memory used by copy-on-write
960:1388:S 18 Jun 2021 09:09:31.022 * Background saving terminated with success
960:1388:S 18 Jun 2021 09:10:12.795 # Disconnecting timedout replica: 10.0.0.7:6379
960:1388:S 18 Jun 2021 09:10:12.795 # Connection with replica 10.0.0.7:6379 lost.
960:1389:S 18 Jun 2021 09:10:13.429 * Replica 10.0.0.7:6379 asks for synchronization
960:1389:S 18 Jun 2021 09:10:13.429 * Full resync requested by replica 10.0.0.7:6379
960:1389:S 18 Jun 2021 09:10:13.429 * Delay next BGSAVE for diskless SYNC
960:1388:S 18 Jun 2021 09:10:19.941 * Starting BGSAVE for SYNC with target: replicas sockets
960:1388:S 18 Jun 2021 09:10:19.948 * Background RDB transfer started by pid 3815442
3815442:1388:C 18 Jun 2021 09:10:20.860 * RDB: 5 MB of memory used by copy-on-write
960:1388:S 18 Jun 2021 09:10:20.860 # Diskless rdb transfer, done reading from pipe, 1 replicas still up.
960:1388:S 18 Jun 2021 09:10:20.958 * Background RDB transfer terminated with success
960:1388:S 18 Jun 2021 09:10:20.958 * Streamed RDB transfer with replica 10.0.0.7:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
960:1389:S 18 Jun 2021 09:10:22.034 * Synchronization with replica 10.0.0.7:6379 succeeded
960:1388:S 18 Jun 2021 09:10:32.013 * 10000 changes in 60 seconds. Saving...
960:1388:S 18 Jun 2021 09:10:32.018 * Background saving started by pid 3816278
3816278:1388:C 18 Jun 2021 09:10:32.845 * DB saved on disk
3816278:1388:C 18 Jun 2021 09:10:32.850 * RDB: 4 MB of memory used by copy-on-write
960:1388:S 18 Jun 2021 09:10:32.921 * Background saving terminated with success
960:1388:S 18 Jun 2021 09:11:33.101 * 10000 changes in 60 seconds. Saving...
EN

回答 1

Server Fault用户

发布于 2023-04-14 06:08:44

对于答案来说也许为时已晚,但我遇到了一个类似的问题:配置集客户端输出缓冲区限制“从10737418240 10738240 60”移除了这个问题。在你的例子中,我会加上一行

  • 奴隶10737418240 10738240 60

我在堆栈溢出: redis-cluster-unavailable-while-switching-slave-to-master.上找到了这个技巧(顺便说一句:在redis和keydb的代码中替换主从术语不是一个好主意吗?)

票数 0
EN
页面原文内容由Server Fault提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://serverfault.com/questions/1067794

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档