redis-3.0.1 sentinel master-slave high availability: detailed configuration

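For a recent production deployment, the project required redis to be highly available. Because redis cluster was not yet particularly mature, we chose redis sentinel for high availability. redis itself has replication, which provides master-slave backup; combined with sentinel, it can switch between master and slave automatically.
In production, three redis nodes are generally required. To keep the experiment simple, this article uses only two nodes: one master and one slave.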

Deployment plan
172.16.203.10 master node
172.16.203.4 slave node
redis version: 3.0.1

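Master node
redis is installed by compiling from source, which is very simple: unpack the archive, enter the unpacked directory, and run make. The details are not repeated here.
Below are the changes needed in redis.conf.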

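  # Run redis in the background
  daemonize yes
  # Location of the redis pid file
  pidfile /apps/run/redis/redis.pid
  # Port redis listens on
  port 6379
  # Log file location. If empty, redis logs to standard output, which goes to /dev/null when daemonized
  logfile "/apps/logs/redis/redis.log"
  # redis password; leave unchanged if password authentication is not needed
  requirepass 123456
  # If a password is set above, this must be set too, and to the same value. When this node connects to a master as a slave, it authenticates with this password.
  masterauth 123456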
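
The rule that masterauth must match requirepass can be checked mechanically before starting a node. The following is a minimal sketch, not part of the original setup: the helper names parse_directives and passwords_consistent are hypothetical, and the parser only handles simple "name value" lines.

```python
def parse_directives(conf_text):
    """Return {directive: value} for redis.conf-style text.

    Blank lines and lines starting with '#' are skipped, and surrounding
    double quotes on values are removed. A repeated directive keeps its
    last value.
    """
    directives = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        value = parts[1].strip().strip('"') if len(parts) > 1 else ""
        directives[parts[0].lower()] = value
    return directives


def passwords_consistent(conf_text):
    """True when requirepass and masterauth are both absent or both equal."""
    d = parse_directives(conf_text)
    return d.get("requirepass") == d.get("masterauth")


sample = """
daemonize yes
port 6379
requirepass 123456
masterauth 123456
"""
print(passwords_consistent(sample))  # True
```

Running the same check against the redis.conf of both nodes catches the case where one side was edited and the other was not.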

Start redis:
src/redis-server redis.conf
Check the current replication status:
src/redis-cli -h 172.16.203.10 -a 123456 info Replication

  # Replication
  role:master
  connected_slaves:0
  master_repl_offset:544693
  repl_backlog_active:1
  repl_backlog_size:1048576
  repl_backlog_first_byte_offset:2
  repl_backlog_histlen:544692

As you can see, 172.16.203.10 is the master and there is currently no slave.
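
When these checks are scripted, the key:value output of info Replication is easy to turn into a dictionary. A minimal sketch, assuming the output has already been captured as text (the function name parse_info is hypothetical):

```python
def parse_info(text):
    """Parse 'key:value' lines from INFO output into a dict of strings.

    Section headers such as '# Replication' and blank lines are skipped.
    """
    info = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


output = """# Replication
role:master
connected_slaves:0
master_repl_offset:544693
"""
info = parse_info(output)
print(info["role"], info["connected_slaves"])  # master 0
```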
Next, configure sentinel.conf:

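  # Port used by sentinel
  port 26379
  # Run sentinel in the background (this line is added)
  daemonize yes
  # Log file location (this line is added)
  logfile "/apps/logs/redis/sentinel.log"
  # The master to monitor. The trailing number is how many sentinels must consider the master down before it enters the ODOWN (objectively down) state, i.e. is treated as really down.
  sentinel monitor mymaster 172.16.203.10 6379 1
  # How long (in milliseconds) a node must be unreachable before this sentinel marks it SDOWN (subjectively down)
  sentinel down-after-milliseconds mymaster 5000
  # Used in several ways: (1) a failover is retried only after twice this value; (2) a failover that has not changed any configuration is cancelled after this time; (3) it is the maximum time to wait for all slaves to pick up the new configuration during a failover.
  sentinel failover-timeout mymaster 15000
  # Password used for authentication; this must be set if redis has a password
  sentinel auth-pass mymaster 123456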

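Only the items above need changing. Pay particular attention to sentinel auth-pass and do not forget to set it. After editing, make a backup copy first, because sentinel rewrites this file automatically while it runs. If something goes wrong later, the backup lets you restore the original, known-good state.
Start sentinel:
src/redis-sentinel sentinel.conf
Check the sentinel log: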

                  _._
             _.-``__ ''-._
        _.-``    `.  `_.  ''-._           Redis 3.0.1 (00000000/0) 64 bit
    .-`` .-```.  ```\/    _.,_ ''-._
   (    '      ,       .-`  | `,    )     Running in sentinel mode
   |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
   |    `-._   `._    /     _.-'    |     PID: 19957
    `-._    `-._  `-./  _.-'    _.-'
   |`-._`-._    `-.__.-'    _.-'_.-'|
   |    `-._`-._        _.-'_.-'    |           http://redis.io
    `-._    `-._`-.__.-'_.-'    _.-'
   |`-._`-._    `-.__.-'    _.-'_.-'|
   |    `-._`-._        _.-'_.-'    |
    `-._    `-._`-.__.-'_.-'    _.-'
        `-._    `-.__.-'    _.-'
            `-._        _.-'
                `-.__.-'
  19957:X 12 Dec 13:13:36.746 # Sentinel runid is 6ab6f8abdc3dba4097da202954ecece7bc6d3215
  19957:X 12 Dec 13:13:36.746 # +monitor master mymaster 172.16.203.10 6379 quorum 1

The first log line shows the run ID of this sentinel; the second shows that the current master is 172.16.203.10 6379.
Check the sentinel status:
src/redis-cli -h 172.16.203.10 -a 123456 -p 26379 info Sentinel

  # Sentinel
  sentinel_masters:1
  sentinel_tilt:0
  sentinel_running_scripts:0
  sentinel_scripts_queue_length:0
  master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=0,sentinels=1

Slave node
redis.conf differs from the master's in only one respect: add the following line:
slaveof 172.16.203.10 6379
Start redis:
src/redis-server redis.conf
Check the replication status on the slave:
src/redis-cli -h 172.16.203.4 -a 123456 info Replication

  # Replication
  role:slave
  master_host:172.16.203.10
  master_port:6379
  master_link_status:up
  master_last_io_seconds_ago:0
  master_sync_in_progress:0
  slave_repl_offset:617956
  slave_priority:100
  slave_read_only:1
  connected_slaves:0
  master_repl_offset:0
  repl_backlog_active:0
  repl_backlog_size:1048576
  repl_backlog_first_byte_offset:0
  repl_backlog_histlen:0

As you can see, the current node is a slave.
The sentinel configuration is the same as on the master node. Start sentinel:
src/redis-sentinel sentinel.conf
Check the sentinel log:

  12190:X 12 Dec 13:21:38.658 # Sentinel runid is 270f322d0f3f8605b92902417e499cedc8866163
  12190:X 12 Dec 13:21:38.658 # +monitor master mymaster 172.16.203.10 6379 quorum 1
  12190:X 12 Dec 13:21:38.659 * +slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
  12190:X 12 Dec 13:21:39.609 * +sentinel sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379

Check the sentinel status:
src/redis-cli -h 172.16.203.4 -a 123456 -p 26379 info Sentinel

  # Sentinel
  sentinel_masters:1
  sentinel_tilt:0
  sentinel_running_scripts:0
  sentinel_scripts_queue_length:0
  master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=1,sentinels=2

As you can see, there are now two sentinels and one slave.
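
The master0 line packs several fields into one comma-separated value, so a scripted health check has to split it further. A minimal sketch, assuming the expected topology of this article (one slave, two sentinels); the function names parse_master_line and topology_ok are hypothetical:

```python
def parse_master_line(line):
    """Split 'master0:name=...,status=...,...' into (label, fields dict)."""
    # Only the first ':' separates the label; the address field has its own ':'.
    label, _, rest = line.strip().partition(":")
    fields = {}
    for item in rest.split(","):
        key, _, value = item.partition("=")
        fields[key] = value
    return label, fields


def topology_ok(fields, expected_slaves, expected_sentinels):
    """True when status is ok and the slave and sentinel counts match."""
    return (
        fields.get("status") == "ok"
        and int(fields.get("slaves", -1)) == expected_slaves
        and int(fields.get("sentinels", -1)) == expected_sentinels
    )


line = "master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=1,sentinels=2"
label, fields = parse_master_line(line)
print(label, fields["address"])   # master0 172.16.203.10:6379
print(topology_ok(fields, 1, 2))  # True
```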
That completes the redis master-slave high-availability configuration. Next comes verification.

Verification
1. The slave node goes down, taking both redis and sentinel with it. Watch the sentinel log on the master node:
+sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
+sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
2. Restart redis and sentinel on the slave node:
-sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
-sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
-dup-sentinel master mymaster 172.16.203.10 6379 #duplicate of 172.16.203.4:26379 or 0b0bf0cddcf7aa5b518a8a62c65188f9c4a1ecaf
+sentinel sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
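As you can see, the sentinel's ID has changed, and the corresponding entry in the sentinel configuration file was updated automatically.
Check the replication status:
src/redis-cli -h 172.16.203.10 -a 123456 -p 6379 info Replication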

  # Replication
  role:master
  connected_slaves:1
  slave0:ip=172.16.203.4,port=6379,state=online,offset=862487,lag=1
  master_repl_offset:862642
  repl_backlog_active:1
  repl_backlog_size:1048576
  repl_backlog_first_byte_offset:2
  repl_backlog_histlen:862641

3. The master node goes down
First stop redis, then look at the sentinel log on the master node:

  19957:X 12 Dec 15:10:19.207 # +sdown master mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:19.207 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
  19957:X 12 Dec 15:10:19.207 # +new-epoch 1
  19957:X 12 Dec 15:10:19.207 # +try-failover master mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:19.208 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
  19957:X 12 Dec 15:10:19.211 # 172.16.203.4:26379 voted for 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
  19957:X 12 Dec 15:10:19.275 # +elected-leader master mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:19.275 # +failover-state-select-slave master mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:19.375 # +selected-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:19.375 * +failover-state-send-slaveof-noone slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:19.447 * +failover-state-wait-promotion slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:20.216 # +promoted-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:20.216 # +failover-state-reconf-slaves master mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:20.297 # +failover-end master mymaster 172.16.203.10 6379
  19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
  19957:X 12 Dec 15:10:20.298 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
  19957:X 12 Dec 15:10:25.350 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379

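After the redis master went down, the sentinels first elected a leader (note the difference between leader and master: leader refers to a sentinel, master to redis). The log shows that the sentinel on 172.16.203.10 was elected leader, after which it started choosing a new master:
failover-state-select-slave
A suitable slave was found, 172.16.203.4 6379:
selected-slave 172.16.203.4 6379
SLAVEOF NO ONE is then sent to the selected node, so that it stops replicating and becomes a master:
failover-state-send-slaveof-noone
Wait for the selected node to report that it has been promoted:
failover-state-wait-promotion
Promotion confirmed:
promoted-slave
Start reconfiguring the slaves:
failover-state-reconf-slaves
Failover finished:
failover-end
Start monitoring the new master:
switch-master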
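
The stages above can be pulled out of a sentinel log automatically, which helps when comparing a failed failover with a successful one. A minimal sketch, assuming the log line layout shown in this article (pid:role, day, month, time, a # or * marker, then the event name); event_names is a hypothetical helper:

```python
def event_names(log_text):
    """Return the +event / -event names from sentinel log lines, in order.

    Lines whose sixth whitespace-separated token does not start with '+'
    or '-' (for example 'voted for' or 'Next failover delay') are skipped.
    """
    events = []
    for line in log_text.splitlines():
        tokens = line.split()
        if len(tokens) > 5 and tokens[5][0] in "+-":
            events.append(tokens[5])
    return events


log = """19957:X 12 Dec 15:10:19.207 # +sdown master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.211 # 172.16.203.4:26379 voted for 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
19957:X 12 Dec 15:10:19.275 # +elected-leader master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
"""
events = event_names(log)
print(events)                      # ['+sdown', '+elected-leader', '+switch-master']
print("+switch-master" in events)  # True: the failover completed
```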

Now look at the sentinel log on the slave node:

  24199:X 12 Dec 15:10:19.210 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
  24199:X 12 Dec 15:10:19.249 # +sdown master mymaster 172.16.203.10 6379
  24199:X 12 Dec 15:10:19.249 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
  24199:X 12 Dec 15:10:19.249 # Next failover delay: I will not start a failover before Sat Dec 12 15:10:50 2015
  24199:X 12 Dec 15:10:20.299 # +config-update-from sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379
  24199:X 12 Dec 15:10:20.299 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
  24199:X 12 Dec 15:10:20.299 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
  24199:X 12 Dec 15:10:25.315 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
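
The "Next failover delay" line in this log is consistent with the failover-timeout rule described earlier: with failover-timeout set to 15000 ms, a failover is not retried until twice that value, 30 seconds, has passed. A minimal sketch of the arithmetic, assuming the timer starts at the vote logged at 15:10:19.210 (the function name earliest_next_failover is hypothetical, and the log reports the time only to the second, so the result agrees only approximately):

```python
from datetime import datetime, timedelta


def earliest_next_failover(failover_start, failover_timeout_ms):
    """Earliest retry time: failover start plus 2 * failover-timeout."""
    return failover_start + timedelta(milliseconds=2 * failover_timeout_ms)


# Vote time from the log above; failover-timeout from sentinel.conf.
start = datetime(2015, 12, 12, 15, 10, 19, 210000)
print(earliest_next_failover(start, 15000))  # 2015-12-12 15:10:49.210000
```

The computed 15:10:49.2 is within a second of the 15:10:50 printed by sentinel.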

Then stop the sentinel on the master node as well:
+sdown sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.4 6379

Problems
1. Stop one sentinel and then stop the master; the remaining sentinel stays in this state indefinitely:

  18430:X 12 Dec 11:36:37.949 # +new-epoch 68
  18430:X 12 Dec 11:36:37.949 # +try-failover master mymaster 127.0.0.1 6380
  18430:X 12 Dec 11:36:39.179 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 68
  18430:X 12 Dec 11:36:48.077 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
  18430:X 12 Dec 11:36:48.177 # Next failover delay: I will not start a failover before Sat Dec 12 11:42:38 2015
  18430:X 12 Dec 11:42:38.057 # +new-epoch 69
  18430:X 12 Dec 11:42:38.057 # +try-failover master mymaster 127.0.0.1 6380
  18430:X 12 Dec 11:42:38.106 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 69
  18430:X 12 Dec 11:42:48.443 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
  18430:X 12 Dec 11:42:48.544 # Next failover delay: I will not start a failover before Sat Dec 12 11:48:38 2015

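This is a good place to describe sentinel's leader election. Each sentinel that finds the master objectively down includes its own run ID when it sends the is-master-down-by-addr query, asking the other sentinels to set it as their local leader. Local leadership is first come, first served: a sentinel sets only the first sentinel whose is-master-down-by-addr query it receives as its local leader, and rejects all later ones. If some sentinel is set as local leader by more than half of the sentinels, it becomes the leader.
Note "more than half". Although we stopped one sentinel, it is still recorded in the configuration file, so the number of sentinels is still 2. More than half of 2 is 2, but only one sentinel is actually running, so a leader can never be elected and no failover takes place.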
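
The "more than half" arithmetic can be written out as a small sketch. It models only the majority rule described above, not sentinel's actual implementation, and the function names majority and can_elect_leader are hypothetical:

```python
def majority(known_sentinels):
    """Smallest number of votes that is more than half of known_sentinels."""
    return known_sentinels // 2 + 1


def can_elect_leader(known_sentinels, running_sentinels):
    """A leader needs votes from more than half of all known sentinels,
    including the ones that are recorded in the config but currently down."""
    return running_sentinels >= majority(known_sentinels)


print(majority(2))             # 2
print(can_elect_leader(2, 1))  # False: the situation in the log above
print(can_elect_leader(3, 2))  # True: three sentinels tolerate one failure
```

This is why the two-node layout used in this article cannot fail over once one sentinel is lost, whereas a three-sentinel layout can still elect a leader with one sentinel down.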
———————
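Author: sdlyjzh
Source: CSDN
Original: https://blog.csdn.net/sdlyjzh/article/details/50274499
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.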
