zookeeper

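1. Introduction

  ZooKeeper is a high-performance distributed coordination service. It provides configuration management, naming, distributed synchronization, cluster management, database switchover, and similar services. It is not suited to storing large volumes of data; use it for small items such as configuration and publish/subscribe information. Hadoop, Storm, message middleware, RPC service frameworks, and distributed database synchronization systems are all typical ZooKeeper use cases. A ZooKeeper cluster usually has an odd number of nodes (>= 3). If the leader fails, the remaining nodes can elect a new leader and keep serving requests as long as more than half of the nodes are still up.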
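
The majority rule described above comes down to a little arithmetic. The sketch below is illustrative only; the helper names (quorum_size, tolerated_failures, has_quorum) are made up for this example and are not part of ZooKeeper.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of voting servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many voting servers may fail while a majority still survives."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size: int, alive: int) -> bool:
    """True if the surviving servers can still elect a leader."""
    return alive >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # Columns: ensemble size, quorum size, failures tolerated.
    for n in (3, 4, 5, 6):
        print(n, quorum_size(n), tolerated_failures(n))
```

A 4-node cluster tolerates one failure, the same as a 3-node cluster, which is one reason odd sizes are preferred.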

https://zookeeper.apache.org/

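  When a client issues a transaction request, the result is applied consistently on every machine in the ZooKeeper cluster: it never happens that some machines apply the transaction while others do not. Whichever server a client connects to, it sees the same data model. ZooKeeper also preserves the order of client requests: every transaction request is assigned a globally unique, monotonically increasing ID (the zxid) that reflects the order of transactions. ZooKeeper keeps the full data set in memory and serves all non-transactional (read) requests directly from it, so it performs especially well in read-heavy workloads.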

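  ZooKeeper organizes its data as a tree whose root node is "/". Servers in a cluster take one of three roles: leader, follower, or observer. The leader processes all client write requests. Followers serve client read requests, forward write requests to the leader, and vote in leader elections. An observer is a special kind of follower: it serves client reads but does not vote, so adding observers increases read capacity without enlarging the voting quorum.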

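  ZooKeeper is a distributed service management framework built on the observer pattern. It stores and manages data and accepts registrations from observers (watchers). When the state of that data changes, ZooKeeper notifies the observers that registered interest in it so that they can react.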
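
The register-then-notify flow can be sketched in memory as follows. This is not ZooKeeper client code: TinyRegistry and its methods are hypothetical names, and the one-shot behaviour is modelled on standard ZooKeeper watches, which fire once and then have to be registered again.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Watcher = Callable[[str, bytes], None]


class TinyRegistry:
    """In-memory stand-in for a data store with one-shot watches."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._watches: Dict[str, List[Watcher]] = defaultdict(list)

    def get(self, path: str, watch: Optional[Watcher] = None) -> bytes:
        """Read a value and optionally register a watcher for the next change."""
        if watch is not None:
            self._watches[path].append(watch)
        return self._data.get(path, b"")

    def set(self, path: str, value: bytes) -> None:
        """Store a value, then notify and drop the watchers for that path."""
        self._data[path] = value
        for notify in self._watches.pop(path, []):
            notify(path, value)


if __name__ == "__main__":
    events = []
    registry = TinyRegistry()
    registry.set("/cmz", b"caimengzhi")
    registry.get("/cmz", watch=lambda path, value: events.append((path, value)))
    registry.set("/cmz", b"cmz")    # the registered watcher fires here
    registry.set("/cmz", b"again")  # no watcher is left, so nothing fires
    print(events)
```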

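  ZooKeeper keeps its nodes consistent with ZAB (ZooKeeper Atomic Broadcast). ZAB is often compared to Paxos, but it is a separate protocol designed for ZooKeeper rather than an implementation of Paxos. It guarantees data consistency across the distributed environment. High availability in distributed deployments is a core ZooKeeper property. On the client side, a third-party client such as the Curator framework can be used.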

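  The cluster in this guide has 3 nodes (an odd number). ZooKeeper serves clients on port 2181 by default. The 3 nodes communicate with each other on ports 2888 and 3888, the values conventionally used in the server.N entries.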

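2. Cluster modes

2.1 Standalone mode

  If zoo.cfg contains no server.id entries, as in the example below, or only one, ZooKeeper runs in standalone mode. In this mode, if the host goes down, every service that depends on this ZooKeeper instance stops working; this is a single point of failure. Standalone mode is therefore normally used only in test environments.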

Standalone deployment
root@leco:/usr/local/src# ls
scala-2.11.8.tgz  tomcat  zookeeper-3.4.5.tar.gz
root@leco:/usr/local/src# tar xf  zookeeper-3.4.5.tar.gz  -C /usr/local
root@leco:/usr/local/src# cd /usr/local
root@leco:/usr/local# ln -sf zookeeper-3.4.5 zookeeper
root@leco:/usr/local# cd zookeeper
root@leco:/usr/local/zookeeper# ls
bin          contrib          ivy.xml      README_packaging.txt  zookeeper-3.4.5.jar
build.xml    dist-maven       lib          README.txt            zookeeper-3.4.5.jar.asc
CHANGES.txt  docs             LICENSE.txt  recipes               zookeeper-3.4.5.jar.md5
conf         ivysettings.xml  NOTICE.txt   src                   zookeeper-3.4.5.jar.sha1
root@leco:/usr/local/zookeeper# mkdir zkData
root@leco:/usr/local/zookeeper# cd zkData/
root@leco:/usr/local/zookeeper/zkData# pwd
/usr/local/zookeeper/zkData
root@leco:/usr/local/zookeeper/zkData# cd /usr/local/zookeeper/conf
root@leco:/usr/local/zookeeper/conf# ls
configuration.xsl  log4j.properties  zoo_sample.cfg
root@leco:/usr/local/zookeeper/conf# cp zoo_sample.cfg zoo.cfg 
root@leco:/usr/local/zookeeper/conf# sed -i 's@dataDir=/tmp/zookeeper@dataDir=/usr/local/zookeeper/zkData@g' zoo.cfg
root@leco:/usr/local/zookeeper/conf# egrep -v '#|^$' zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/zkData
clientPort=2181

Start the ZK service
root@leco:/usr/local/zookeeper/bin# ./zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
root@leco:/usr/local/zookeeper/bin# ./zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: standalone
root@leco:/usr/local/zookeeper/bin# ./zkCli.sh
Connecting to localhost:2181
2019-08-07 11:19:31,442 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2019-08-07 11:19:31,446 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=localhost
2019-08-07 11:19:31,446 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_40
2019-08-07 11:19:31,446 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2019-08-07 11:19:31,447 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/jdk1.8.0_40/jre
2019-08-07 11:19:31,447 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/local/zookeeper/bin/../build/classes:/usr/local/zookeeper/bin/../build/lib/*.jar:/usr/local/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/local/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/local/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper/bin/../zookeeper-3.4.5.jar:/usr/local/zookeeper/bin/../src/java/lib/*.jar:/usr/local/zookeeper/bin/../conf:.:/usr/lib/jvm/jdk1.8.0_40/lib:/usr/lib/jvm/jdk1.8.0_40/jre/lib
2019-08-07 11:19:31,447 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-08-07 11:19:31,447 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2019-08-07 11:19:31,447 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2019-08-07 11:19:31,448 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2019-08-07 11:19:31,448 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2019-08-07 11:19:31,448 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=4.15.0-47-generic
2019-08-07 11:19:31,448 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2019-08-07 11:19:31,448 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2019-08-07 11:19:31,448 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/usr/local/zookeeper-3.4.5/bin
2019-08-07 11:19:31,450 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@5c29bfd
Welcome to ZooKeeper!
2019-08-07 11:19:31,480 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@966] - Opening socket connection to server localhost/192.168.5.110:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2019-08-07 11:19:31,562 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@849] - Socket connection established to localhost/192.168.5.110:2181, initiating session
[zk: localhost:2181(CONNECTING) 0] 2019-08-07 11:19:31,589 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1207] - Session establishment complete on server localhost/192.168.5.110:2181, sessionid = 0x16c6a1676db0000, negotiated timeout = 30000
WATCHER::

WatchedEvent state:SyncConnected type:None path:null

[zk: localhost:2181(CONNECTED) 0] create /cmz caimengzhi
Created /cmz
[zk: localhost:2181(CONNECTED) 2] get /cmz
caimengzhi
cZxid = 0x2
ctime = Wed Aug 07 11:19:43 CST 2019
mZxid = 0x2
mtime = Wed Aug 07 11:19:43 CST 2019
pZxid = 0x2
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 10
numChildren = 0

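2.2 Pseudo-distributed mode

  If zoo.cfg lists several server.id entries that all use the current machine's IP but different ports, the instances start as a pseudo-cluster. This mode has the same weakness as standalone mode, since everything still runs on one host, and it is likewise meant for test environments.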
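
To make the "same IP, different ports" idea concrete, the sketch below generates one zoo.cfg text per instance. Everything specific in it is an assumption chosen for illustration: the function name pseudo_cluster_configs, the port numbering (client ports 2181 to 2183, peer ports 2888 to 2890, election ports 3888 to 3890), and the zk1/zk2/zk3 directory layout.

```python
def pseudo_cluster_configs(base_dir, count=3, host="127.0.0.1"):
    """Return {server_id: zoo.cfg text} for `count` instances on one host."""
    # Every instance lists the same servers, each with its own port pair.
    server_lines = [
        f"server.{i}={host}:{2887 + i}:{3887 + i}" for i in range(1, count + 1)
    ]
    configs = {}
    for i in range(1, count + 1):
        lines = [
            "tickTime=2000",
            "initLimit=10",
            "syncLimit=5",
            f"dataDir={base_dir}/zk{i}/zkData",  # separate data dir per instance
            f"clientPort={2180 + i}",            # separate client port per instance
        ]
        configs[i] = "\n".join(lines + server_lines) + "\n"
    return configs


if __name__ == "__main__":
    for server_id, text in pseudo_cluster_configs("/usr/local/zookeeper").items():
        print(f"--- instance {server_id} ---")
        print(text)
```

Each instance would also need its own myid file in its own dataDir, as described in the configuration section.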

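2.3 Fully distributed mode

  Each machine has its own zoo.cfg, and every machine appears in every other machine's server list. The cluster built in section 3 is fully distributed. This is the mode used in real production environments.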

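3. Setup

3.1 Environment preparation

No.  IP address      Hostname  Role           Software
1    192.168.186.10  master    hadoop master  zookeeper
2    192.168.186.11  slave1    hadoop slave1  zookeeper
3    192.168.186.12  slave2    hadoop slave2  zookeeper

These are the same three machines as before. A ZK cluster has no fixed leader or follower roles; they are decided by election.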

3.2 Install the JDK

  The JDK was installed earlier, so the steps are omitted here.

[root@master ~]# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

[root@slave1 ~]# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

[root@slave2 ~]# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

3.3 Deployment

  Deploying ZooKeeper is fairly simple: unpack the archive, configure it, and start it.

[root@master sbin]# cd /usr/local/src/
[root@master src]# ls
hadoop-2.6.5.tar.gz  jdk-8u172-linux-x64.tar.gz
[root@master src]# rz -E
rz waiting to receive.
[root@master src]# ls
hadoop-2.6.5.tar.gz  jdk-8u172-linux-x64.tar.gz  zookeeper-3.4.5.tar.gz
[root@master src]# tar xf zookeeper-3.4.5.tar.gz 
[root@master src]# ls
hadoop-2.6.5.tar.gz  jdk-8u172-linux-x64.tar.gz  zookeeper-3.4.5  zookeeper-3.4.5.tar.gz

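3.4 Configuration

  Go to the conf directory, copy zoo_sample.cfg to zoo.cfg, and edit zoo.cfg to configure ZooKeeper. The file name is fixed: at startup zkServer.sh looks for conf/zoo.cfg by default and starts the service from the settings in it.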

[root@master zookeeper-3.4.5]# cd conf/
[root@master conf]# ls
configuration.xsl  log4j.properties  zoo_sample.cfg
[root@master conf]# cp zoo_sample.cfg zoo.cfg 
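  Two things need to change in this file:

  • dataDir: the directory where ZooKeeper stores its data. The sample value is under /tmp, where Linux may delete files when disk space runs low or the server reboots, so always change it. Point it at a directory you manage yourself.

  • Server list (fully distributed mode): configure it on every machine.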

[root@master conf]# egrep -v '#|^$' zoo.cfg 
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/zkData
clientPort=2181
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888

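dataDir also determines where the myid file lives: myid must be placed in the dataDir directory, otherwise the server will not start.

Configuration file notes

1. Ports
    2181: serves client connections
    3888: used for leader election
    2888: used for communication between cluster members (the leader listens on this port)
2. Distributed configuration
server.3=slave2:2888:3888
       |    |    |    |____ used for leader election
       |    |    |_________ used for communication between cluster members
       |    |______________ IP address or hostname of the machine
       |___________________ server number, i.e. the content of myid, which identifies the machine
3. Other settings
    tickTime  - heartbeat interval
        A "tick" is one beat of a clock, so tickTime is the heartbeat interval. The unit is milliseconds, and the sample value is 2000, i.e. one heartbeat every two seconds.
        What tickTime means: clients and servers, and servers among themselves, keep each other alive with heartbeats sent once per tickTime. Heartbeats reveal whether a machine is working, and tickTime is also the base unit for the timeouts between followers and the leader. By default the minimum session timeout is twice the tickTime.

    initLimit
        The maximum number of ticks (multiples of tickTime) that a follower (F) may take to connect to the leader (L) and finish its initial sync.

    syncLimit
        The maximum number of ticks that may pass between a request and its acknowledgement between a follower (F) and the leader (L).

    dataDir
        The directory that holds the myid file (the server's unique ID), the version directory with data snapshots and, unless dataLogDir is set, the transaction logs.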
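
Because initLimit and syncLimit are counted in ticks, the effective timeouts only become visible after multiplying by tickTime. The sketch below does that for the configuration shown earlier. The function names and dictionary keys are hypothetical; the session bounds use ZooKeeper's defaults of 2 and 20 ticks, which apply when minSessionTimeout and maxSessionTimeout are not set.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def derived_timeouts_ms(settings):
    """Convert tick-based limits into milliseconds."""
    tick = int(settings["tickTime"])
    return {
        "init_timeout_ms": tick * int(settings["initLimit"]),
        "sync_timeout_ms": tick * int(settings["syncLimit"]),
        "min_session_timeout_ms": 2 * tick,   # default lower bound
        "max_session_timeout_ms": 20 * tick,  # default upper bound
    }


SAMPLE_CFG = """\
# values taken from the zoo.cfg in this guide
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/zkData
clientPort=2181
"""

if __name__ == "__main__":
    print(derived_timeouts_ms(parse_zoo_cfg(SAMPLE_CFG)))
```

With these values a follower has 20 seconds for its initial sync and 10 seconds to acknowledge a request. The 30000 ms session timeout negotiated in the zkCli logs falls inside the 4000 to 40000 ms session window.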

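  Append these three server lines to the end of the configuration file, using the IPs you planned. 2888 and 3888 are the ports conventionally used here. You can also add host-to-IP mappings to /etc/hosts and write hostnames instead of IPs.

Note: 2888 is the atomic broadcast port and 3888 is the election port. Add one server entry per ZooKeeper node.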
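
The layout of a server entry can be checked by splitting it into its four fields. This is a simplified sketch: ServerEntry and parse_server_line are hypothetical names, and the parser only handles the plain id=host:port:port form used in this guide.

```python
from typing import NamedTuple


class ServerEntry(NamedTuple):
    server_id: int      # must match the number stored in myid
    host: str           # IP address or hostname
    quorum_port: int    # peer communication (2888 in this guide)
    election_port: int  # leader election (3888 in this guide)


def parse_server_line(line):
    """Split a 'server.N=host:port:port' line into a ServerEntry."""
    key, sep, value = line.strip().partition("=")
    if not sep or not key.startswith("server."):
        raise ValueError(f"not a server entry: {line!r}")
    server_id = int(key[len("server."):])
    host, quorum_port, election_port = value.split(":")
    return ServerEntry(server_id, host, int(quorum_port), int(election_port))


if __name__ == "__main__":
    for text in ("server.1=master:2888:3888",
                 "server.2=slave1:2888:3888",
                 "server.3=slave2:2888:3888"):
        print(parse_server_line(text))
```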

  • myid

  In the ZooKeeper data directory configured above (dataDir), create a file named myid containing a single number that identifies this machine.

[root@master zookeeper]# mkdir zkData
[root@master zookeeper]# cd zkData
[root@master zkData]# echo '1' > myid

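Note: the file must be named myid, and it contains only a number: this server's ID in the server list. The value is yours to choose as long as it is unique in the cluster and matches the N of this host's server.N line; using the last digit of the host's address is a handy convention.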
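
A mismatched or misplaced myid is a common reason for a node failing to join, so it can be worth checking before startup. The sketch below is a hypothetical pre-flight check, not something ZooKeeper ships: read_myid and check_myid are made-up names, and the 1..255 range follows the ZooKeeper administrator's guide.

```python
import os
import tempfile


def read_myid(data_dir):
    """Read dataDir/myid and return it as an int in the range 1..255."""
    with open(os.path.join(data_dir, "myid"), encoding="ascii") as handle:
        value = int(handle.read().strip())
    if not 1 <= value <= 255:
        raise ValueError(f"myid {value} is outside 1..255")
    return value


def check_myid(data_dir, server_ids):
    """Return myid if it matches one of the server.N ids, else raise ValueError."""
    myid = read_myid(data_dir)
    if myid not in server_ids:
        raise ValueError(f"myid {myid} has no matching server.{myid} entry")
    return myid


if __name__ == "__main__":
    # Demo in a throwaway directory instead of a real dataDir.
    with tempfile.TemporaryDirectory() as data_dir:
        with open(os.path.join(data_dir, "myid"), "w", encoding="ascii") as handle:
            handle.write("1\n")
        print(check_myid(data_dir, {1, 2, 3}))
```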

3.5 Distribution

  Copy the zookeeper package from master to the other two machines.

[root@master src]# mv zookeeper-3.4.5 /usr/local/
[root@master src]# cd /usr/local/
[root@master local]# ln -sf zookeeper-3.4.5 zookeeper
[root@master local]# rsync -az zookeeper zookeeper-3.4.5 slave1:/usr/local/
[root@master local]# rsync -az zookeeper zookeeper-3.4.5 slave2:/usr/local/
Configure the myid file (it belongs in dataDir, /usr/local/zookeeper/zkData)
[root@slave1 ~]# cd /usr/local/zookeeper/zkData
[root@slave1 zkData]# echo '2' > myid

[root@slave2 ~]# cd /usr/local/zookeeper/zkData
[root@slave2 zkData]# echo '3' > myid

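3.6 Startup

  The commands for controlling ZooKeeper are listed below. Run them with absolute paths, or with relative paths from the ZooKeeper installation directory as shown here.

# Start the ZK service:   bin/zkServer.sh start
# Stop the ZK service:    bin/zkServer.sh stop
# Restart the ZK service: bin/zkServer.sh restart
# Check ZK status:        bin/zkServer.sh status
Detailed steps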
[root@master zookeeper]# bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[root@slave1 zookeeper]# bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[root@slave2 zookeeper]# bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Check status
[root@master zookeeper]# ./bin/zkServer.sh  status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower

[root@slave1 zookeeper]# ./bin/zkServer.sh  status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: leader

[root@slave2 zookeeper]# ./bin/zkServer.sh  status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower

A follower is a replica; the leader is the primary.

The fully distributed ZooKeeper service is now up and running.
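
When checking several nodes, reading the Mode line by eye gets tedious. The sketch below extracts it from captured status text and reports which hosts claim to be leader. parse_mode and find_leaders are hypothetical helper names, and the sample text simply reuses the status output shown above.

```python
import re


def parse_mode(status_output):
    """Return the value of the 'Mode:' line, or None if it is missing."""
    match = re.search(r"^Mode:\s*(\w+)\s*$", status_output, flags=re.MULTILINE)
    return match.group(1) if match else None


def find_leaders(status_by_host):
    """Return the sorted hosts whose status output reports Mode: leader."""
    return sorted(
        host for host, output in status_by_host.items()
        if parse_mode(output) == "leader"
    )


SAMPLE = (
    "JMX enabled by default\n"
    "Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg\n"
    "Mode: {mode}\n"
)

if __name__ == "__main__":
    statuses = {
        "master": SAMPLE.format(mode="follower"),
        "slave1": SAMPLE.format(mode="leader"),
        "slave2": SAMPLE.format(mode="follower"),
    }
    print(find_leaders(statuses))
```

A healthy cluster should report exactly one leader; zero or several usually means the nodes cannot reach each other on the peer and election ports.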

3.7 Add environment variables

[root@master ~]# tail -2 /etc/profile
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
[root@master ~]# source /etc/profile

4. Usage

4.1 Connecting

[root@master ~]# zkCli.sh 
Connecting to localhost:2181
2019-06-22 17:31:40,434 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2019-06-22 17:31:40,438 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=master
2019-06-22 17:31:40,438 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_172
2019-06-22 17:31:40,438 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2019-06-22 17:31:40,438 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/local/jdk1.8.0_172/jre
2019-06-22 17:31:40,438 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/local/zookeeper/bin/../build/classes:/usr/local/zookeeper/bin/../build/lib/*.jar:/usr/local/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/local/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/local/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper/bin/../zookeeper-3.4.5.jar:/usr/local/zookeeper/bin/../src/java/lib/*.jar:/usr/local/zookeeper/bin/../conf:.:/usr/local/jdk1.8.0_172/lib/dt.jar:/usr/local/jdk1.8.0_172/lib/tools.jar
2019-06-22 17:31:40,438 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-06-22 17:31:40,439 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2019-06-22 17:31:40,439 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2019-06-22 17:31:40,439 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2019-06-22 17:31:40,439 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2019-06-22 17:31:40,439 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.10.0-957.el7.x86_64
2019-06-22 17:31:40,439 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2019-06-22 17:31:40,440 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2019-06-22 17:31:40,440 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/root
2019-06-22 17:31:40,442 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@531d72ca
Welcome to ZooKeeper!
2019-06-22 17:31:40,484 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@966] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2019-06-22 17:31:40,579 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@849] - Socket connection established to localhost/127.0.0.1:2181, initiating session
[zk: localhost:2181(CONNECTING) 0] 2019-06-22 17:31:40,612 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1207] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x16b7e7bb3dc0000, negotiated timeout = 30000
WATCHER::

WatchedEvent state:SyncConnected type:None path:null

[zk: localhost:2181(CONNECTED) 0] 
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]

zkCli commands need their arguments, and znode paths must be absolute.

4.2 Other operations

Create
create /cmz 'caimengzhi'

Read
get /cmz

Update
set /cmz 'cmz'

List
ls /

Detailed session

[zk: localhost:2181(CONNECTED) 1] create cmz 'caimengzhi'
Command failed: java.lang.IllegalArgumentException: Path must start with / character
So the path must be absolute.
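
The same rule can be checked on the client side before a request is sent. The sketch below is a simplified approximation of ZooKeeper's path validation, not a copy of it: validate_znode_path is a hypothetical name, and the real check also rejects certain special characters that are ignored here.

```python
def validate_znode_path(path):
    """Return path unchanged if it looks like a valid absolute znode path."""
    if not path:
        raise ValueError("Path cannot be empty")
    if not path.startswith("/"):
        raise ValueError("Path must start with / character")
    if path == "/":
        return path  # the root node is the only path allowed to end with "/"
    if path.endswith("/"):
        raise ValueError("Path must not end with / character")
    for name in path[1:].split("/"):
        if name == "":
            raise ValueError("Empty node name in path")
        if name in (".", ".."):
            raise ValueError("Relative path components are not allowed")
    return path


if __name__ == "__main__":
    for candidate in ("/cmz", "cmz", "/cmz/", "/a//b"):
        try:
            print(candidate, "->", validate_znode_path(candidate))
        except ValueError as error:
            print(candidate, "-> rejected:", error)
```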

[zk: localhost:2181(CONNECTED) 2] create /cmz 'caimengzhi'
Created /cmz
[zk: localhost:2181(CONNECTED) 3] get /cmz
'caimengzhi'
cZxid = 0x100000002
ctime = Sat Jun 22 17:35:07 CST 2019
mZxid = 0x100000002
mtime = Sat Jun 22 17:35:07 CST 2019
pZxid = 0x100000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 12

[zk: localhost:2181(CONNECTED) 4] set /cmz 'cmz'
cZxid = 0x100000002
ctime = Sat Jun 22 17:35:07 CST 2019
mZxid = 0x100000003
mtime = Sat Jun 22 17:36:33 CST 2019
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
[zk: localhost:2181(CONNECTED) 5] get /cmz
'cmz'
cZxid = 0x100000002
ctime = Sat Jun 22 17:35:07 CST 2019
mZxid = 0x100000003
mtime = Sat Jun 22 17:36:33 CST 2019
pZxid = 0x100000002
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 0
[zk: localhost:2181(CONNECTED) 6] ls / 
[cmz, zookeeper]
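
The zxid values in the output above can be read as two halves: ZooKeeper packs the leader epoch into the high 32 bits and a per-epoch counter into the low 32 bits. A small sketch, where split_zxid is a hypothetical helper name:

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter)."""
    epoch = zxid >> 32            # high 32 bits: leader epoch
    counter = zxid & 0xFFFFFFFF   # low 32 bits: transaction counter in that epoch
    return epoch, counter


if __name__ == "__main__":
    # cZxid and mZxid of /cmz from the cluster session above.
    for label, zxid in (("cZxid", 0x100000002), ("mZxid", 0x100000003)):
        epoch, counter = split_zxid(zxid)
        print(f"{label}: epoch={epoch} counter={counter}")
```

Read this way, /cmz was created by transaction 2 of epoch 1 and last modified by transaction 3 of the same epoch, which matches dataVersion going from 0 to 1 after the set command.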