Introduction
Set up a distributed HBase service with Docker on a Synology NAS.
The nodes are named:
hdfs-master
hdfs-slave1
hdfs-slave2
hdfs-slave3
Preface
- rsync is a great tool and makes syncing files across a cluster easy. I copied an xsync script from the web, shown below:
#!/bin/bash
# Get the number of arguments; exit unless there are exactly 4
pcount=$#
if ((pcount != 4)); then
    echo "Usage: $0 filename servername startno endno"
    exit 1
fi

# Get the file name
p1=$1
fname=$(basename "$p1")
echo fname=$fname

# Resolve the parent directory to an absolute path
pdir=$(cd -P "$(dirname "$p1")"; pwd)
echo pdir=$pdir

# Get the current user name
user=$(whoami)

# Get the hostname prefix and the start/end host numbers
slave=$2
startline=$3
endline=$4

# Loop over the hosts
for ((host = startline; host <= endline; host++)); do
    echo "$pdir/$fname $user@$slave$host:$pdir"
    echo "==================$slave$host=================="
    rsync -rvl "$pdir/$fname" "$user@$slave$host:$pdir"
done
- Usage:
xsync filename servername startno endno
  |      |         |         |      |
  |      |         |         |      .----- end host number
  |      |         |         .------------ start host number
  |      |         .---------------------- hostname prefix; authorize the public key beforehand (ssh-copy-id serverName)
  |      .-------------------------------- a directory or a file name
  .--------------------------------------- the command; ideally add the xsync script to PATH
# Hosts are named servername+no, e.g. hdfs-slave1, hdfs-slave2
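To make the naming rule concrete, the sketch below expands a prefix and a number range into host names and prints the rsync commands instead of running them. `expand_hosts` and `xsync_dry_run` are hypothetical helper names introduced here for illustration; they are not part of the xsync script.

```shell
#!/bin/sh
# Hypothetical helper: expand a hostname prefix plus an inclusive number range
# into one host name per line, mirroring the loop in xsync.
expand_hosts() (
    prefix=$1 n=$2 end=$3
    while [ "$n" -le "$end" ]; do
        echo "$prefix$n"
        n=$((n + 1))
    done
)

# Hypothetical helper: print the rsync command xsync would run for each host,
# without copying anything.
xsync_dry_run() (
    fname=$(basename "$1")
    pdir=$(cd -P "$(dirname "$1")" && pwd)
    for host in $(expand_hosts "$2" "$3" "$4"); do
        echo "rsync -rvl $pdir/$fname $(whoami)@$host:$pdir"
    done
)

expand_hosts hdfs-slave 1 3
```

Running something like `xsync_dry_run /zookeeper/ hdfs-slave 1 3` before the real sync is a cheap way to confirm the argument order and the destination directory.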
- rsync must be installed before xsync can be used
- xcall runs a command on every node of the cluster; the script is:
#!/bin/bash
# Get the number of arguments; exit if there are fewer than 4
pcount=$#
if ((pcount < 4)); then
    echo "Usage: $0 servername startno endno command"
    exit 1
fi

slave=$1
startline=$2
endline=$3
# Everything after the third argument is the remote command
tmp=($@)
params=${tmp[@]:3}
echo $params
user=$(whoami)

for ((host = startline; host <= endline; host++)); do
    echo "==================$slave$host=================="
    ssh "$user@$slave$host" "source /etc/profile;$params"
done
- Usage:
xcall servername startno endno command
  |       |         |      |      |
  |       |         |      |      .----- the command to run remotely
  |       |         |      .------------ end host number
  |       |         .------------------- start host number
  |       .----------------------------- hostname prefix; authorize the public key beforehand (ssh-copy-id serverName)
  .------------------------------------- the command; ideally add the xcall script to PATH
# Hosts are named servername+no, e.g. hdfs-slave1, hdfs-slave2
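As a rough illustration of what each node receives, the sketch below rebuilds the string that xcall passes to ssh. `build_remote_cmd` is a hypothetical name used only for this example.

```shell
#!/bin/sh
# Hypothetical helper: build the command string that xcall hands to ssh.
# Arguments: servername startno endno command...
build_remote_cmd() (
    shift 3    # drop servername, startno and endno
    echo "source /etc/profile;$*"
)

build_remote_cmd hdfs-slave 1 3 jps -l
```

Because the trailing arguments are joined back together with single spaces, quoting inside the command is not preserved; a command that depends on quoting is safer wrapped in a small script on the remote hosts.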
Distributed ZooKeeper deployment
- Extract the ZooKeeper package
tar -zxvf apache-zookeeper-3.7.0-bin.tar.gz -C /zookeeper
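- Edit the configuration; the config files are in apache-zookeeper-3.7.0-bin/conf/
cp conf/zoo_sample.cfg conf/zoo.cfg
vi conf/zoo.cfg

### Settings changed in zoo.cfg
# dataDir sets the directory where ZooKeeper stores its data files
dataDir=/zookeeper/data
# Added: one server.N entry per node
server.1=hdfs-master:2888:3888
server.2=hdfs-slave1:2888:3888
server.3=hdfs-slave2:2888:3888
server.4=hdfs-slave3:2888:3888

### Port notes
# 1. 2181: serves client connections
# 2. 3888: used for leader election
# 3. 2888: used for communication between cluster members (the leader listens on this port)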
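The server entries follow a regular pattern, so they can be generated rather than typed. The sketch below is one way to do that; `zk_server_lines` is a hypothetical helper, and it assumes every node uses the same 2888 and 3888 ports as in this setup.

```shell
#!/bin/sh
# Hypothetical helper: print one server.N line per host, numbering from 1.
# Assumes the ports used in this setup: 2888 for quorum traffic and 3888 for
# leader election.
zk_server_lines() (
    id=1
    for host in "$@"; do
        echo "server.$id=$host:2888:3888"
        id=$((id + 1))
    done
)

zk_server_lines hdfs-master hdfs-slave1 hdfs-slave2 hdfs-slave3
```

The output could be appended to conf/zoo.cfg with `>>`, which keeps the host order and the ids in one place.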
- Distribute the files to the other machines
xsync /zookeeper/ hdfs-slave 1 3
- Set the id of each machine
echo 1 > /zookeeper/data/myid
ssh hdfs-slave1 "echo 2 > /zookeeper/data/myid"
ssh hdfs-slave2 "echo 3 > /zookeeper/data/myid"
ssh hdfs-slave3 "echo 4 > /zookeeper/data/myid"
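Each myid must equal the N of that host's server.N entry, so it can help to read the mapping back out of zoo.cfg instead of tracking it by hand. A minimal sketch, with `zk_myid_map` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read zoo.cfg style text on stdin and print
# "<host> <id>" for every server.N entry.
zk_myid_map() {
    sed -n 's/^server\.\([0-9][0-9]*\)=\([^:]*\):.*/\2 \1/p'
}

printf '%s\n' \
    'dataDir=/zookeeper/data' \
    'server.1=hdfs-master:2888:3888' \
    'server.2=hdfs-slave1:2888:3888' | zk_myid_map
```

Feeding it the real file, for example `zk_myid_map < conf/zoo.cfg`, lists which id belongs in /zookeeper/data/myid on each host.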
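- Start ZooKeeper (the xcall below reaches only hdfs-slave1 to hdfs-slave3; run the same zkServer.sh start on hdfs-master as well, since it is listed as server.1)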
xcall hdfs-slave 1 3 /zookeeper/apache-zookeeper-3.7.0-bin/bin/zkServer.sh start
- Test with the ZooKeeper client
/zookeeper/apache-zookeeper-3.7.0-bin/bin/zkCli.sh
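Besides connecting with the client, `zkServer.sh status` reports each node's role on a line beginning with `Mode:`. The sketch below filters that line out; `zk_mode` is a hypothetical helper, and the sample input is illustrative rather than captured from this cluster.

```shell
#!/bin/sh
# Hypothetical helper: read `zkServer.sh status` output on stdin and print
# only the role reported on each "Mode:" line.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Illustrative input only; real output carries other lines around "Mode:".
printf '%s\n' 'Using config: zoo.cfg' 'Mode: follower' | zk_mode
```

Piping `xcall hdfs-slave 1 3 /zookeeper/apache-zookeeper-3.7.0-bin/bin/zkServer.sh status` through `zk_mode` should show one role per node; a working ensemble has exactly one leader among all of its members.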
HBase deployment
- Extract HBase to /hadoop/hbase
tar -zxvf hbase-3.0.0-alpha-1.tar.gz -C /hadoop/hbase/
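- Configure conf/hbase-env.sh
### An ssh login normally loads only ~/.bashrc and ~/.profile, while the Java environment variables usually live in /etc/profile or /etc/bashrc,
### so set the following explicitly
# Point JAVA_HOME at a valid JDK directory
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_261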
- Configure conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hdfs-master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>${env.HBASE_HOME:-.}/tmp</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>16000</value>
  </property>
  <property>
    <name>hbase.master.info.port</name>
    <value>16010</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hdfs-master,hdfs-slave1,hdfs-slave2,hdfs-slave3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/zookeeper/apache-zookeeper-3.7.0-bin/conf</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
</configuration>
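After editing, it can be useful to read a single setting back, for example to compare the quorum list with zoo.cfg. The sketch below is a plain-text approach under the assumption that each property is written as a `<name>` immediately followed by its `<value>`; `hbase_prop` is a hypothetical helper and not a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> of the property named $1 from
# hbase-site.xml style text read on stdin. Line breaks are removed first so
# the match works whether or not <name> and <value> share a line.
hbase_prop() {
    tr -d '\n' |
        sed -n "s|.*<name>$1</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

hbase_prop hbase.zookeeper.quorum <<'EOF'
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hdfs-master,hdfs-slave1</value>
  </property>
</configuration>
EOF
```

Against the real file this would be `hbase_prop hbase.zookeeper.quorum < conf/hbase-site.xml`.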
- Edit the regionservers file so that it lists one host per line:
hdfs-master
hdfs-slave1
hdfs-slave2
hdfs-slave3
- Start the HBase master
/hadoop/hbase/hbase-3.0.0-alpha-1/bin/hbase-daemon.sh start master
- Start the HBase region servers
# HDFS must be running before HBase is started
start-dfs.sh
xcall hdfs-slave 1 3 /hadoop/hbase/hbase-3.0.0-alpha-1/bin/hbase-daemon.sh start regionserver
- After startup, the processes look like this:
[root@hdfs-master jdk]# xcall hdfs-slave 1 3 jps
jps
==================hdfs-slave1==================
1714 Jps
738 QuorumPeerMain
1530 HRegionServer
78 DataNode
==================hdfs-slave2==================
1446 Jps
716 QuorumPeerMain
76 DataNode
1262 HRegionServer
==================hdfs-slave3==================
182 QuorumPeerMain
649 HRegionServer
828 Jps
415 DataNode
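Scanning jps output by eye gets tedious as the cluster grows. The sketch below reports which of the three daemons seen in this setup are absent from one node's jps output; `missing_daemons` is a hypothetical helper and the daemon list is taken from the output above.

```shell
#!/bin/sh
# Hypothetical helper: read one node's `jps` output on stdin and print every
# expected daemon name that does not appear in it.
missing_daemons() (
    seen=$(awk '{print $2}')
    for name in QuorumPeerMain HRegionServer DataNode; do
        echo "$seen" | grep -qx "$name" || echo "$name"
    done
)

# Sample input with no region server running.
printf '%s\n' '1714 Jps' '738 QuorumPeerMain' '78 DataNode' | missing_daemons
```

For a single node this could be run as `ssh hdfs-slave1 "source /etc/profile;jps" | missing_daemons`; an empty result means all three daemons are up.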
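- Notes
- Stop the whole cluster
- bin/stop-hbase.sh
- Start the whole cluster
- bin/start-hbase.sh
- Open the HBase shell; if it seems unresponsive, wait a little longer, since it is slow to begin with
- bin/hbase shell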
- HBase shell
- Create a table
hbase(main):002:0> create 'student','info'
- Insert data into the table
hbase(main):003:0> put 'student','1001','info:sex','male'
hbase(main):004:0> put 'student','1001','info:age','18'
hbase(main):005:0> put 'student','1002','info:name','Janna'
hbase(main):006:0> put 'student','1002','info:sex','female'
hbase(main):007:0> put 'student','1002','info:age','20'
- Scan the table data
hbase(main):008:0> scan 'student'
hbase(main):009:0> scan 'student',{STARTROW => '1001', STOPROW => '1001'}
hbase(main):010:0> scan 'student',{STARTROW => '1001'}
- View the table structure
hbase(main):011:0> describe 'student'
- Update the data in specific columns
hbase(main):012:0> put 'student','1001','info:name','Nick'
hbase(main):013:0> put 'student','1001','info:age','100'
- View the data of a specific row or a specific column family:column
hbase(main):014:0> get 'student','1001'
hbase(main):015:0> get 'student','1001','info:name'
- Count the rows in the table
hbase(main):021:0> count 'student'
- Delete data
Delete all data for a rowkey:
hbase(main):016:0> deleteall 'student','1001'
Delete a single column of a rowkey:
hbase(main):017:0> delete 'student','1002','info:sex'
- Clear the table data
hbase(main):018:0> truncate 'student'
Note: clearing a table happens in this order: disable first, then truncate.
- Drop the table
The table must first be put into the disabled state:
hbase(main):019:0> disable 'student'
Only then can it be dropped:
hbase(main):020:0> drop 'student'
Note: dropping the table directly fails with: ERROR: Table student is enabled. Disable it first.
- Alter table settings
Keep 3 versions of the data in the info column family:
hbase(main):022:0> alter 'student',{NAME=>'info',VERSIONS=>3}
hbase(main):022:0> get 'student','1001',{COLUMN=>'info:name',VERSIONS=>3}
- View the help for put
hbase:031:0> help "put"
Put a cell 'value' at specified table/row/column and optionally
timestamp coordinates. To put a cell value into table 'ns1:t1' or 't1'
at row 'r1' under column 'c1' marked with the time 'ts1', do:
hbase> put 'ns1:t1', 'r1', 'c1', 'value'
hbase> put 't1', 'r1', 'c1', 'value'
hbase> put 't1', 'r1', 'c1', 'value', ts1
hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:
hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
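Typing many put commands at the prompt is slow, so one option is to generate them and feed them to the shell in a batch. The sketch below only prints the commands; `put_cmds` is a hypothetical helper, and it assumes row keys and values contain no single quotes.

```shell
#!/bin/sh
# Hypothetical helper: print one HBase shell put command per column=value
# pair. Arguments: table rowkey columnfamily column=value...
put_cmds() (
    table=$1 row=$2 family=$3
    shift 3
    for pair in "$@"; do
        printf "put '%s','%s','%s:%s','%s'\n" \
            "$table" "$row" "$family" "${pair%%=*}" "${pair#*=}"
    done
)

put_cmds student 1001 info sex=male age=18
```

The printed lines can be piped into `bin/hbase shell -n`, the shell's non-interactive mode; it is worth confirming that flag against the installed HBase version before relying on it.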