Hbase分布式部署

Haddoop+Zookeeper+Hbase

Posted by AnAn on October 12, 2021

介绍

使用群晖Docker 搭建Hbase分布式服务
命名分别为:

hdfs-master     
hdfs-slave1     
hdfs-slave2     
hdfs-slave3     

目录

前言

  1. rsync是个好东西,分布式同步太好用了,在网上Copy了一份xsync脚本,如下:
      #!/bin/sh  
      # 获取输入参数个数,如果没有参数,直接退出     
      pcount=$#     
      if((pcount!=4)); then     
       echo Usage: $0 filename servername startno endno     
       exit;     
      fi     
      # 获取文件名称     
      p1=$1     
      fname=`basename $p1`     
      echo fname=$fname  
      # 获取上级目录到绝对路径     
      pdir=`cd -P $(dirname $p1); pwd`     
      echo pdir=$pdir     
      # 获取当前用户名称     
      user=`whoami`     
      # 获取hostname及起止号     
      slave=$2     
      startline=$3     
      endline=$4     
      # 循环     
      for((host=$startline; host<=$endline; host++)); do     
       echo $pdir/$fname $user@$slave$host:$pdir     
       echo ==================$slave$host==================     
       rsync -rvl $pdir/$fname $user@$slave$host:$pdir     
      done     
    
  • 使用方式:
    xsync filename servername startno endno     
      |     |           |       |       |.-----终止主机号     
      |     |           |       |     
      |     |           |       .--------------起始主机号     
      |     |           |     
      |     |           .----------------------主机名前缀,需要提前认证公钥(ssh-copy-id serverName)     
      |     |     
      |     .----------------------------------可以是目录或者文件名               
      |     
      .----------------------------------------命令,最好将xsync脚本加入环境变量     
      #主机命名规则为 servername+no 如hdfs-slave1     
                                                                           hdfs-slave2     
    
  • 使用xsync前需要安装rsync
  • xcall用于分布式命令执行,脚本如下:
    #!/bin/bash     
    #获取输入参数个数,如果没有参数,直接退出     
    pcount=$#     
    if((pcount<4)); then     
        echo Usage: $0 servername startno endno command     
        exit;     
    fi     
    slave=$1     
    startline=$2     
    endline=$3     
    tmp=($@)     
    params=${tmp[@]:3}     
    echo $params     
    user=`whoami`     
    for((host=$startline; host<=$endline; host++)); do     
      echo ==================$slave$host==================     
      ssh $user@$slave$host "source /etc/profile;$params"     
    done     
    
  • 使用方式:
    xcall servername  startno endno   command     
      |     |           |       |       |.-----远程调用命令     
      |     |           |       |     
      |     |           |       .--------------终止主机号     
      |     |           |     
      |     |           .----------------------起始主机号     
      |     |     
      |     .----------------------------------主机名前缀,需要提前认证公钥(ssh-copy-id serverName)               
      |     
      .----------------------------------------命令,最好将xsync脚本加入环境变量     
      #主机命名规则为 servername+no 如hdfs-slave1     
                                   hdfs-slave2     
    

zookeeper分布式部署

  1. 解压zookeeper安装包
      tar -zxvf  apache-zookeeper-3.7.0-bin.tar.gz -C /opt/software     
    
  2. 修改配置,配置文件路径为 apache-zookeeper-3.7.0-bin/conf/
      cp conf/zoo_sample.cfg zoo.cfg     
      vi conf/zoo.cfg     
      ###以下为zoo.cfg中修改的内容     
      dataDir=/zookeeper/data #dataDir属性设置zookeeper的数据文件存放的目录     
      server.1=hdfs-master:2888:3888#增加     
      server.2=hdfs-slave1:2888:3888#增加     
      server.3=hdfs-slave2:2888:3888增加     
      server.4=hdfs-slave3:2888:3888增加     
      ###端口说明     
      #1、2181:对client端提供服务      
      #2、3888:选举leader使用           
      #3、2888:集群内机器通讯使用(Leader监听此端口)     
    
  3. 内容分发到其他机器
      xsync hdfs-slave 1 3 /zookeeper/     
    
  4. 设定各个机器的id
      echo 1 > /zookeeper/data/myid     
      ssh hdfs-slave1 "echo 2 > /zookeeper/data/myid"     
      ssh hdfs-slave2 "echo 3 > /zookeeper/data/myid"     
      ssh hdfs-slave3 "echo 4 > /zookeeper/data/myid"     
    
  5. 启动zookeeper
      xcall hdfs-slave 1 3 /zookeeper/apache-zookeeper-3.7.0-bin/bin/zkServer.sh start     
    
  6. zookeeper客户端测试
      /zookeeper/apache-zookeeper-3.7.0-bin/bin/zkCli.sh     
    

Hbase部署

  1. 解压Habse到/hadoop/hbase
      tar -zxvf hbase-3.0.0-alpha-1.tar.gz -C /hadoop/hbase/     
    
  2. 配置conf/hbase-env.sh
      ###一般ssh登陆只会加载~/.bashrc和~/.profile  而JAVA环境变量一般在/etc/profile或者/etc/bashrc中,     
      ###所以要修改以下配置     
      export JAVA_HOME=/usr/local/jdk/jdk1.8.0_261#修改JAVA_HOME目录为有效目录     
    
  3. 配置conf/habse-site.xml
     <configuration>      
         <property>                                                                                                                                
             <name>hbase.rootdir</name>                                                                                                            
             <value>hdfs://hdfs-master:9000/hbase</value>                                                                                          
           </property>                                                                                                                             
         <property>                                                                                                                                
             <name>hbase.cluster.distributed</name>                                                                                                
             <value>true</value>                                                                                                                   
           </property>                                                                                                                             
         <property>                                                                                                                                
             <name>hbase.tmp.dir</name>                                                                                                            
             <value>${env.HBASE_HOME:-.}/tmp</value>                                                                                               
           </property>                                                                                                                             
         <property>                                                                                                                                
             <name>hbase.master.port</name>                                                                                                        
             <value>16000</value>                                                                                                                  
           </property>                                                                                                                             
         <property>                                                                                                                                
             <name>hbase.master.info.port</name>                                                                                                   
             <value>16010</value>                                                                                                                  
           </property>                                                                                                                             
         <property>                                                                                                                                
             <name>hbase.zookeeper.quorum</name>                                                                                                   
             <value>hdfs-master,hdfs-slave1,hdfs-slave2,hdfs-slave3</value>                                                                        
           </property>                                                                                                                             
         <property>                                                                                                                                
             <name>hbase.zookeeper.property.dataDir</name>                                                                                         
             <value>/zookeeper/apache-zookeeper-3.7.0-bin/conf</value>                                                                             
           </property>                                                                                                                             
         <property>                                                                                                                                
             <name>hbase.unsafe.stream.capability.enforce</name>                                                                                   
             <value>false</value>                                                                                                                  
           </property>                                                                                                                             
     </configuration>      
    
  4. 修改regionservers内容如下
      hdfs-master                                                                                                                               
      hdfs-slave1                                                                                                                               
      hdfs-slave2                                                                                                                               
      hdfs-slave3      
    
  5. 启动hbase-master
      /hadoop/hbase/hbase-3.0.0-alpha-1/bin/hbase-daemon.sh start master     
    
  6. 启动hbase-regionserver
      #启动hdfs前需要先启动hdfs start-dfs.sh      
      xcall hdfs-slave 1 3 /hadoop/hbase/hbase-3.0.0-alpha-1/bin/hbase-daemon.sh start regionserver     
    
  7. 启动后的效果如下:
      [root@hdfs-master jdk]# xcall hdfs-slave 1 3 jps                                                                                          
      jps                                                                                                                                       
      ==================hdfs-slave1==================                                                                                           
      1714 Jps                                                                                                                                  
      738 QuorumPeerMain                                                                                                                        
      1530 HRegionServer                                                                                                                        
      78 DataNode                                                                                                                               
      ==================hdfs-slave2==================                                                                                           
      1446 Jps                                                                                                                                  
      716 QuorumPeerMain                                                                                                                        
      76 DataNode                                                                                                                               
      1262 HRegionServer                                                                                                                        
      ==================hdfs-slave3==================                                                                                           
      182 QuorumPeerMain                                                                                                                        
      649 HRegionServer                                                                                                                         
      828 Jps                                                                                                                                   
      415 DataNode        
    
  8. 备注
    • 群关闭
    • bin/stop-hbase.sh
      - 群启动
    • bin/start-hbase.sh
      - 进入Hbase客户端,没反应的话多等一会儿,本来就比较慢
       bin/hbase shell     
      
  9. Hbase-shell
  10. 创建表
    hbase(main):002:0> create ‘student’,’info’
  11. 插入数据到表
    hbase(main):003:0> put ‘student’,’1001’,’info:sex’,’male’
    hbase(main):004:0> put ‘student’,’1001’,’info:age’,’18’
    hbase(main):005:0> put ‘student’,’1002’,’info:name’,’Janna’
    hbase(main):006:0> put ‘student’,’1002’,’info:sex’,’female’
    hbase(main):007:0> put ‘student’,’1002’,’info:age’,’20’
  12. 扫描查看表数据
    hbase(main):008:0> scan ‘student’
    hbase(main):009:0> scan ‘student’,{STARTROW => ‘1001’, STOPROW =>
    ‘1001’}
    hbase(main):010:0> scan ‘student’,{STARTROW => ‘1001’}
  13. 查看表结构
    hbase(main):011:0> describe ‘student’
  14. 更新指定字段的数据
    hbase(main):012:0> put ‘student’,’1001’,’info:name’,’Nick’
    hbase(main):013:0> put ‘student’,’1001’,’info:age’,’100’
  15. 查看“指定行”或“指定列族:列”的数据
    hbase(main):014:0> get ‘student’,’1001’
    hbase(main):015:0> get ‘student’,’1001’,’info:name’
  16. 统计表数据行数
    hbase(main):021:0> count ‘student’
  17. 删除数据
    删除某 rowkey 的全部数据:
    hbase(main):016:0> deleteall ‘student’,’1001’
    删除某 rowkey 的某一列数据:
    hbase(main):017:0> delete ‘student’,’1002’,’info:sex’
  18. 清空表数据
    hbase(main):018:0> truncate ‘student’
    提示: 清空表的操作顺序为先 disable,然后再 truncate。
  19. 删除表
    首先需要先让该表为 disable 状态:
    hbase(main):019:0> disable ‘student’
    然后才能 drop 这个表:
    hbase(main):020:0> drop ‘student’
    提示: 如果直接 drop 表,会报错: ERROR: Table student is enabled. Disable it first.
  20. 变更表信息
    将 info 列族中的数据存放 3 个版本:
    hbase(main):022:0> alter ‘student’,{NAME=>’info’,VERSIONS=>3}
    hbase(main):022:0> get
    ‘student’,’1001’,{COLUMN=>’info:name’,VERSIONS=>3}
  21. 查看put的帮助信息
    hbase:031:0> help “put”
    Put a cell ‘value’ at specified table/row/column and optionally
    timestamp coordinates. To put a cell value into table ‘ns1:t1’ or ‘t1’
    at row ‘r1’ under column ‘c1’ marked with the time ‘ts1’, do:
    hbase> put ‘ns1:t1’, ‘r1’, ‘c1’, ‘value’
    hbase> put ‘t1’, ‘r1’, ‘c1’, ‘value’
    hbase> put ‘t1’, ‘r1’, ‘c1’, ‘value’, ts1
    hbase> put ‘t1’, ‘r1’, ‘c1’, ‘value’, {ATTRIBUTES=>{‘mykey’=>’myvalue’}}
    hbase> put ‘t1’, ‘r1’, ‘c1’, ‘value’, ts1, {ATTRIBUTES=>{‘mykey’=>’myvalue’}}
    hbase> put ‘t1’, ‘r1’, ‘c1’, ‘value’, ts1, {VISIBILITY=>’PRIVATE|SECRET’}
    The same commands also can be run on a table reference. Suppose you had a reference
    t to table ‘t1’, the corresponding command would be:
    hbase> t.put ‘r1’, ‘c1’, ‘value’, ts1, {ATTRIBUTES=>{‘mykey’=>’myvalue’}}