Notes on a Pseudo-Distributed HBase Install (Pitfalls Included)

I first tried installing HBase directly on macOS Catalina and failed outright, so I switched to Docker. These notes record the installation steps and the pitfalls hit along the way. Everything below is done inside Docker.

Installing HDFS

Choosing a base image

I like CentOS, so I picked a basic CentOS 8 image. I originally used CentOS 7, but systemd in CentOS 7 containers seems to have unresolved issues; after hitting them I moved to CentOS 8.

Create and start the container

When creating and starting the container, pay attention to the flags: --privileged together with /usr/sbin/init as the command is what allows systemd to run inside the container.

docker pull roboxes/centos8:latest
docker run -it -d --privileged --name centos8 roboxes/centos8:latest /usr/sbin/init
# enter the container as root
docker exec -it centos8 su
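
To confirm that systemd actually came up inside the container (a quick sanity check I'm adding here, not a step from the original notes):

docker exec centos8 systemctl is-system-running
# "running" or "degraded" means systemd is up; anything else means /usr/sbin/init did not start properly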

Install dependencies

yum -y install java-1.8.0-openjdk \
  java-1.8.0-openjdk-devel \
  which \
  sudo \
  wget \
  tree \
  openssh-server \
  openssh-clients \
  initscripts

Create the hadoop user group

groupadd hadoop

Create the users

useradd -m -g hadoop hadoop \
 && useradd -m -g hadoop hdfs \
 && useradd -m -g hadoop mapred \
 && useradd -m -g hadoop yarn \
 && useradd -m -g hadoop hbase
 # each user gets a home directory created automatically under /home and belongs to the hadoop group

Installation package

For this first pass I downloaded the tarball on the host and copied it into the container. Later the plan is to mount the ./share and ./secret directories (created under the current directory on the host) when starting the container, which makes it easy to share resources across the nodes of a fully distributed cluster and keep their configuration identical; a sketch of the mount-based run follows, and the copy-based steps used here come right after.
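
A sketch of the mount-based docker run, assuming ./share and ./secret exist in the current directory on the host; the in-container mount points /share and /secret are my own choice, not from the original:

docker run -it -d --privileged --name centos8 \
  -v "$(pwd)/share":/share \
  -v "$(pwd)/secret":/secret \
  roboxes/centos8:latest /usr/sbin/init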

# everything is downloaded on the host first, then copied into the container
# on the host
docker cp ./hadoop-3.3.1-aarch64.tar.gz centos8:/usr/local/
# inside the container
cd /usr/local
tar -xzvf ./hadoop-3.3.1-aarch64.tar.gz
cd hadoop-3.3.1

A look at the files in the hadoop directory:

(screenshot: listing of the hadoop-3.3.1 directory)

Here etc is the configuration directory and logs is the log directory.

Give all users in the hadoop group rwx permissions on the Hadoop directory:

chown -R hadoop:hadoop hadoop-3.3.1
chmod -R g+rwx hadoop-3.3.1

Set environment variables

vi /etc/profile # applies to all users
export HADOOP_HOME=/usr/local/hadoop-3.3.1
export HBASE_HOME=/home/hbase/hbase-2.4.8
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-2.el8_5.x86_64
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$HBASE_HOME/bin
source /etc/profile
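
A quick check that the variables took effect (plain verification commands, nothing project-specific):

echo $HADOOP_HOME $JAVA_HOME
java -version
hadoop version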

Set JAVA_HOME for Hadoop

Pitfall here: according to the comments in hadoop-env.sh, every platform except OS X must set JAVA_HOME again in hadoop-env.sh before starting Hadoop.

# in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr
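
The same change can be made non-interactively; a small sketch, assuming hadoop-env.sh sits in its default location under $HADOOP_HOME/etc/hadoop:

echo 'export JAVA_HOME=/usr' >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh
grep '^export JAVA_HOME' $HADOOP_HOME/etc/hadoop/hadoop-env.sh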

Configure SSH

Another pitfall here.

ssh localhost

# if the connection asks for a password or fails, create a key pair:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
# if it still fails, start sshd again
/usr/sbin/sshd
# if it still asks for a password or fails, check for /run/nologin and remove it
ls -l /run/nologin 
rm -rf /run/nologin
# repeat the key setup above for every user (see the sketch below)
su hdfs # switch user; Ctrl+D drops back to the previous user
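
Since the key setup has to be repeated for each user, a small loop over the users created earlier can do it in one go (the user list is taken from the useradd step above; adjust as needed):

for u in hadoop hdfs hbase; do
  sudo -u "$u" -H bash -c '
    mkdir -p ~/.ssh && chmod 700 ~/.ssh
    [ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 0600 ~/.ssh/authorized_keys
  '
done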

Initialize HDFS

The following must be run as the hdfs user:

hdfs namenode -format

Start HDFS

start-dfs.sh
# check the running processes with jps
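
If start-dfs.sh succeeded, jps should list the HDFS daemons; the web UI check below assumes the default Hadoop 3.x NameNode port:

jps
# expect (stock Hadoop 3.x daemon names): NameNode, DataNode, SecondaryNameNode
curl -s http://localhost:9870 | head -n 3   # NameNode web UI, default port 9870 in Hadoop 3.x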

HDFS example

# create the user directory in HDFS; the official quickstart omits the trailing '/', but in my case mkdir only worked with a trailing '/'
hdfs dfs -mkdir /user
hdfs dfs -mkdir /user/catcher/
hdfs dfs -mkdir /input
hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /input
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar grep /input /output 'dfs[a-z.]+'
hdfs dfs -get /output ./output
cat ./output/*
# stop HDFS
stop-dfs.sh

Hadoop configuration files

core-site.xml:

Add the property:

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>

hdfs-site.xml:

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
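
For reference, here is a sketch that writes both files in one step, assuming these are the only properties you set; the heredocs include the <configuration> wrapper that the snippets above omit, and they overwrite the stock files:

cd $HADOOP_HOME/etc/hadoop

cat > core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

cat > hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF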

Installing HBase

Installation package

As with Hadoop, copy the tarball into the container:

# on the host
docker cp ./hbase-2.4.8-bin.tar.gz centos8:/home/hbase/
# inside the container
su hbase
cd /home/hbase
tar -xzvf ./hbase-2.4.8-bin.tar.gz

(screenshot: listing of the hbase-2.4.8 directory)

The configuration files are in conf.

Configuration files

hbase-env.sh:

export HBASE_MANAGES_ZK=true
export JAVA_HOME=/usr

hbase-site.xml:

Notes:

  • In hbase.rootdir, the HDFS URI is the hostname plus the port; the port must match the one configured in Hadoop's core-site.xml.
  • Set the WAL provider to filesystem, otherwise a WARN is logged and startup fails.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <!--
    The following properties are set for running HBase as a single process on a
    developer workstation. With this configuration, HBase is running in
    "stand-alone" mode and without a distributed file system. In this mode, and
    without further configuration, HBase and ZooKeeper data are stored on the
    local filesystem, in a path under the value configured for `hbase.tmp.dir`.
    This value is overridden from its default value of `/tmp` because many
    systems clean `/tmp` on a regular basis. Instead, it points to a path within
    this HBase installation directory.

    Running against the `LocalFileSystem`, as opposed to a distributed
    filesystem, runs the risk of data integrity issues and data loss. Normally
    HBase will refuse to run in such an environment. Setting
    `hbase.unsafe.stream.capability.enforce` to `false` overrides this behavior,
    permitting operation. This configuration is for the developer workstation
    only and __should not be used in production!__

    See also https://hbase.apache.org/book.html#standalone_dist
  -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop:9000/hbase</value>
  </property>

  <property>
    <name>hbase.wal.provider</name>
    <value>filesystem</value>
  </property>

</configuration>

Copy Hadoop's core-site.xml and hdfs-site.xml into HBase's conf directory:

cp $HADOOP_HOME/etc/hadoop/core-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml ./conf/

Modify the hostname mapping

Important: edit /etc/hosts.

For a fully distributed cluster, the private (loopback) address maps to the local machine's hostname and the external address maps to the hostname you need to reach (i.e. on the Master, 127.0.0.1 maps to the Master's hostname; on a Slave, the Master's external address maps to the Master's hostname).

127.0.0.1       hadoop
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.2      localhost

Create the hbase directory in HDFS

# only the hdfs user can write to the HDFS filesystem, so HBase cannot create the directory at startup and it has to be created manually
su hdfs
hdfs dfs -mkdir /hbase
hdfs dfs -chown hbase /hbase
hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - hbase supergroup          0 2021-12-06 23:52 /hbase

Start the HBase service

Make sure the HDFS service is already up and running.

su hbase
start-hbase.sh

If everything started normally, jps shows the following Java processes:

(screenshot: jps output after starting HBase)
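
For reference, with HBASE_MANAGES_ZK=true a pseudo-distributed setup typically shows the daemons below (standard daemon names; the exact output in the lost screenshot may differ):

jps
# HBase: HMaster, HRegionServer, HQuorumPeer (ZooKeeper managed by HBase)
# HDFS:  NameNode, DataNode, SecondaryNameNode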

Connect with the HBase shell

hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.8, rf844d09157d9dce6c54fcd53975b7a45865ee9ac, Wed Oct 27 08:48:57 PDT 2021
Took 0.0019 seconds                                                                                   
hbase:001:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load
Took 0.9167 seconds                                                                                   
hbase:002:0> list
TABLE                                                                                                 
0 row(s)
Took 0.0306 seconds                                                                                   
=> []
hbase:003:0> create 'test' , 'col1'
Created table test
Took 2.2843 seconds                                                                                   
=> Hbase::Table - test
hbase:004:0> list 'test'
TABLE                                                                                                 
test                                                                                                  
1 row(s)
Took 0.0110 seconds                                                                                   
=> ["test"]
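
A quick read/write smoke test on the new table; these are standard HBase shell commands, not part of the original transcript:

put 'test', 'row1', 'col1:a', 'value1'
get 'test', 'row1'
scan 'test'
# clean up when done
disable 'test'
drop 'test'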

Q&A

  1. Formatting the HDFS filesystem several times leaves the DataNode with a version (clusterID) that no longer matches the NameNode's, so the DataNode cannot start when HDFS starts:

    Fix: on CentOS, the version information for each node lives under /tmp/hadoop-hdfs/dfs/. Delete the data folder and restart HDFS, and the versions will match again (a sketch follows this list). The DataNode must come up successfully, otherwise HBase cannot start either (error: replication cannot be created because no DataNode is available).

  2. Errors such as: java.net.ConnectException: Call From 7998577446f3/172.17.0.2 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; or Unknown host.

    Fix: edit /etc/hosts, and also check that the hostnames and ports in the configuration files are correct.

  3. If the HBase WAL provider is not specified, errors along the lines of "instance not class" appear.

  4. If the error says the host cannot be found even though item 2 is already correct, copy Hadoop's core-site.xml and hdfs-site.xml into HBase's conf folder.
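
A minimal sketch of the cleanup described in item 1, using the path mentioned above (run as the hdfs user; note that it wipes all HDFS data):

stop-dfs.sh
rm -rf /tmp/hadoop-hdfs/dfs/data   # stale DataNode storage carrying the old clusterID
start-dfs.sh
jps                                # DataNode should now be listed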

HBase load testing

HBase ships with the load-testing tool pe; running hbase pe with no arguments prints the usage:

hbase pe
Usage: hbase pe  <OPTIONS> [-D<property=value>]* <command> <nclients>

General Options:
 nomapred        Run multiple clients using threads (rather than use mapreduce)
 oneCon          all the threads share the same connection. Default: False
 connCount          connections all threads share. For example, if set to 2, then all thread share 2 connection. Default: depend on oneCon parameter. if oneCon set to true, then connCount=1, if not, connCount=thread number
 sampleRate      Execute test on a sample of total rows. Only supported by randomRead. Default: 1.0
 period          Report every 'period' rows: Default: opts.perClientRunRows / 10 = 104857
 cycles          How many times to cycle the test. Defaults: 1.
 traceRate       Enable HTrace spans. Initiate tracing every N rows. Default: 0
 latency         Set to report operation latencies. Default: False
 latencyThreshold  Set to report number of operations with latency over lantencyThreshold, unit in millisecond, default 0
 measureAfter    Start to measure the latency once 'measureAfter' rows have been treated. Default: 0
 valueSize       Pass value size to use: Default: 1000
 valueRandom     Set if we should vary value size between 0 and 'valueSize'; set on read for stats on size: Default: Not set.
 blockEncoding   Block encoding to use. Value should be one of [NONE, PREFIX, DIFF, FAST_DIFF, ROW_INDEX_V1]. Default: NONE

Table Creation / Write Tests:
 table           Alternate table name. Default: 'TestTable'
 rows            Rows each client runs. Default: 1048576.  In case of randomReads and randomSeekScans this could be specified along with --size to specify the number of rows to be scanned within the total range specified by the size.
 size            Total size in GiB. Mutually exclusive with --rows for writes and scans. But for randomReads and randomSeekScans when you use size with --rows you could use size to specify the end range and --rows specifies the number of rows within that range. Default: 1.0.
 compress        Compression type to use (GZ, LZO, ...). Default: 'NONE'
 flushCommits    Used to determine if the test should flush the table. Default: false
 valueZipf       Set if we should vary value size between 0 and 'valueSize' in zipf form: Default: Not set.
 writeToWAL      Set writeToWAL on puts. Default: True
 autoFlush       Set autoFlush on htable. Default: False
 multiPut        Batch puts together into groups of N. Only supported by write. If multiPut is bigger than 0, autoFlush need to set to true. Default: 0
 presplit        Create presplit table. If a table with same name exists, it'll be deleted and recreated (instead of verifying count of its existing regions). Recommended for accurate perf analysis (see guide). Default: disabled
 usetags         Writes tags along with KVs. Use with HFile V3. Default: false
 numoftags       Specify the no of tags that would be needed. This works only if usetags is true. Default: 1
 splitPolicy     Specify a custom RegionSplitPolicy for the table.
 columns         Columns to write per row. Default: 1
 families        Specify number of column families for the table. Default: 1

Read Tests:
 filterAll       Helps to filter out all the rows on the server side there by not returning any thing back to the client.  Helps to check the server side performance.  Uses FilterAllFilter internally. 
 multiGet        Batch gets together into groups of N. Only supported by randomRead. Default: disabled
 inmemory        Tries to keep the HFiles of the CF inmemory as far as possible. Not guaranteed that reads are always served from memory.  Default: false
 bloomFilter     Bloom filter type, one of [NONE, ROW, ROWCOL, ROWPREFIX_FIXED_LENGTH]
 blockSize       Blocksize to use when writing out hfiles. 
 inmemoryCompaction  Makes the column family to do inmemory flushes/compactions. Uses the CompactingMemstore
 addColumns      Adds columns to scans/gets explicitly. Default: true
 replicas        Enable region replica testing. Defaults: 1.
 randomSleep     Do a random sleep before each get between 0 and entered value. Defaults: 0
 caching         Scan caching to use. Default: 30
 asyncPrefetch   Enable asyncPrefetch for scan
 cacheBlocks     Set the cacheBlocks option for scan. Default: true
 scanReadType    Set the readType option for scan, stream/pread/default. Default: default
 bufferSize      Set the value of client side buffering. Default: 2MB

 Note: -D properties will be applied to the conf used. 
  For example: 
   -Dmapreduce.output.fileoutputformat.compress=true
   -Dmapreduce.task.timeout=60000

Command:
 append               Append on each row; clients overlap on keyspace so some concurrent operations
 asyncRandomRead      Run async random read test
 asyncRandomWrite     Run async random write test
 asyncScan            Run async scan test (read every row)
 asyncSequentialRead  Run async sequential read test
 asyncSequentialWrite Run async sequential write test
 checkAndDelete       CheckAndDelete on each row; clients overlap on keyspace so some concurrent operations
 checkAndMutate       CheckAndMutate on each row; clients overlap on keyspace so some concurrent operations
 checkAndPut          CheckAndPut on each row; clients overlap on keyspace so some concurrent operations
 cleanMeta            Remove fake region entries on meta table inserted by metaWrite; used with 1 thread
 filterScan           Run scan test using a filter to find a specific row based on it's value (make sure to use --rows=20)
 increment            Increment on each row; clients overlap on keyspace so some concurrent operations
 metaRandomRead       Run getRegionLocation test
 metaWrite            Populate meta table;used with 1 thread; to be cleaned up by cleanMeta
 randomRead           Run random read test
 randomSeekScan       Run random seek and scan 100 test
 randomWrite          Run random write test
 scan                 Run scan test (read every row)
 scanRange10          Run random seek scan with both start and stop row (max 10 rows)
 scanRange100         Run random seek scan with both start and stop row (max 100 rows)
 scanRange1000        Run random seek scan with both start and stop row (max 1000 rows)
 scanRange10000       Run random seek scan with both start and stop row (max 10000 rows)
 sequentialRead       Run sequential read test
 sequentialWrite      Run sequential write test

Args:
 nclients        Integer. Required. Total number of clients (and HRegionServers) running. 1 <= value <= 500
Examples:
 To run a single client doing the default 1M sequentialWrites:
 $ hbase pe sequentialWrite 1
 To run 10 clients doing increments over ten rows:
 $ hbase pe --rows=10 --nomapred increment 10

A random-write stress test against HBase:

hbase pe --nomapred --oneCon=true --valueSize=100 --rows=150000 --autoFlush=true --presplit=64 randomWrite 32
  • --nomapred: run the clients as threads instead of MapReduce; the final argument sets the thread count to 32
  • --oneCon=true: all threads share a single connection
  • --valueSize=100: each value written is 100 bytes
  • --rows=150000: each thread writes 150000 rows
  • --autoFlush=true / --presplit=64: flush each put instead of buffering on the client, and pre-split the table into 64 regions (see the pe usage above)

The results are written to the log and also printed in the terminal. Summary of the run:
2021-12-08 02:07:43,177 INFO  [main] hbase.PerformanceEvaluation: [RandomWriteTest] Summary of timings (ms): [574340, 575492, 575942, 575767, 575317, 575717, 575669, 575595, 575856, 574974, 575712, 575681, 574228, 575398, 574289, 575279, 575914, 575192, 575536, 574928, 574870, 575811, 575836, 575688, 575257, 574956, 574842, 574800, 574091, 574906, 575124, 574990]
2021-12-08 02:07:43,582 INFO  [main] hbase.PerformanceEvaluation: [RandomWriteTest duration ]   Min: 574091ms        Max: 575942ms   Avg: 575249ms
2021-12-08 02:07:43,582 INFO  [main] hbase.PerformanceEvaluation: [ Avg latency (us)]   3827
2021-12-08 02:07:43,582 INFO  [main] hbase.PerformanceEvaluation: [ Avg TPS/QPS]        8344     row per second
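
For the read side, a matching random-read run could look like this; the parameters mirror the write test above and are my own choice, not from the original:

hbase pe --nomapred --oneCon=true --valueSize=100 --rows=150000 randomRead 32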

Q.E.D.

