Hbase: ------ 概述、列存储、安装

Apache Hbase

概述

HBase 是一个基于Hadoop的分布式,可扩展,巨大数据仓库.当用户需要对海量数据进行实时(时效性)随机(记录级别数据)读/写,用户可以使用Hbase.Hbase设计目标是能够持有一张巨大的表,该的规模能达到数十亿行 X 数百万列,并且可以运行在商用的硬件集群之上. Hbase是一个开源的,分布式,版本化的非关系化的数据库-NoSQL,改设计仿照了Google的BigTable设计.

HDFS和HBase区别
1587022472175)(assets/1578278106225.png)]

Hbase是构建在HDFS之上的一个数据库服务,能够使得用户通过HBase数据库服务间接的操作HDFS,能够使得用户对HDFS上的数据实现CRUD操作(细粒度操作)。

Hbase特性-官方

  • 线性和模块化扩展。
  • 严格一致 reads 和 writes.
  • 表的自动和可配置分片(自动分区)
  • RegionServers之间的自动故障转移支持。
  • 方便的基类,用于使用Apache HBase表支持Hadoop MapReduce作业。
  • 易于使用的Java API,用于客户端访问。
  • Block cache 和 Bloom Filters 以进行实时查询。

列存储

常见的NoSQL数据库常见分类:Key-Value- Redis|SSDB Document - MongoDB|Elasticsearch|Solr 列存储 - HBase 图像关系 - Neo4j 等.和关系数据库不同,NoSQL不同种类产品之间不可相互替换.

  • 存储特点-RDBMS
IDnamepasswordagesexaddress
1zhangsan12345618true北上地
2lisi12345620null北京朝阳
3wangwu123456nullnullnull
4zhaoliu123456nullfalsenull

思考:

select id,name,passowrd from t_user where name='zhangsan' and password='123456'
  • 按照name和 password索引快速定位当前记录
  • 数据库底层加载id/name/password/age/sex/address
  • 进行投影过滤出id/name/password

从上面过程不难看出age/sex/address的读取过程是多余的,这一部分IO的读取对于系统而言浪费.- IO利用率低;其次关系型数据库由于不支持稀疏存储(null值不存储),导致null值也会占用磁盘空间,给系统带来磁盘空间的浪费-磁盘利用率低.

  • 解决之道(列共现性问题)

t_user_base

idnamepassword
1zhangsan123456
2lisi123456

| 3 | wangwu | 123456 |
| 4 | zhaoliu | 123456 |

t_user_detail

IDagesexaddress
118true北京上地
220null北京朝阳
4nullfalsenull
  • 列存储(hbase)
    在这里插入图片描述
    RowKey:等价关系型数据库的主键ID

列簇:将IO操作特性相似的列归为一个簇,Hbase底层会以列簇为单位索引数据.

:列簇/列名/列值/时间戳构成

时间戳:用于记录Hbase中数据的版本,一般系统会自动指定为插入数据时间

Hbase安装

HDFS基本环境(存储)

1,安装JDK,配置环境变量JAVA_HOME

[root@CentOS ~]# rpm -ivh jdk-8u171-linux-x64.rpm 
Preparing...                          ################################# [100%]
Updating / installing...
   1:jdk1.8-2000:1.8.0_171-fcs        ################################# [100%]
Unpacking JAR files...
        tools.jar...
        plugin.jar...
        javaws.jar...
        deploy.jar...
        rt.jar...
        jsse.jar...
        charsets.jar...
        localedata.jar...
[root@CentOS ~]# vi .bashrc

JAVA_HOME=/usr/java/latest
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME
export CLASSPATH
export PATH  

[root@CentOS ~]# source .bashrc 
[root@CentOS ~]# jps
1933 Jps

2,关闭防火墙

[root@CentOS ~]# systemctl stop firewalld # 关闭 服务
[root@CentOS ~]# systemctl disable firewalld # 关闭开机自启动
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@CentOS ~]# firewall-cmd --state
not running

3,配置主机名和IP映射关系

[root@CentOS ~]# cat /etc/hostname 
CentOS
[root@CentOS ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.186.150 CentOS

4,配置SSH免密码登录

[root@CentOS ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:6yYiypvclJAZLU2WHvzakxv6uNpsqpwk8kzsjLv3yJA root@CentOS
The key's randomart image is:
+---[RSA 2048]----+
|  .o.            |
|  =+             |
| o.oo            |
|  =. .           |
| +  o . S        |
| o...=   .       |
|E.oo. + .        |
|BXX+o....        |
|B#%O+o o.        |
+----[SHA256]-----+
[root@CentOS ~]# ssh-copy-id CentOS
[root@CentOS ~]# ssh CentOS
Last failed login: Mon Jan  6 14:30:49 CST 2020 from centos on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Mon Jan  6 14:20:27 2020 from 192.168.186.1

5,上传Hadoop安装包,并解压到/usr目录

[root@CentOS ~]# tar -zxf  hadoop-2.9.2.tar.gz -C /usr/

6,配置HADOOP_HOME环境变量

[root@CentOS ~]# vi .bashrc
HADOOP_HOME=/usr/hadoop-2.9.2
JAVA_HOME=/usr/java/latest
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME
export CLASSPATH
export PATH
export HADOOP_HOME
[root@CentOS ~]# source .bashrc    
[root@CentOS ~]# hadoop classpath #打印Hadoop的类路径
/usr/hadoop-2.9.2/etc/hadoop:/usr/hadoop-2.9.2/share/hadoop/common/lib/*:/usr/hadoop-2.9.2/share/hadoop/common/*:/usr/hadoop-2.9.2/share/hadoop/hdfs:/usr/hadoop-2.9.2/share/hadoop/hdfs/lib/*:/usr/hadoop-2.9.2/share/hadoop/hdfs/*:/usr/hadoop-2.9.2/share/hadoop/yarn:/usr/hadoop-2.9.2/share/hadoop/yarn/lib/*:/usr/hadoop-2.9.2/share/hadoop/yarn/*:/usr/hadoop-2.9.2/share/hadoop/mapreduce/lib/*:/usr/hadoop-2.9.2/share/hadoop/mapreduce/*:/usr/hadoop-2.9.2/contrib/capacity-scheduler/*.jar

7,修改core-site.xml

[root@CentOS ~]# vi /usr/hadoop-2.9.2/etc/hadoop/core-site.xml
<!--nn访问入口-->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://CentOS:9000</value>
</property>
<!--hdfs工作基础目录-->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop-2.9.2/hadoop-${user.name}</value>
</property>

8,修改hdfs-site.xml

[root@CentOS ~]# vi /usr/hadoop-2.9.2/etc/hadoop/hdfs-site.xml 
<!--block副本因子-->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<!--配置Sencondary namenode所在物理主机-->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>CentOS:50090</value>
</property>
<!--设置datanode最大文件操作数-->
<property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
</property>
<!--设置datanode并行处理能力-->
<property>
        <name>dfs.datanode.handler.count</name>
        <value>6</value>
</property>

9,修改slaves

[root@CentOS ~]# vi /usr/hadoop-2.9.2/etc/hadoop/slaves 
CentOS

10,格式化NameNode,生成fsimage

[root@CentOS ~]# hdfs namenode -format
[root@CentOS ~]# yum install -y tree
[root@CentOS ~]# tree /usr/hadoop-2.9.2/hadoop-root/
/usr/hadoop-2.9.2/hadoop-root/
└── dfs
    └── name
        └── current
            ├── fsimage_0000000000000000000
            ├── fsimage_0000000000000000000.md5
            ├── seen_txid
            └── VERSION

3 directories, 4 files

11,启动HDFS服务

[root@CentOS ~]# start-dfs.sh 

Zookeeper安装(协调)

1,上传zookeeper的安装包,并解压在/usr目录下

[root@CentOS ~]# tar -zxf zookeeper-3.4.12.tar.gz -C /usr/

2,配置Zookepeer的zoo.cfg

[root@CentOS ~]# tar -zxf zookeeper-3.4.12.tar.gz -C /usr/
[root@CentOS ~]# cd /usr/zookeeper-3.4.12/
[root@CentOS zookeeper-3.4.12]# cp conf/zoo_sample.cfg conf/zoo.cfg
[root@CentOS zookeeper-3.4.12]# vi conf/zoo.cfg 
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/root/zkdata
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

3,创建zookeeper的数据目录

[root@CentOS ~]# mkdir /root/zkdata

4,启动zookeeper服务

[root@CentOS ~]# cd /usr/zookeeper-3.4.12/
[root@CentOS zookeeper-3.4.12]# ./bin/zkServer.sh start zoo.cfg
ZooKeeper JMX enabled by default
Using config: /usr/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@CentOS zookeeper-3.4.12]# ./bin/zkServer.sh status zoo.cfg
ZooKeeper JMX enabled by default
Using config: /usr/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: standalone

Hbase配置与安装(数据库服务)

1,上传Hbase安装包,并解压到/usr目录下

[root@CentOS ~]# tar -zxf hbase-1.2.4-bin.tar.gz -C /usr/

2,配置Hbase环境变量HBASE_HOME

[root@CentOS ~]# vi .bashrc 
HBASE_HOME=/usr/hbase-1.2.4
HADOOP_HOME=/usr/hadoop-2.9.2
JAVA_HOME=/usr/java/latest
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
export JAVA_HOME
export CLASSPATH
export PATH
export HADOOP_HOME
export HBASE_HOME

[root@CentOS ~]# source .bashrc 
[root@CentOS ~]# hbase classpath # 测试Hbase是否识别Hadoop
/usr/hbase-1.2.4/conf:/usr/java/latest/lib/tools.jar:/usr/hbase-1.2.4:/usr/hbase-1.2.4/lib/activation-1.1.jar:/usr/hbase-1.2.4/lib/aopalliance-1.0.jar:/usr/hbase-1.2.4/lib/apacheds-i18n-2.0.0-M15.jar:/usr/hbase-1.2.4/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/hbase-1.2.4/lib/api-asn1-api-1.0.0-M20.jar:/usr/hbase-1.2.4/lib/api-util-1.0.0-M20.jar:/usr/hbase-1.2.4/lib/asm-3.1.jar:/usr/hbase-1.2.4/lib/avro-
...
1.7.4.jar:/usr/hbase-1.2.4/lib/commons-beanutils-1.7.0.jar:/usr/hbase-1.2.4/lib/commons-
2.9.2/share/hadoop/yarn/*:/usr/hadoop-2.9.2/share/hadoop/mapreduce/lib/*:/usr/hadoop-2.9.2/share/hadoop/mapreduce/*:/usr/hadoop-2.9.2/contrib/capacity-scheduler/*.jar

3,配置hbase-site.xml

[root@CentOS ~]# cd /usr/hbase-1.2.4/
[root@CentOS hbase-1.2.4]# vi conf/hbase-site.xml
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://CentOS:9000/hbase</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>CentOS</value>
</property>
<property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
</property>

4,修改hbase-env.sh,将HBASE_MANAGES_ZK修改为false

[root@CentOS ~]# cd /usr/hbase-1.2.4/
[root@CentOS hbase-1.2.4]# grep -i HBASE_MANAGES_ZK conf/hbase-env.sh 
# export HBASE_MANAGES_ZK=true
[root@CentOS hbase-1.2.4]# vi conf/hbase-env.sh 
export HBASE_MANAGES_ZK=false
[root@CentOS hbase-1.2.4]# grep -i HBASE_MANAGES_ZK conf/hbase-env.sh 
export HBASE_MANAGES_ZK=false

export HBASE_MANAGES_ZK=false告知Hbase,使用外部Zookeeper

5,启动Hbase

[root@CentOS hbase-1.2.4]# ./bin/start-hbase.sh 
starting master, logging to /usr/hbase-1.2.4/logs/hbase-root-master-CentOS.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
starting regionserver, logging to /usr/hbase-1.2.4/logs/hbase-root-1-regionserver-CentOS.out
[root@CentOS hbase-1.2.4]# jps
3090 NameNode
5027 HMaster
3188 DataNode
5158 HRegionServer
3354 SecondaryNameNode
5274 Jps
3949 QuorumPeerMain

6,验证Hbase安装是否成功

WebUI验证 http://192.168.186.150:16010/
在这里插入图片描述

- Hbase shell验证(靠谱)

[root@CentOS hbase-1.2.4]# ./bin/hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hbase-1.2.4/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop-2.9.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.4, rUnknown, Wed Feb 15 18:58:00 CST 2017

hbase(main):001:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load

hbase(main):002:0> version
1.2.4, rUnknown, Wed Feb 15 18:58:00 CST 2017

hbase(main):003:0> 

- 链接HDFS查看

WebUI验证 http://192.168.186.150:50070/
在这里插入图片描述

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值