1. Environment Setup
Prepare three CentOS 7.2 servers with 2 cores and 8 GB of memory each, and set their hostnames to ruozedata001, ruozedata002, and ruozedata003. Prepare the required software:
- jdk-8u45-linux-x64.gz
- zookeeper-3.4.6.tar.gz
- hadoop-2.6.0-cdh5.15.1.tar.gz
1.1 Map the internal IPs to the hostnames on all three servers
[root@ruozedata001~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.69.173 ruozedata001
172.16.69.174 ruozedata002
172.16.69.172 ruozedata003
1.2 Configure passwordless SSH trust between the hosts
[hadoop@ruozedata001 ~]$ ssh-keygen
[hadoop@ruozedata001 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Upload the public key files of ruozedata002 and ruozedata003 to ruozedata001 (renamed here to id_rsa_2.pub and id_rsa_3.pub to avoid overwriting each other), then append them:
[hadoop@ruozedata001 ~]$ cat ~/.ssh/id_rsa_2.pub >> ~/.ssh/authorized_keys
[hadoop@ruozedata001 ~]$ cat ~/.ssh/id_rsa_3.pub >> ~/.ssh/authorized_keys
Copy the resulting ~/.ssh/authorized_keys file to the same path on the other two servers, and set its permissions on all three:
[hadoop@ruozedata001 ~]$ chmod 0600 ~/.ssh/authorized_keys
Verify: run the following three commands on every machine. If you only need to type yes on the first connection and are never prompted for a password, the three hosts trust each other:
ssh ruozedata001 date
ssh ruozedata002 date
ssh ruozedata003 date
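The three verification commands above can also be wrapped in a small loop that reports per-host results; this is a sketch using the hostnames from this guide (adjust the list for your cluster):

```shell
#!/usr/bin/env bash
# Check passwordless SSH to each node; prints OK/FAIL per host.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a FAIL line means the trust relationship is not in place yet.
check_ssh_trust() {
  local h
  for h in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" date >/dev/null 2>&1; then
      echo "OK   $h"
    else
      echo "FAIL $h"
    fi
  done
}

check_ssh_trust ruozedata001 ruozedata002 ruozedata003
```

A node that still prompts for a password (or is unreachable) shows up as FAIL rather than hanging the script.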
With that, the server environment is ready; next comes the software installation.
2. Installation
2.1 Install the JDK
[root@ruozedata001 ~]# mkdir /usr/java/
[root@ruozedata001 ~]# cd /usr/java/
Upload or download the JDK archive and extract it here; when extracting, check that the owner and group of the extracted files are correct.
Configure the environment variables in /etc/profile:
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH
[root@ruozedata001 java]# source /etc/profile
[root@ruozedata001 java]# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
2.2 Install ZooKeeper
Extracting the archive creates a versioned directory, so extract into /home/hadoop/app and link it to the path used as ZOOKEEPER_HOME:
[hadoop@ruozedata001 ~]$ tar -zxvf /home/hadoop/software/zookeeper-3.4.6.tar.gz -C /home/hadoop/app
[hadoop@ruozedata001 ~]$ ln -s /home/hadoop/app/zookeeper-3.4.6 /home/hadoop/app/zookeeper
Set the ZooKeeper environment variables in ~/.bash_profile:
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH
[hadoop@ruozedata001 ~]$ source ~/.bash_profile
Modify the configuration file:
[hadoop@ruozedata001 conf]$ cp zoo_sample.cfg zoo.cfg
[hadoop@ruozedata001 conf]$ cat zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/hadoop/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
# communication ports between the ZooKeeper servers
server.1=ruozedata001:2888:3888
server.2=ruozedata002:2888:3888
server.3=ruozedata003:2888:3888
[hadoop@ruozedata001 conf]$ mkdir -p /home/hadoop/data/zookeeper
The myid file identifies each node in the ZooKeeper ensemble; create the data directory on every server, then run the matching command on each one:
[hadoop@ruozedata001 conf]$ echo 1 > /home/hadoop/data/zookeeper/myid
[hadoop@ruozedata002 conf]$ echo 2 > /home/hadoop/data/zookeeper/myid
[hadoop@ruozedata003 conf]$ echo 3 > /home/hadoop/data/zookeeper/myid
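One way to avoid typos when writing the myid files is to derive the id from the hostname; this sketch hard-codes the same hostname-to-id mapping as the server.N lines in zoo.cfg above:

```shell
#!/usr/bin/env bash
# Map each hostname to the ZooKeeper id declared in zoo.cfg (server.N lines).
# The mapping must stay in sync with zoo.cfg on every node.
myid_for_host() {
  case "$1" in
    ruozedata001) echo 1 ;;
    ruozedata002) echo 2 ;;
    ruozedata003) echo 3 ;;
    *) echo "unknown host: $1" >&2; return 1 ;;
  esac
}

# On each node you would then run something like:
#   mkdir -p /home/hadoop/data/zookeeper
#   myid_for_host "$(hostname)" > /home/hadoop/data/zookeeper/myid
myid_for_host ruozedata002
```

Running the same two commented-out commands on every node then writes the correct id without per-host editing.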
ZooKeeper can now be started; run this on all three servers:
[hadoop@ruozedata001 conf]$ zkServer.sh start
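Once all three nodes are up, `zkServer.sh status` on each node should report one leader and two followers. A small helper to pull the role out of that output (the `Mode: ...` line is standard zkServer.sh status output):

```shell
#!/usr/bin/env bash
# Extract the ensemble role from `zkServer.sh status` output.
zk_mode() {
  # Reads the full status output on stdin; prints "leader",
  # "follower", or "standalone" from the "Mode: ..." line.
  awk -F': ' '/^Mode/ {print $2}'
}

# Typical usage on each node (commented out here; needs a running ensemble):
#   zkServer.sh status 2>/dev/null | zk_mode
echo "Mode: follower" | zk_mode
```

If any node reports `standalone`, its zoo.cfg or myid file does not match the rest of the ensemble.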
2.3 Install Hadoop
As with ZooKeeper, extract into /home/hadoop/app and link the versioned directory to the path used as HADOOP_HOME:
[hadoop@ruozedata001 ~]$ tar -zxvf /home/hadoop/software/hadoop-2.6.0-cdh5.15.1.tar.gz -C /home/hadoop/app
[hadoop@ruozedata001 ~]$ ln -s /home/hadoop/app/hadoop-2.6.0-cdh5.15.1 /home/hadoop/app/hadoop
Set the Hadoop environment variables in ~/.bash_profile:
export HADOOP_HOME=/home/hadoop/app/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
[hadoop@ruozedata001 ~]$ source ~/.bash_profile
Modify the Hadoop configuration files: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and the slaves file all need to be configured according to your deployment.
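As one illustration of what that configuration involves, the snippet below generates a minimal HA-oriented core-site.xml. The nameservice name "mycluster" and the output directory are assumptions for this sketch; the nameservice must match dfs.nameservices in your hdfs-site.xml, and the quorum must match your zoo.cfg:

```shell
#!/usr/bin/env bash
# Sketch: write a minimal HA core-site.xml.
# ASSUMPTIONS: nameservice "mycluster" and CONF_DIR are placeholders;
# align them with hdfs-site.xml and your real HADOOP_CONF_DIR.
CONF_DIR=${CONF_DIR:-/tmp/hadoop-conf-demo}
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Logical name of the HA HDFS cluster (must match dfs.nameservices) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <!-- ZooKeeper ensemble used by ZKFC for automatic failover -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>ruozedata001:2181,ruozedata002:2181,ruozedata003:2181</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/core-site.xml"
```

The same heredoc pattern works for pushing identical configuration files to all three nodes with scp.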
Once Hadoop is installed, the NameNode must be formatted. Before formatting, the JournalNode processes must be running, so start one on each of the three servers:
[hadoop@ruozedata001 ~]$ hadoop-daemon.sh start journalnode
[hadoop@ruozedata002 ~]$ hadoop-daemon.sh start journalnode
[hadoop@ruozedata003 ~]$ hadoop-daemon.sh start journalnode
Format the NameNode:
[hadoop@ruozedata001 ~]$ hadoop namenode -format
After formatting succeeds, copy the NameNode metadata from ruozedata001 to the corresponding directory on ruozedata002 (the standby NameNode):
[hadoop@ruozedata001 ~]$ scp -r /home/hadoop/data/dfs/data/name ruozedata002:/home/hadoop/data/dfs/data/
Format ZKFC (this initializes the HA state znode in ZooKeeper):
[hadoop@ruozedata001 ~]$ hdfs zkfc -formatZK
2.4 Start HDFS and YARN
Start HDFS:
[hadoop@ruozedata001 ~]$ start-dfs.sh
Starting namenodes on [ruozedata001 ruozedata002]
ruozedata002: namenode running as process 15364. Stop it first.
ruozedata001: namenode running as process 16063. Stop it first.
ruozedata001: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-datanode-ruozedata001.out
ruozedata002: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.15.1/logs/hadoop-hadoop-datanode-ruozedata002.out
ruozedata003: datanode running as process 10961. Stop it first.
Starting journal nodes [ruozedata001 ruozedata002 ruozedata003]
ruozedata003: journalnode running as process 9791. Stop it first.
ruozedata002: journalnode running as process 18956. Stop it first.
ruozedata001: journalnode running as process 9802. Stop it first.
Starting ZK Failover Controllers on NN hosts [ruozedata001 ruozedata002]
ruozedata002: zkfc running as process 12387. Stop it first.
ruozedata001: zkfc running as process 12570. Stop it first.
Start YARN; the standby ResourceManager on ruozedata002 must be started manually:
[hadoop@ruozedata001 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/software/hadoop/logs/yarn-root-resourcemanagerhadoop001.out
hadoop002: starting nodemanager, logging to /opt/software/hadoop/logs/yarn-rootnodemanager-hadoop002.out
hadoop003: starting nodemanager, logging to /opt/software/hadoop/logs/yarn-rootnodemanager-hadoop003.out
hadoop001: starting nodemanager, logging to /opt/software/hadoop/logs/yarn-rootnodemanager-hadoop001.out
[hadoop@ruozedata002 ~]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/software/hadoop/logs/yarn-root-resourcemanagerhadoop002.out
Start the JobHistory server:
[hadoop@ruozedata002 ~]$ mr-jobhistory-daemon.sh start historyserver
At this point, the Hadoop cluster deployment is complete.
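To confirm the deployment, `jps` on each node should show the expected daemons. The sketch below encodes the layout described in this guide (NameNode/ZKFC/ResourceManager HA on ruozedata001 and ruozedata002, JobHistoryServer on ruozedata002, all three hosts in slaves); treat the lists as assumptions to adjust if your slaves file differs:

```shell
#!/usr/bin/env bash
# Expected daemons per node for the layout in this guide.
# ASSUMPTION: all three hosts run DataNode/NodeManager (i.e. all are in slaves).
expected_daemons() {
  case "$1" in
    ruozedata001) echo "NameNode DataNode JournalNode DFSZKFailoverController QuorumPeerMain ResourceManager NodeManager" ;;
    ruozedata002) echo "NameNode DataNode JournalNode DFSZKFailoverController QuorumPeerMain ResourceManager NodeManager JobHistoryServer" ;;
    ruozedata003) echo "DataNode JournalNode QuorumPeerMain NodeManager" ;;
  esac
}

# Compare against the live process list on each node, e.g.:
#   ssh ruozedata003 jps | awk '{print $2}' | sort
expected_daemons ruozedata003
```

Any daemon missing from the live `jps` output points at the corresponding start script or configuration file to revisit.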