Table of Contents
1. Preliminary preparation
1.1. The three packages: hadoop-3.1.3.tar.gz, jdk-8u212-linux-x64.tar.gz, apache-zookeeper-3.5.7-bin.tar.gz (extraction code: k5y6)
2. Extract the packages and configure the environment variables
3. Name the three nodes master, slave1 and slave2 and set up passwordless login
Passwordless login was covered in the earlier fully distributed Hadoop setup and is not repeated here
4. Set up the ZooKeeper cluster
Create the zkdata and zkdatalog directories at the configured paths. In zkdata, create a myid file (with touch, or write it directly with echo) containing 1; on slave1 and slave2 the values are 2 and 3 respectively.
5. Distribute the extracted JDK, /etc/profile and ZooKeeper, and change myid to 2 and 3
6. Start ZooKeeper
Check the status
vim core-site.xml
vim hdfs-site.xml
vim yarn-site.xml
The remaining configuration is the same as in the earlier fully distributed Hadoop setup
7. Distribute Hadoop
8. First startup of HDFS in HA mode, steps as follows
8.1. Start the ZooKeeper cluster on virtual machine master
8.2. Format ZooKeeper on virtual machine master
8.3. Start the journalnode process on virtual machines master, slave1 and slave2
8.4. Then format the NameNode
8.5.
start-all.sh reports an error
hadoop-daemon.sh start namenode to start the NameNode on master by itself
hdfs namenode -bootstrapStandby then synchronizes the NameNode on the other node to be brought up
Finally start-all.sh
9. On the master node, check the state of the nn2 and rm2 services
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm2
Hostname | IP address | Related processes
master | (your own IP) | NameNode, DataNode, DFSZKFailoverController, QuorumPeerMain, JournalNode, ResourceManager, NodeManager
slave1 | (your own IP) | NameNode, DataNode, DFSZKFailoverController, QuorumPeerMain, JournalNode, ResourceManager, NodeManager
slave2 | (your own IP) | DataNode, NodeManager, QuorumPeerMain, JournalNode
1. Preliminary preparation
1.1. The three packages: hadoop-3.1.3.tar.gz, jdk-8u212-linux-x64.tar.gz, apache-zookeeper-3.5.7-bin.tar.gz (extraction code: k5y6)
2. Extract the packages and configure the environment variables
tar -zxf <tarball> -C <target directory>
After extraction:
The name apache-zookeeper-3.5.7-bin is rather long; you can rename it with mv,
or create a symbolic link to it with ln -s.
Configure the environment variables with vim /etc/profile, then run source /etc/profile to make them take effect.
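A minimal sketch of the /etc/profile entries, assuming the packages are under /opt/module and the ZooKeeper directory has been renamed to zookeeper:
export JAVA_HOME=/opt/module/jdk1.8.0_212
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export ZOOKEEPER_HOME=/opt/module/zookeeper
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin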
Verify:
hadoop version
java -version
3. Name the three nodes master, slave1 and slave2 and set up passwordless login
Change the hostname, then disconnect and reconnect:
hostnamectl set-hostname <hostname>
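For example, on the first node (the other two use slave1 and slave2 respectively):
hostnamectl set-hostname master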
Passwordless login was covered in the earlier fully distributed Hadoop setup and is not repeated here.
4. Set up the ZooKeeper cluster
cd /opt/module/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg and add the following configuration:
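A typical set of entries for this three-node cluster, assuming the data directories are placed under /opt/module/zookeeper and the default peer ports 2888/3888:
dataDir=/opt/module/zookeeper/zkdata
dataLogDir=/opt/module/zookeeper/zkdatalog
clientPort=2181
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888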
Create the zkdata and zkdatalog directories at the paths configured above. In zkdata, create a file named myid (with touch, or write it directly with echo) containing 1; on slave1 and slave2 the values are 2 and 3 respectively.
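For example, on master (assuming the directories above):
mkdir -p /opt/module/zookeeper/zkdata /opt/module/zookeeper/zkdatalog
echo 1 > /opt/module/zookeeper/zkdata/myid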
5. Distribute the extracted JDK, /etc/profile and ZooKeeper, and change myid to 2 and 3
scp -r /opt/module/jdk1.8.0_212/ slave1:/opt/module/
scp -r /opt/module/jdk1.8.0_212/ slave2:/opt/module/
scp /etc/profile slave1:/etc/profile
scp /etc/profile slave2:/etc/profile (do not forget to source it on each node)
scp -r /opt/module/zookeeper/ slave1:/opt/module/
scp -r /opt/module/zookeeper/ slave2:/opt/module/
6. Start ZooKeeper (on each of the three nodes)
zkServer.sh start
Check the status:
zkServer.sh status
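When the cluster is healthy, one node reports Mode: leader and the other two report Mode: follower.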
cd /opt/module/hadoop-3.1.3/etc/hadoop
vim core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-3.1.3/tmpdir</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
<description>
A list of ZooKeeper server addresses, separated by commas, that are
to be used by the ZKFailoverController in automatic failover.
</description>
</property>
vim hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.nameservices</name>
<value>cluster</value>
<description>
Comma-separated list of nameservices.
</description>
</property>
<property>
<name>dfs.ha.namenodes.cluster</name>
<value>nn1,nn2</value>
<description>
The prefix for a given nameservice, contains a comma-separated
list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
Unique identifiers for each NameNode in the nameservice, delimited by
commas. This will be used by DataNodes to determine all the NameNodes
in the cluster. For example, if you used "mycluster" as the nameservice
ID previously, and you wanted to use "nn1" and "nn2" as the individual
IDs of the NameNodes, you would configure a property
dfs.ha.namenodes.mycluster, and its value "nn1,nn2".
</description>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster.nn1</name>
<value>master:8020</value>
<description>
RPC address for NameNode nn1 of nameservice cluster that handles all
client requests, in the form of nn-host:rpc-port (here master:8020).
</description>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster.nn2</name>
<value>slave1:8020</value>
<description>
RPC address for NameNode nn2 of nameservice cluster that handles all
client requests, in the form of nn-host:rpc-port (here slave1:8020).
</description>
</property>
<property>
<name>dfs.namenode.http-address.cluster.nn1</name>
<value>master:9870</value>
<description>
The address and the base port where the dfs namenode web ui will listen on.
</description>
</property>
<property>
<name>dfs.namenode.http-address.cluster.nn2</name>
<value>slave1:9870</value>
<description>
The address and the base port where the dfs namenode web ui will listen on.
</description>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master:8485;slave1:8485;slave2:8485/cluster</value>
<description>A directory on shared storage between the multiple namenodes
in an HA cluster. This directory will be written by the active and read
by the standby in order to keep the namespaces synchronized. This directory
does not need to be listed in dfs.namenode.edits.dir above. It should be
left empty in a non-HA cluster.
</description>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<description>
The prefix (plus a required nameservice ID) for the class name of the
configured Failover proxy provider for the host. For more detailed
information, please consult the "Configuration Details" section of
the HDFS High Availability documentation.
</description>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
<description>
Whether automatic failover is enabled. See the HDFS High
Availability documentation for details on automatic HA
configuration.
</description>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/bin/true)</value>
<description>
A list of scripts or Java classes which will be used to fence
the Active NameNode during a failover. See the HDFS High
Availability documentation for details on automatic HA
configuration.
</description>
</property>
vim yarn-site.xml
<property>
<description>A comma separated list of services where service name should only
contain a-zA-Z0-9_ and can not start with numbers</description>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<description>Name of the cluster. In a HA setting,
this is used to ensure the RM participates in leader
election for this cluster and ensures it does not affect
other clusters</description>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-cluster</value>
</property>
<property>
<description>The list of RM nodes in the cluster when HA is
enabled. See description of yarn.resourcemanager.ha
.enabled for full details on how this is used.</description>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<description>The hostname of the RM.</description>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master</value>
</property>
<property>
<description>The hostname of the RM.</description>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>slave1</value>
</property>
<property>
<description>
The http address of the RM web application.
If only a host is provided as the value,
the webapp will be served on a random port.
</description>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>master:8088</value>
</property>
<property>
<description>
The http address of the RM web application.
If only a host is provided as the value,
the webapp will be served on a random port.
</description>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>slave1:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
The remaining configuration is the same as in the earlier fully distributed Hadoop setup.
7. Distribute Hadoop
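A sketch of the distribution step, using the same scp pattern as above:
scp -r /opt/module/hadoop-3.1.3/ slave1:/opt/module/
scp -r /opt/module/hadoop-3.1.3/ slave2:/opt/module/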
8. First startup of HDFS in HA mode; the steps are as follows.
8.1. Start the ZooKeeper cluster on virtual machine master.
8.2. Format ZooKeeper on virtual machine master:
hdfs zkfc -formatZK
8.3. Start the journalnode process on master, slave1 and slave2:
hadoop-daemon.sh start journalnode
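On Hadoop 3.1.3 the hadoop-daemon.sh script still works but prints a deprecation warning; the equivalent newer command is:
hdfs --daemon start journalnode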
8.4. Then format the NameNode (on master):
hdfs namenode -format
8.5.
start-all.sh reports an error.
Add the missing variables to the environment.
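A sketch of the fix, assuming the error is the common Hadoop 3 complaint about undefined *_USER variables when the cluster is started as root (add these to /etc/profile or hadoop-env.sh):
# assumes the daemons are started as the root user
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root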
hadoop-daemon.sh start namenode to start the NameNode on master by itself.
hdfs namenode -bootstrapStandby then, on the other NameNode host (slave1), synchronizes the NameNode metadata.
Finally run start-all.sh.
9. On the master node, check the state of the nn2 and rm2 services:
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm2
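Each command prints the HA state of the given ID, either active or standby; with both NameNodes and both ResourceManagers running, one of each pair should be active and the other standby.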
An error occurred: check whether something is mistyped in hdfs-site.xml. Sure enough, namenode had been typed as namenodes; after correcting it and restarting, everything worked.