1 Common YARN Commands
YARN status can be checked on the hadoop103:8088 web page, or queried from the command line.
Run an example job first, then inspect how it is running.
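For instance, submit the bundled wordcount example (the jar path assumes you run from the Hadoop home directory and that /input already exists in HDFS) so the commands below have an application to inspect:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount /input /output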
(1) yarn application — view applications
yarn application -list //list all applications
yarn application -list -appStates <STATE> //filter by application state; STATE is one of ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED
yarn application -kill <ApplicationId> //kill an application by its ID
(2) yarn logs — view logs
yarn logs -applicationId <ApplicationId> //query an application's logs
yarn logs -applicationId <ApplicationId> -containerId <ContainerId> //query a container's logs: specify the application first, then the container
(3) yarn applicationattempt — view application attempts (i.e., the state of a running application's attempts)
yarn applicationattempt -list <ApplicationId> //list all attempts of the application
yarn applicationattempt -status <ApplicationAttemptId> //print the state of an ApplicationAttempt
(4) yarn container — view containers
Container state is only visible while the application is running.
yarn container -list <ApplicationAttemptId> //list all containers of an attempt
yarn container -status <ContainerId> //print a container's state
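A typical drill-down from application to container logs, using hypothetical IDs (substitute the ones printed by each preceding command):
yarn application -list
yarn applicationattempt -list application_1612577921195_0001
yarn container -list appattempt_1612577921195_0001_000001
yarn logs -applicationId application_1612577921195_0001 -containerId container_1612577921195_0001_01_000001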
(5) yarn node — view node status
yarn node -list -all //list all nodes
(6) yarn rmadmin — refresh configuration
yarn rmadmin -refreshQueues //reload the queue configuration
(7) yarn queue — view queues
yarn queue -status <QueueName>
yarn queue -status default //print the default queue's information
2 Core YARN Parameters for Production
3 Configuring YARN for Production
Take a snapshot of each of the three servers first.
(1) Core parameter configuration example
Requirement: count how many times each word occurs in 0.5 GB of data, on 3 servers, each with 2 GB of memory and a 1-core, 2-thread CPU.
Analysis: 0.5 GB / 128 MB = 4 MapTasks, plus 1 ReduceTask and 1 MrAppMaster, i.e., 6 tasks in total.
On average each node runs 6 / 3 = 2 tasks (2, 2, 2).
Modify yarn-site.xml:
<!-- Scheduler choice; the Capacity Scheduler is the default -->
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<!-- Number of threads the ResourceManager uses to handle scheduler requests; default 50. If more than 50 applications are submitted this can be raised, but it cannot exceed 3 nodes * 2 threads = 6 threads (and in practice, after leaving threads for other processes, no more than 8) -->
<property>
<description>Number of threads to handle scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>8</value>
</property>
<!-- Whether YARN auto-detects the hardware to configure itself; default false. If the node runs many other applications, manual configuration is recommended; if not, auto-detection is fine -->
<property>
<description>Enable auto-detection of node capabilities such as
memory and CPU.
</description>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>false</value>
</property>
<!-- Whether to count logical processors (hyperthreads) as CPU cores; default false, i.e., use the physical core count -->
<property>
<description>Flag to determine if logical processors (such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
</description>
<name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
<value>false</value>
</property>
<!-- Multiplier for converting physical cores to vcores; default 1.0 -->
<property>
<description>Multiplier to determine how to convert physical cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1 (which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.
</description>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
</property>
<!-- Memory the NodeManager may allocate to containers; default 8 GB, changed here to 2 GB -->
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated (in case of Windows and Linux).
In other cases, the default is 8192MB.
</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<!-- Number of vcores the NodeManager offers; defaults to 8 when not auto-detected from hardware, changed here to 1 -->
<property>
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>1</value>
</property>
<!-- Minimum container memory; default 1 GB -->
<property>
<description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<!-- Maximum container memory; default 8 GB, changed here to 1.5 GB -->
<property>
<description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.
</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>1536</value>
</property>
<!-- Minimum container vcores; default 1 -->
<property>
<description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<!-- Maximum container vcores; default 4, changed here to 1 -->
<property>
<description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>1</value>
</property>
<!-- Virtual-memory check; enabled by default, disabled here -->
<property>
<description>Whether virtual memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<!-- Ratio of virtual to physical memory; default 2.1 -->
<property>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
When the cluster's hardware differs across nodes, configure each NodeManager individually.
Distribute the configuration and restart the YARN cluster, then run the wordcount job and watch its execution at http://hadoop103:8088/cluster/apps:
xsync yarn-site.xml
stop-yarn.sh
start-yarn.sh
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount /input /output
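Once YARN is back up, the section 1 commands can sanity-check the new limits; with this configuration the cluster should total 3 nodes * 2048 MB = 6 GB of memory and 3 vcores:
yarn node -list -all
yarn queue -status default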
(2) Capacity Scheduler example
1) Production scenarios that call for creating queues:
(1) The scheduler ships with only a default queue, which cannot meet production needs.
(2) By framework: hive, spark, and flink jobs each go into a dedicated queue.
(3) By business module: login/registration, shopping cart, ordering, business unit 1, business unit 2.
2) Benefits of multiple queues:
(1) Fault containment: a job stuck in an infinite loop can exhaust at most its own queue's resources rather than the whole cluster's.
(2) During peaks such as Singles' Day (Nov 11) and the 618 festival, queues keep resources available for critical jobs: give tasks priorities so that high-priority ones are handled first and less important ones are degraded.
Requirements:
(1) The default queue gets 40% of total memory with a maximum capacity of 60%; the hive queue gets 60% of total memory with a maximum capacity of 80%.
(2) Configure queue priorities.
Configuration steps:
(1) In capacity-scheduler.xml, modify the following:
<!-- Declare multiple queues: add a hive queue -->
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>default,hive</value>
<description>
The queues at this level (root is the root queue).
</description>
</property>
<!-- Lower the default queue's guaranteed capacity to 40% (default 100%) -->
<property>
<name>yarn.scheduler.capacity.root.default.capacity</name>
<value>40</value>
</property>
<!-- Lower the default queue's maximum capacity to 60% (default 100%) -->
<property>
<name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
<value>60</value>
</property>
Add the following parameters:
<!-- Guaranteed capacity of the hive queue -->
<property>
<name>yarn.scheduler.capacity.root.hive.capacity</name>
<value>60</value>
</property>
<!-- Maximum multiple of the queue's capacity a single user may consume; 1 means one user can use at most the queue's full configured capacity -->
<property>
<name>yarn.scheduler.capacity.root.hive.user-limit-factor</name>
<value>1</value>
</property>
<!-- Maximum capacity of the hive queue -->
<property>
<name>yarn.scheduler.capacity.root.hive.maximum-capacity</name>
<value>80</value>
</property>
<!-- Set the hive queue to the RUNNING state -->
<property>
<name>yarn.scheduler.capacity.root.hive.state</name>
<value>RUNNING</value>
</property>
<!-- Which users may submit applications to the queue -->
<property>
<name>yarn.scheduler.capacity.root.hive.acl_submit_applications</name>
<value>*</value>
</property>
<!-- Which users may administer the queue (view/kill applications) -->
<property>
<name>yarn.scheduler.capacity.root.hive.acl_administer_queue</name>
<value>*</value>
</property>
<!-- Which users may set application priorities at submission -->
<property>
<name>yarn.scheduler.capacity.root.hive.acl_application_max_priority</name>
<value>*</value>
</property>
<!-- Application lifetime (timeout) settings; set at runtime with: yarn application -appId <ApplicationId> -updateLifetime <timeout in seconds>
Reference: https://blog.cloudera.com/enforcing-application-lifetime-slas-yarn/ -->
<!-- If an application specifies a lifetime, the maximum lifetime an application submitted to this queue may specify cannot exceed this value; -1 disables the limit.
-->
<property>
<name>yarn.scheduler.capacity.root.hive.maximum-application-lifetime</name>
<value>-1</value>
</property>
<!-- If an application does not specify a lifetime, default-application-lifetime is used as the default -->
<property>
<name>yarn.scheduler.capacity.root.hive.default-application-lifetime</name>
<value>-1</value>
</property>
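As a usage sketch (the application ID is hypothetical), a one-hour lifetime can be applied to a running application like this:
yarn application -appId application_1612577921195_0003 -updateLifetime 3600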
(2) Distribute the file and restart YARN (or run yarn rmadmin -refreshQueues to reload the queues):
xsync capacity-scheduler.xml
stop-yarn.sh
start-yarn.sh
Both queues are now visible on the scheduler page.
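For example, querying the new queue should report it as RUNNING with a capacity of 60.0% and a maximum capacity of 80.0%:
yarn queue -status hive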
(3) Submit a job to the hive queue:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount -Dmapreduce.job.queuename=hive /input /output
Alternatively, submit your own packaged jar. Jobs go to the default queue unless told otherwise; to submit to another queue, declare it in the driver:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class WcDriver {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        // Declare the queue the job is submitted to
        conf.set("mapreduce.job.queuename", "hive");
        // 1. Get a Job instance
        Job job = Job.getInstance(conf);
        // ... set the jar, Mapper, Reducer, output types, and input/output paths as usual ...
        // 6. Submit the Job
        boolean b = job.waitForCompletion(true);
        System.exit(b ? 0 : 1);
    }
}
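A hypothetical submission of that driver (the jar name and main class are placeholders for your own build):
hadoop jar wc.jar com.example.mr.WcDriver /input /output
Because the queue is hard-coded via mapreduce.job.queuename, the job lands in the hive queue without any -D flag.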
(4) Configuring task priority
The Capacity Scheduler supports task priorities: when resources are tight, higher-priority tasks acquire resources first. By default YARN caps every task's priority at 0; to use priorities, that limit must be raised.
Add to yarn-site.xml:
<property>
<name>yarn.cluster.max-application-priority</name>
<value>5</value>
</property>
Distribute the file and restart YARN:
xsync yarn-site.xml
stop-yarn.sh
start-yarn.sh
Submit the following job from hadoop102, hadoop103, and hadoop104 at the same time to simulate resource pressure:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi 5 10000
Then submit a job with a higher priority:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi -Dmapreduce.job.priority=5 5 10000
Or change a running application's priority with:
yarn application -appId <ApplicationId> -updatePriority <priority>
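For example, raising a hypothetical running application to the maximum priority configured above:
yarn application -appId application_1612577921195_0006 -updatePriority 5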
(3) Fair Scheduler example
Requirement: create two queues, test and liaoyanxia. If a user specifies a queue at submission, the job runs in that queue; otherwise jobs from the test user run in the root.group.test queue and jobs from the liaoyanxia user run in the root.group.liaoyanxia queue (where group is the submitting user's primary group).
This requires editing yarn-site.xml and creating fair-scheduler.xml.
Add the following parameters to yarn-site.xml:
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
<description>Use the Fair Scheduler</description>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/opt/module/hadoop-3.3.1/etc/hadoop/fair-scheduler.xml</value>
<description>Path to the Fair Scheduler's queue allocation file</description>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>false</value>
<description>Disable resource preemption between queues</description>
</property>
Configure fair-scheduler.xml:
<?xml version="1.0"?>
<allocations>
<!-- Default cap on the fraction of a single queue's resources that ApplicationMasters may occupy, in the range 0-1; typically 0.1 in production -->
<queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
<!-- Default maximum resources for a single queue (applies to test, liaoyanxia, and default) -->
<queueMaxResourcesDefault>2048mb,2vcores</queueMaxResourcesDefault>
<!-- Add a queue named test -->
<queue name="test">
<!-- Minimum resources for the queue -->
<minResources>1024mb,1vcores</minResources>
<!-- Maximum resources for the queue -->
<maxResources>2048mb,1vcores</maxResources>
<!-- Maximum number of apps running in the queue at once; default 50, tune to the available threads -->
<maxRunningApps>2</maxRunningApps>
<!-- Maximum fraction of the queue's resources its ApplicationMasters may occupy -->
<maxAMShare>0.5</maxAMShare>
<!-- Resource weight of the queue; default 1.0 -->
<weight>1.0</weight>
<!-- Scheduling policy within the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<!-- Add a queue named liaoyanxia -->
<queue name="liaoyanxia" type="parent">
<!-- Minimum resources for the queue -->
<minResources>1024mb,1vcores</minResources>
<!-- Maximum resources for the queue -->
<maxResources>2048mb,1vcores</maxResources>
<!-- Maximum number of apps running in the queue at once; default 50, tune to the available threads -->
<maxRunningApps>2</maxRunningApps>
<!-- Maximum fraction of the queue's resources its ApplicationMasters may occupy -->
<maxAMShare>0.5</maxAMShare>
<!-- Resource weight of the queue; default 1.0 -->
<weight>1.0</weight>
<!-- Scheduling policy within the queue -->
<schedulingPolicy>fair</schedulingPolicy>
</queue>
<!-- Queue placement policy: rules may be layered and are evaluated in order until one matches -->
<queuePlacementPolicy>
<!-- Use the queue specified at submission; if no queue was specified, fall through to the next rule. create="false" means: if the specified queue does not exist, do not auto-create it -->
<rule name="specified" create="false"/>
<!-- Submit to root.group.username: if root.group does not exist, do not auto-create it; if root.group.username does not exist, auto-create it -->
<rule name="nestedUserQueue" create="true">
<rule name="primaryGroup" create="false"/>
</rule>
<!-- The last rule must be reject or default: reject refuses the submission, default sends the job to the default queue -->
<rule name="reject" />
</queuePlacementPolicy>
</allocations>
Distribute the configuration, restart YARN, and submit a job to test:
xsync yarn-site.xml fair-scheduler.xml
sbin/stop-yarn.sh
sbin/start-yarn.sh
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi -Dmapreduce.job.queuename=root.test 1 1
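To exercise the nestedUserQueue rule, submit without naming a queue while logged in as the test user; the placement policy should route the job to root.<primary group>.test (e.g., root.test.test if that user's primary group is also test):
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi 1 1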