I only skimmed this chapter and did not dig deeply into most of it, so this write-up is probably of limited use; refer to it as you see fit.
(132) Common YARN commands
Viewing applications
List all applications: yarn application -list
Filter applications by state, e.g. list only finished applications: yarn application -list -appStates FINISHED
Valid application states are: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED.
Kill an application: yarn application -kill <ApplicationId>, where the application ID is a string of the form application_1612577921195_0001.
List all attempts of an application: yarn applicationattempt -list <ApplicationId>
Print the status of an application attempt: yarn applicationattempt -status <ApplicationAttemptId>
Viewing logs
This part is very important.
Query the logs of an application: yarn logs -applicationId <ApplicationId>
Query the logs of a specific container: yarn logs -applicationId <ApplicationId> -containerId <ContainerId>
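Putting these together, a typical troubleshooting flow looks roughly like this (just a sketch; the application ID below is the example ID from above, substitute your own):
yarn application -list -appStates RUNNING                        # find the application ID
yarn logs -applicationId application_1612577921195_0001 | less   # inspect its aggregated logs
yarn application -kill application_1612577921195_0001            # kill it if it is stuck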
Viewing containers
List all containers of an application attempt: yarn container -list <ApplicationAttemptId>
Print the status of a container: yarn container -status <ContainerId>
Container status can only be viewed while the application is running.
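For example, to drill down from a running application to its containers (the IDs below are hypothetical, following the ID formats shown above):
yarn applicationattempt -list application_1612577921195_0001      # get the ApplicationAttemptId
yarn container -list appattempt_1612577921195_0001_000001         # list that attempt's containers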
Viewing node status
List all nodes: yarn node -list -all
This prints the running state, address, and other information of every NodeManager in the cluster.
Refreshing configuration with rmadmin
Reload the queue configuration: yarn rmadmin -refreshQueues
This allows the queue configuration to be changed dynamically, with no downtime.
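For example, after adjusting queue capacities in capacity-scheduler.xml, the new settings can be loaded on the fly (a sketch):
vim $HADOOP_HOME/etc/hadoop/capacity-scheduler.xml   # edit the queue definitions
yarn rmadmin -refreshQueues                          # reload them without restarting YARN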
Viewing queues
Print queue information: yarn queue -status <QueueName>
For example, yarn queue -status default prints the default queue, showing its state, current capacity, and so on.
(133) Core configuration parameters for production
Again this is only for general awareness, so I just took screenshots from the tutorial. The main point to remember: the ResourceManager handles scheduler requests with 50 threads by default.
There is a notion of "virtual cores" (vcores) here that deserves a brief explanation.
First, every NodeManager in the cluster has its own set of configuration parameters; there is no strict requirement that all NodeManagers be configured identically.
This mainly accounts for large performance differences between nodes. Say the single-core CPU performance of node 1 is twice that of node 2; treating the two as equals when allocating tasks would then be problematic. In that case you can enable virtual cores on node 1 and count each physical core as two virtual cores, so that a single (virtual) core on node 1 and on node 2 now perform roughly the same, which makes it easier for the ResourceManager to allocate tasks.
In other words, how many vcores one physical core counts as can differ from NodeManager to NodeManager. This avoids the difficulty of managing CPUs uniformly when the nodes' CPU performance differs.
So if the cluster mixes CPU generations, say some nodes run i5s and others i7s, enabling virtual cores is genuinely useful (see yarn.nodemanager.resource.count-logical-processors-as-cores and yarn.nodemanager.resource.pcores-vcores-multiplier in the configuration below).
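A quick way to gauge how different the nodes really are before picking a multiplier is to compare their CPUs directly; a minimal sketch (run on each node):
lscpu | grep -E 'Model name|^CPU\(s\)'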
The "physical memory check" mechanism exists to keep a node from crashing when its memory is exceeded; it is enabled by default.
(135) Production core parameter configuration example
Requirement: count the occurrences of each word in 1 GB of data. There are 3 servers, each with 4 GB of memory, 4 CPU cores, and 4 threads.
With the default block size of 128 MB, 1 GB / 128 MB = 8, so the job needs 8 MapTasks, 1 ReduceTask, and 1 MrAppMaster.
On average each node runs (8 + 1 + 1) / 3 ≈ 3 tasks; assume a 4 + 3 + 3 distribution across the three nodes.
Given these requirements and this hardware, the reasoning goes roughly as follows:
1 GB is not much data, so the Capacity Scheduler is sufficient;
The ResourceManager's default of 50 scheduler threads is more than needed here and can be cut down to 8;
The nodes have identical CPU performance, so virtual cores do not need to be enabled;
I will not go into the other settings here.
I'll just paste the tutorial's yarn-site.xml configuration below for easy reference later.
<!-- Choose the scheduler; the default is the Capacity Scheduler -->
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<!-- Number of threads for the ResourceManager to handle scheduler requests; default 50. If more than 50 jobs are submitted this can be increased, but not beyond 3 nodes * 4 threads = 12 threads (in practice no more than 8 once other processes are accounted for) -->
<property>
<description>Number of threads to handle scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.client.thread-count</name>
<value>8</value>
</property>
<!-- Whether YARN should auto-detect the hardware to configure itself; default false. If the node runs many other applications, configure this manually; if not, auto-detection is fine -->
<property>
<description>Enable auto-detection of node capabilities such as
memory and CPU.
</description>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>false</value>
</property>
<!-- Whether to count logical processors (hyperthreads) as CPU cores; default false, i.e. use the physical core count -->
<property>
<description>Flag to determine if logical processors(such as
hyperthreads) should be counted as cores. Only applicable on Linux
when yarn.nodemanager.resource.cpu-vcores is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true.
</description>
<name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
<value>false</value>
</property>
<!-- Multiplier from physical cores to vcores; default 1.0 -->
<property>
<description>Multiplier to determine how to convert physical cores to
vcores. This value is used if yarn.nodemanager.resource.cpu-vcores
is set to -1(which implies auto-calculate vcores) and
yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.
</description>
<name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
<value>1.0</value>
</property>
<!-- Memory available to the NodeManager; default 8 GB, changed here to 4 GB -->
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers. If set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically calculated(in case of Windows and Linux).
In other cases, the default is 8192MB.
</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
<!-- Number of CPU cores (vcores) for the NodeManager; default 8 when not auto-detected from hardware, changed here to 4 -->
<property>
<description>Number of vcores that can be allocated
for containers. This is used by the RM scheduler when allocating
resources for containers. This is not used to limit the number of
CPUs used by YARN containers. If it is set to -1 and
yarn.nodemanager.resource.detect-hardware-capabilities is true, it is
automatically determined from the hardware in case of Windows and Linux.
In other cases, number of vcores is 8 by default.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>4</value>
</property>
<!-- Minimum container memory; default 1 GB -->
<property>
<description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<!-- Maximum container memory; default 8 GB, changed here to 2 GB -->
<property>
<description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.
</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
<!-- Minimum container vcores; default 1 -->
<property>
<description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.
</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<!-- Maximum container vcores; default 4, changed here to 2 -->
<property>
<description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an
InvalidResourceRequestException.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
</property>
<!-- Virtual memory check; enabled by default, disabled here -->
<property>
<description>Whether virtual memory limits will be enforced for
containers.</description>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<!-- Ratio of virtual memory to physical memory; default 2.1 -->
<property>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
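A practical note of my own rather than from the tutorial: yarn-site.xml has to be distributed to every node and YARN restarted before these settings take effect. A minimal sketch, assuming all nodes share the same install path and that the other workers are named hadoop103 and hadoop104 (only hadoop102 actually appears later in this post):
scp $HADOOP_HOME/etc/hadoop/yarn-site.xml hadoop103:$HADOOP_HOME/etc/hadoop/
scp $HADOOP_HOME/etc/hadoop/yarn-site.xml hadoop104:$HADOOP_HOME/etc/hadoop/
$HADOOP_HOME/sbin/stop-yarn.sh && $HADOOP_HOME/sbin/start-yarn.sh   # run on the ResourceManager node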
(140/141) Tool interface example
A fairly useful feature in production. Again I only skimmed this section and basically just copied the code without going deeper.
Through the Tool interface, the parameters of our own program can be modified dynamically at submission time.
The following uses a self-written WordCount as the example.
When writing the code, include the following in pom.xml:
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>3.1.3</version>
</dependency>
</dependencies>
Create a class WordCount and implement the Tool interface:
package com.atguigu.yarn;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import java.io.IOException;
public class WordCount implements Tool {

    private Configuration conf;

    // Core driver logic: configure and submit the MapReduce job
    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    @Override
    public Configuration getConf() {
        return conf;
    }

    // Mapper: emit (word, 1) for every word in the line
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private Text outK = new Text();
        private IntWritable outV = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String[] words = line.split(" ");
            for (String word : words) {
                outK.set(word);
                context.write(outK, outV);
            }
        }
    }

    // Reducer: sum the counts for each word
    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable outV = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            outV.set(sum);
            context.write(key, outV);
        }
    }
}
Create a new class WordCountDriver:
package com.atguigu.yarn;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import java.util.Arrays;
public class WordCountDriver {

    private static Tool tool;

    public static void main(String[] args) throws Exception {
        // 1. Create the configuration
        Configuration conf = new Configuration();

        // 2. Decide which Tool implementation to run based on the first argument
        switch (args[0]) {
            case "wordcount":
                tool = new WordCount();
                break;
            default:
                throw new RuntimeException("No such tool: " + args[0]);
        }

        // 3. Run the program through the Tool interface
        // Arrays.copyOfRange copies part of the old array into a new one,
        // i.e. it passes along everything from index 1 to the last argument
        int run = ToolRunner.run(conf, tool, Arrays.copyOfRange(args, 1, args.length));
        System.exit(run);
    }
}
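Before running the job, the project has to be packaged into a jar and copied to the cluster. A minimal sketch (the built jar's name depends on your pom, and the /opt/module/hadoop-3.1.3 target path is an assumption; rename the jar to match the YarnDemo.jar used below):
mvn clean package
scp target/<your-artifact>.jar atguigu@hadoop102:/opt/module/hadoop-3.1.3/YarnDemo.jar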
Then run:
[atguigu@hadoop102 hadoop-3.1.3]$ yarn jar YarnDemo.jar com.atguigu.yarn.WordCountDriver wordcount /input /output
Note that three arguments are passed here: the first selects which Tool to instantiate, and the second and third are the input and output directories. If we want to pass extra configuration at submit time, we can add parameters right after wordcount; ToolRunner (via GenericOptionsParser) consumes generic options such as -D before the remaining arguments reach run(). For example:
[atguigu@hadoop102 hadoop-3.1.3]$ yarn jar YarnDemo.jar com.atguigu.yarn.WordCountDriver wordcount -Dmapreduce.job.queuename=root.test /input /output1
References
- 尚硅谷大數(shù)據(jù)Hadoop教程 (Hadoop 3.x: from setup to cluster tuning)