Versions
apache-hive-3.1.3-bin.tar.gz
spark-3.0.0-bin-hadoop3.2.tgz
hadoop-3.1.3.tar.gz
Steps
Create two directories on HDFS:
spark-history (set permissions to 777) and spark-jars
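A minimal way to create them, assuming HDFS is already up and you are operating as a superuser:
hdfs dfs -mkdir /spark-history /spark-jars
hdfs dfs -chmod 777 /spark-history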
Upload the Spark jars to HDFS:
hdfs dfs -D dfs.replication=1 -put ./* /spark-jars
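You can verify the upload with hdfs dfs -ls /spark-jars. The -D dfs.replication=1 flag overrides the cluster's default replication factor (3, per hdfs-site.xml below) for this upload only.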
hadoop
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- Default filesystem URI; hacluster is the HA nameservice ID registered in ZooKeeper -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hacluster</value>
</property>
<!-- Directory for temporary files Hadoop creates at runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>file:///opt/hadoop-3.1.3/tmp</value>
</property>
<!-- I/O buffer size; defaults to 4 KB -->
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>node15:2181,node16:2181,node17:2181,node18:2181</value>
</property>
<!-- Hosts from which the root user may proxy other users -->
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<!-- Groups whose members root may impersonate -->
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<!-- Static user for the HDFS web UI (root here) -->
<property>
<name>hadoop.http.staticuser.user</name>
<value>root</value>
</property>
</configuration>
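With HA in place, unqualified HDFS paths resolve against this nameservice. For example (assuming the cluster from this guide is running), hdfs dfs -ls hdfs://hacluster/ lists the root directory via whichever NameNode is active.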
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<!-- Block size; defaults to 128 MB (dfs.block.size is the deprecated spelling) -->
<name>dfs.blocksize</name>
<value>134217728</value>
</property>
<property>
<!-- Replication factor; defaults to 3 if unset -->
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<!-- Where the NameNode stores its metadata -->
<name>dfs.namenode.name.dir</name>
<value>file:///opt/hadoop-3.1.3/dfs/namenode_data</value>
</property>
<property>
<!-- Where the DataNode stores its block data -->
<name>dfs.datanode.data.dir</name>
<value>file:///opt/hadoop-3.1.3/dfs/datanode_data</value>
</property>
<property>
<!-- Enable the WebHDFS REST API -->
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<!-- Maximum number of transfer threads on a DataNode -->
<name>dfs.datanode.max.transfer.threads</name>
<value>4096</value>
</property>
<property>
<!-- Logical nameservice ID for the HA cluster -->
<name>dfs.nameservices</name>
<value>hacluster</value>
</property>
<property>
<!-- The hacluster nameservice has two NameNodes: nn1 and nn2 -->
<name>dfs.ha.namenodes.hacluster</name>
<value>nn1,nn2</value>
</property>
<!-- RPC, service RPC, and HTTP addresses for nn1 -->
<property>
<name>dfs.namenode.rpc-address.hacluster.nn1</name>
<value>node15:9000</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.hacluster.nn1</name>
<value>node15:53310</value>
</property>
<property>
<name>dfs.namenode.http-address.hacluster.nn1</name>
<value>node15:50070</value>
</property>
<!-- RPC, service RPC, and HTTP addresses for nn2 -->
<property>
<name>dfs.namenode.rpc-address.hacluster.nn2</name>
<value>node16:9000</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.hacluster.nn2</name>
<value>node16:53310</value>
</property>
<property>
<name>dfs.namenode.http-address.hacluster.nn2</name>
<value>node16:50070</value>
</property>
<property>
<!-- Where the NameNodes' shared edit log lives on the JournalNodes -->
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node15:8485;node16:8485;node17:8485;node18:8485/hacluster</value>
</property>
<property>
<!-- Local disk directory for JournalNode data -->
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hadoop-3.1.3/dfs/journalnode_data</value>
</property>
<property>
<!-- Local directory for the NameNode's edit log -->
<name>dfs.namenode.edits.dir</name>
<value>/opt/hadoop-3.1.3/dfs/edits</value>
</property>
<property>
<!-- Enable automatic NameNode failover -->
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<!-- Proxy provider clients use to locate the active NameNode -->
<name>dfs.client.failover.proxy.provider.hacluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<!-- Fencing method -->
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<!-- sshfence requires passwordless SSH; location of the private key -->
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<!-- HDFS permission checking; false disables enforcement -->
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
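These HA settings only take effect once the coordinating services are initialized. A typical first-time bootstrap, sketched under the assumption that ZooKeeper is already running on the quorum nodes above:
# on every JournalNode host
hdfs --daemon start journalnode
# on nn1 (node15): format HDFS and start the first NameNode
hdfs namenode -format
hdfs --daemon start namenode
# on nn2 (node16): copy the freshly formatted metadata from nn1
hdfs namenode -bootstrapStandby
# on nn1: create the failover znode in ZooKeeper, then bring up the whole cluster
hdfs zkfc -formatZK
start-dfs.sh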
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- Legacy MRv1 JobTracker address; unused under YARN but kept from the original setup -->
<property>
<name>mapred.job.tracker</name>
<value>node15:9001</value>
</property>
<!-- Run MapReduce on the YARN resource manager -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- JobHistory server address -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>node15:10020</value>
</property>
<!-- JobHistory server web UI address -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>node15:19888</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>/opt/hadoop-3.1.3/etc/hadoop:/opt/hadoop-3.1.3/share/hadoop/common/lib/*:/opt/hadoop-3.1.3/share/hadoop/common/*:/opt/hadoop-3.1.3/share/hadoop/hdfs:/opt/hadoop-3.1.3/share/hadoop/hdfs/lib/*:/opt/hadoop-3.1.3/share/hadoop/hdfs/*:/opt/hadoop-3.1.3/share/hadoop/mapreduce/lib/*:/opt/hadoop-3.1.3/share/hadoop/mapreduce/*:/opt/hadoop-3.1.3/share/hadoop/yarn:/opt/hadoop-3.1.3/share/hadoop/yarn/lib/*:/opt/hadoop-3.1.3/share/hadoop/yarn/*</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1500</value>
<description>Physical memory limit for each Map task</description>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3000</value>
<description>Physical memory limit for each Reduce task</description>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1200m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx2600m</value>
</property>
</configuration>
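Note how the JVM heaps are sized below their container limits: -Xmx1200m is 80% of the 1500 MB map container, and -Xmx2600m is roughly 87% of the 3000 MB reduce container. That headroom covers off-heap memory so YARN's physical-memory check does not kill the container.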
slaves
node15
node16
node17
node18
workers
node15
node16
node17
node18
yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<!-- Whether to enforce virtual memory limits for containers -->
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
<!-- Ratio of virtual to physical memory when setting container memory limits -->
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
<property>
<!-- Enable ResourceManager HA -->
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<!-- Cluster ID under which this YARN cluster registers in ZooKeeper -->
<name>yarn.resourcemanager.cluster-id</name>
<value>hayarn</value>
</property>
<property>
<!-- IDs of the two ResourceManagers -->
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<!-- Host for rm1 -->
<name>yarn.resourcemanager.hostname.rm1</name>
<value>node15</value>
</property>
<property>
<!-- Host for rm2 -->
<name>yarn.resourcemanager.hostname.rm2</name>
<value>node16</value>
</property>
<property>
<!-- RM web UI address; defaults to ${yarn.resourcemanager.hostname}:8088 -->
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>node15:8088</value>
</property>
<property>
<!-- RM web UI address; defaults to ${yarn.resourcemanager.hostname}:8088 -->
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>node16:8088</value>
</property>
<property>
<!-- ZooKeeper addresses for the RM state store -->
<name>yarn.resourcemanager.zk-address</name>
<value>node15:2181,node16:2181,node17:2181</value>
</property>
<property>
<!-- Enable RM state recovery -->
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<!-- State-store implementation backing RM recovery -->
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<!-- Fallback ResourceManager hostname (with HA enabled, rm1/rm2 above take precedence) -->
<name>yarn.resourcemanager.hostname</name>
<value>node18</value>
</property>
<property>
<!-- Auxiliary service NodeManagers run for the MapReduce shuffle -->
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<!-- Enable log aggregation -->
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<!-- Retain aggregated logs for 7 days -->
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://node15:19888/jobhistory/logs</value>
</property>
</configuration>
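After starting YARN, you can check which ResourceManager is active with the standard admin commands (RM IDs as configured above):
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2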
hive
hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- JDBC URL of the metastore database -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node15:3306/metastore?useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- JDBC username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- JDBC password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hadoop</value>
</property>
<!-- Hive's default warehouse directory on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<!-- Disable metastore schema version verification -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<!-- Metastore event/notification API authorization -->
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
<!-- Host HiveServer2 binds to -->
<property>
<name>hive.server2.thrift.bind.host</name>
<value>node15</value>
</property>
<!-- HiveServer2 Thrift port -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
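<!-- Spark jars on HDFS, uploaded in the steps above -->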
<property>
<name>spark.yarn.jars</name>
<value>hdfs://node15:9000/spark-jars/*</value>
</property>
<!-- Hive execution engine -->
<property>
<name>hive.execution.engine</name>
<value>spark</value>
</property>
<property>
<name>spark.home</name>
<value>/opt/spark-3.0.0-bin-hadoop3.2/</value>
</property>
</configuration>
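The metastore database must exist before Hive starts. A typical initialization, assuming MySQL runs on node15 with the credentials above and the MySQL JDBC driver jar is already in Hive's lib directory:
mysql -uroot -phadoop -e "CREATE DATABASE IF NOT EXISTS metastore"
schematool -initSchema -dbType mysql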
spark-defaults.conf
spark.master yarn
spark.eventLog.enabled true
spark.eventLog.dir hdfs://node15:9000/spark-history
spark.executor.memory 600m
spark.driver.memory 600m
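Once the Spark section below is also in place, a quick smoke test is a query that forces a Spark job; the first statement is slow because it spins up the Spark session on YARN (smoke_test is just an illustrative table name):
hive -e "create table smoke_test(id int); insert into smoke_test values(1);"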
spark
hdfs-site.xml
Symlink the file from the Hadoop configuration directory:
ln -s <source-file> <link-name>
hive-site.xml
Symlink the file from the Hive configuration directory:
ln -s <source-file> <link-name>
slaves
node15
node16
node17
node18
yarn-site.xml
Symlink the file from the Hadoop configuration directory (see the concrete example below):
ln -s <source-file> <link-name>
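Concretely, with the install paths used throughout this guide (and assuming Hive is unpacked at /opt/apache-hive-3.1.3-bin, a path this article never states explicitly), the links might look like:
ln -s /opt/hadoop-3.1.3/etc/hadoop/hdfs-site.xml /opt/spark-3.0.0-bin-hadoop3.2/conf/hdfs-site.xml
ln -s /opt/apache-hive-3.1.3-bin/conf/hive-site.xml /opt/spark-3.0.0-bin-hadoop3.2/conf/hive-site.xml
ln -s /opt/hadoop-3.1.3/etc/hadoop/yarn-site.xml /opt/spark-3.0.0-bin-hadoop3.2/conf/yarn-site.xml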
spark-env.sh
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
export SCALA_HOME=/usr/share/scala
export JAVA_HOME=/usr/java/jdk1.8.0_241-amd64
export SPARK_HOME=/opt/spark-3.0.0-bin-hadoop3.2
export SPARK_MASTER_HOST=192.168.206.215 # SPARK_MASTER_IP is deprecated since Spark 2.0
export SPARK_MASTER_PORT=7077
export SPARK_MASTER_WEBUI_PORT=7080 # Spark's master web UI defaults to 8080; changed to avoid a possible port conflict
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1
export SPARK_EXECUTOR_MEMORY=512M
export SPARK_WORKER_MEMORY=1G
export SPARK_DIST_CLASSPATH=$(/opt/hadoop-3.1.3/bin/hadoop classpath)
export HADOOP_CONF_DIR=/opt/hadoop-3.1.3/etc/hadoop
# This file is sourced when running various Spark programs.
# Copy it as spark-env.sh and edit that to configure Spark for your site.
# Options read when launching programs locally with
# ./bin/run-example or ./bin/spark-submit
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public dns name of the driver program
# Options read by executors and drivers running inside the cluster
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public DNS name of the driver program
# - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data
# - MESOS_NATIVE_JAVA_LIBRARY, to point to your libmesos.so if you use Mesos
# Options read in YARN client/cluster mode
# - SPARK_CONF_DIR, Alternate conf dir. (Default: ${SPARK_HOME}/conf)
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - YARN_CONF_DIR, to point Spark towards YARN configuration files when you use YARN
# - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: 1).
# - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G)
# Options for the daemons used in the standalone deploy mode
# - SPARK_MASTER_HOST, to bind the master to a different IP address or hostname
# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
# - SPARK_MASTER_OPTS, to set config properties only for the master (e.g. "-Dx=y")
# - SPARK_WORKER_CORES, to set the number of cores to use on this machine
# - SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)
# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
# - SPARK_WORKER_DIR, to set the working directory of worker processes
# - SPARK_WORKER_OPTS, to set config properties only for the worker (e.g. "-Dx=y")
# - SPARK_DAEMON_MEMORY, to allocate to the master, worker and history server themselves (default: 1g).
# - SPARK_HISTORY_OPTS, to set config properties only for the history server (e.g. "-Dx=y")
# - SPARK_SHUFFLE_OPTS, to set config properties only for the external shuffle service (e.g. "-Dx=y")
# - SPARK_DAEMON_JAVA_OPTS, to set config properties for all daemons (e.g. "-Dx=y")
# - SPARK_DAEMON_CLASSPATH, to set the classpath for all daemons
# - SPARK_PUBLIC_DNS, to set the public dns name of the master or workers
# Options for launcher
# - SPARK_LAUNCHER_OPTS, to set config properties and Java options for the launcher (e.g. "-Dx=y")
# Generic options for the daemons used in the standalone deploy mode
# - SPARK_CONF_DIR Alternate conf dir. (Default: ${SPARK_HOME}/conf)
# - SPARK_LOG_DIR Where log files are stored. (Default: ${SPARK_HOME}/logs)
# - SPARK_PID_DIR Where the pid file is stored. (Default: /tmp)
# - SPARK_IDENT_STRING A string representing this instance of spark. (Default: $USER)
# - SPARK_NICENESS The scheduling priority for daemons. (Default: 0)
# - SPARK_NO_DAEMONIZE Run the proposed command in the foreground. It will not output a PID file.
# Options for native BLAS, like Intel MKL, OpenBLAS, and so on.
# You might get better performance to enable these options if using native BLAS (see SPARK-21305).
# - MKL_NUM_THREADS=1 Disable multi-threading of Intel MKL
# - OPENBLAS_NUM_THREADS=1 Disable multi-threading of OpenBLAS