Ecosystem extension: Spark Doris Connector
Check the Doris website for the Spark version that matches your connector release.
Install Spark:
tar -zxvf spark-3.1.2-bin-hadoop3.2.tgz
mv spark-3.1.2-bin-hadoop3.2 /opt/spark
Configure the Spark environment: vim /etc/profile
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin
Then run source /etc/profile so the new variables take effect in the current shell.
Copy the built spark-doris-connector-3.2_2.12-1.3.0-SNAPSHOT.jar into Spark's jars directory:
cp spark-doris-connector-3.2_2.12-1.3.0-SNAPSHOT.jar /opt/spark/jars/
Repository: https://github.com/apache/doris-spark-connector
Build and install
Preparation
Edit the custom_env.sh.tpl file and rename it to custom_env.sh.
In the source directory, run sh build.sh and, when prompted, select the Scala 2.12 and Spark 3.2.3 versions to build against.
Verification:
scala> import org.apache.doris.spark._
import org.apache.doris.spark._
scala>
scala> val doris = spark.sql(
| s"""
| |CREATE TEMPORARY VIEW spark_doris
| |USING doris
| |OPTIONS(
| | "table.identifier"="demo.example_tbl",
| | "fenodes"="10.63.0.181:8030",
| | "user"="root",
| | "password"=""
| |);
| |""".stripMargin)
doris: org.apache.spark.sql.DataFrame = []
scala>
scala> spark.sql("SELECT * FROM spark_doris;").show
+-------+----------+----+---+---+-------------------+----+--------------+--------------+
|user_id| date|city|age|sex| last_visit_date|cost|max_dwell_time|min_dwell_time|
+-------+----------+----+---+---+-------------------+----+--------------+--------------+
| 10000|2017-10-01|北京| 20| 0|2017-10-01 07:00:00| 35| 10| 2|
| 10001|2017-10-01|北京| 30| 1|2017-10-01 17:05:45| 2| 22| 22|
| 10002|2017-10-02|上海| 20| 1|2017-10-02 12:59:12| 200| 5| 5|
| 10003|2017-10-02|廣州| 32| 0|2017-10-02 11:20:00| 30| 11| 11|
| 10004|2017-10-01|深圳| 35| 0|2017-10-01 10:00:15| 100| 3| 3|
| 10004|2017-10-03|深圳| 35| 0|2017-10-03 10:20:22| 11| 6| 6|
+-------+----------+----+---+---+-------------------+----+--------------+--------------+
scala>
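Besides the SQL temporary view used above, the connector also exposes a DataFrame read path. The snippet below is a minimal sketch, not part of the original session: it reuses the FE address, credentials, and table from the example above and assumes the doris.table.identifier / doris.fenodes option names from the connector documentation. Run it inside spark-shell, where the spark session is already defined.

// Read the same Doris table through the DataFrame API (sketch; adjust values to your cluster).
val dorisDF = spark.read
  .format("doris")
  .option("doris.table.identifier", "demo.example_tbl")
  .option("doris.fenodes", "10.63.0.181:8030")  // FE host and HTTP port
  .option("user", "root")
  .option("password", "")
  .load()

dorisDF.show()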
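Writing back to Doris follows the same pattern. This is likewise a hedged sketch: it assumes a target table with a matching schema and uses the connector's append save mode; adjust the options to your own cluster.

// Append the DataFrame back into a Doris table (sketch).
dorisDF.write
  .format("doris")
  .option("doris.table.identifier", "demo.example_tbl")
  .option("doris.fenodes", "10.63.0.181:8030")
  .option("user", "root")
  .option("password", "")
  .mode("append")
  .save()

Alternatively, with the temporary view created earlier, a plain spark.sql("INSERT INTO spark_doris VALUES (...)") statement also writes through the connector. That covers both reading from and writing back to Doris from Spark.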