Notes on Common Kafka Problems and Their Fixes
I. Kafka topic creation fails: Replication factor: 1 larger than available brokers: 0
1. The create command is shown below. Out of habit, the ZooKeeper parameter was given as zxy:2181/kafka, but the command failed with: Error while executing topic command : Replication factor: 1 larger than available brokers: 0.
[root@zxy bin]# kafka-topics.sh --create --topic tp1 --zookeeper zxy:2181/kafka --partitions 3 --replication-factor 1
Error while executing topic command : Replication factor: 1 larger than available brokers: 0.
[2023-03-27 17:15:46,605] ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 1 larger than available brokers: 0.
(kafka.admin.TopicCommand$)
2. Checking each broker's server.properties showed that zookeeper.connect is set to zxy:2181,zxy:2182,zxy:2183 with no /kafka chroot, so the brokers register their metadata under the ZooKeeper root path. A create command pointed at zxy:2181/kafka therefore finds no brokers under /kafka, hence "available brokers: 0".
[root@zxy config]# cat server01.properties
broker.id=1
port=9091
listeners=PLAINTEXT://localhost:9091
log.dirs=/zxy/apps/kafkaCluster/kafkaLog01
zookeeper.connect=zxy:2181,zxy:2182,zxy:2183
3. Specifying the --zookeeper parameter as zxy:2181, matching where the brokers actually registered, the topic is created successfully.
[root@zxy bin]# kafka-topics.sh --create --topic tp1 --zookeeper zxy:2181 --partitions 3 --replication-factor 1
Created topic "tp1".
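To confirm where the brokers registered, the ZooKeeper tree can be inspected directly. A minimal sketch, assuming the same Kafka bin directory as the commands above (output format varies by version):
zookeeper-shell.sh zxy:2181 ls /brokers/ids          # should list broker ids 1, 2, 3 under the root path
zookeeper-shell.sh zxy:2181 ls /kafka/brokers/ids    # empty or missing, which is why the first create command saw 0 brokers
If the /kafka chroot is actually wanted, it would have to be appended to the end of zookeeper.connect on every broker (e.g. zookeeper.connect=zxy:2181,zxy:2182,zxy:2183/kafka) and the brokers restarted before creating topics with --zookeeper zxy:2181/kafka.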
II. Server error: Cannot allocate memory
Problem: the server could not allocate enough memory, so Kafka would not start.
Attempt 1: following the usual advice found online, vm.overcommit_memory was set to 1, but this did not help much with this particular problem.
Three ways to set it as root:
1. Edit /etc/sysctl.conf, set vm.overcommit_memory=1, then run sysctl -p to apply it (persists across reboots).
2. sysctl vm.overcommit_memory=1
3. echo 1 > /proc/sys/vm/overcommit_memory (takes effect immediately, but is lost on reboot unless the setting is also added to /etc/sysctl.conf)
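Whichever method is used, the current value can be confirmed with a quick check:
sysctl vm.overcommit_memory    # prints "vm.overcommit_memory = 1" once the change is in effect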
Attempt 2: reducing the heap given to the Kafka broker (the default KAFKA_HEAP_OPTS block in bin/kafka-server-start.sh), which still did not solve the problem.
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
fi
changed to
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
fi
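Since the script only applies its default when KAFKA_HEAP_OPTS is unset, an equivalent approach that avoids editing the script is to export the variable before starting the broker. A minimal sketch (the config file path is an assumption; use the cluster's actual server01.properties):
export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
kafka-server-start.sh -daemon ../config/server01.properties    # hypothetical path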
Attempt 3: finally it occurred to me that ZooKeeper was already running and might itself be holding too much memory, so the heap given to ZooKeeper was reduced as well (when ZooKeeper is started with Kafka's bundled bin/zookeeper-server-start.sh, it reads the same KAFKA_HEAP_OPTS block):
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
fi
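Before restarting, it is worth confirming that the host actually has room for both JVMs. A quick check (figures will differ per machine):
free -m                          # total / used / available memory in MB
ps -o pid,rss,args -C java       # resident memory of the running Java processes (ZooKeeper, Kafka)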
III. Offset Explorer connection problems: "Timeout expired while fetching topic metadata" and "Unable to find any brokers"
1. Timeout expired while fetching topic metadata
1. The ZooKeeper connection address was configured in Offset Explorer.
2. When browsing Topics, the error "Timeout expired while fetching topic metadata" appeared.
3. Investigation pointed to the advertised.listeners setting in Kafka's server.properties: it advertised localhost, an address a remote client cannot use to reach the broker.
Before:
advertised.listeners=PLAINTEXT://localhost:9091
After:
advertised.listeners=PLAINTEXT://:9091
4. After updating the configuration file, restart Kafka.
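For reference: with the host part left empty, Kafka falls back to the machine's canonical hostname (zxy here) when it registers the advertised address, which is exactly why the hosts-file mapping in the next problem becomes necessary. An alternative sketch, assuming clients can resolve the hostname zxy, is to advertise it explicitly:
advertised.listeners=PLAINTEXT://zxy:9091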
2. Unable to find any brokers
1. After reconnecting, the error "Unable to find any brokers" appeared. The cause: Offset Explorer (formerly Kafka Tool) reaches the brokers by the hostname they advertise, so the Kafka server's hostname-to-IP mapping has to be added to the Windows hosts file under C:\Windows\System32\drivers\etc.
2. Open C:\Windows\System32\drivers\etc, edit the hosts file, and add the mapping for the Kafka host:
C:\Windows\System32\drivers\etc\hosts
125.262.96.387 zxy
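After saving the file, name resolution can be verified from a Windows command prompt (ping reads the hosts file, so this is a quick sanity check):
ping zxy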
3. Disconnect, then enter the Kafka service address under Advanced > Bootstrap servers and reconnect.
4. After reconnecting, the Topics are listed normally again.
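The same connectivity can be verified from the command line on any machine that resolves zxy, using a stock Kafka CLI tool; a minimal sketch:
kafka-broker-api-versions.sh --bootstrap-server zxy:9091    # succeeds only if the advertised address is resolvable and the broker is reachable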
IV. Data loss when streaming Kafka data into Hudi
1. The error
Caused by: java.lang.IllegalStateException: Cannot fetch offset 196 (GroupId: spark-kafka-source-6f1df211-fdcb-4bcc-813d-55c4f9661c9d-1732697149-executor, TopicPartition: news-0).
Some data may have been lost because they are not available in Kafka any more; either the
data was aged out by Kafka or the topic may have been deleted before all the data in the
topic was processed. If you don't want your streaming query to fail on such cases, set the
source option "failOnDataLoss" to "false".
at org.apache.spark.sql.kafka010.InternalKafkaConsumer$.org$apache$spark$sql$kafka010$InternalKafkaConsumer$$reportDataLoss0(KafkaDataConsumer.scala:642)
at org.apache.spark.sql.kafka010.InternalKafkaConsumer.org$apache$spark$sql$kafka010$InternalKafkaConsumer$$reportDataLoss(KafkaDataConsumer.scala:448)
at org.apache.spark.sql.kafka010.InternalKafkaConsumer$$anonfun$get$1.apply(KafkaDataConsumer.scala:269)
at org.apache.spark.sql.kafka010.InternalKafkaConsumer$$anonfun$get$1.apply(KafkaDataConsumer.scala:234)
at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:77)
at org.apache.spark.sql.kafka010.InternalKafkaConsumer.runUninterruptiblyIfPossible(KafkaDataConsumer.scala:209)
at org.apache.spark.sql.kafka010.InternalKafkaConsumer.get(KafkaDataConsumer.scala:234)
Translated (full driver-side failure message):
Final application status: FAILED, exitCode: 15 (reason: the user class threw org.apache.spark.sql.streaming.StreamingQueryException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: lost task 0.3 in stage 2.0 (TID 5, hadoop, executor 1): java.lang.IllegalStateException: Cannot fetch offset 196 (GroupId: spark-kafka-source-e2868915-6d7a-4aef-99a8-3d1c5ef45147-1732697149-executor, TopicPartition: news-0).
Some data may have been lost because it is no longer available in Kafka; either the data was aged out by Kafka's retention or the topic was deleted before all of its data was processed. If the streaming query should not fail in such cases, set the source option "failOnDataLoss" to "false".
2. Following the hint, add the option to the Kafka source -> option("failOnDataLoss", "false")
// 5. Read the Kafka source as a streaming DataFrame
val df: DataFrame = spark.readStream.format("kafka")
  .option("kafka.bootstrap.servers", params.brokerList)           // broker list
  .option("subscribe", params.topic)                              // topic to consume
  .option("startingOffsets", "latest")                            // start from the latest offsets on the first run
  .option("kafka.consumer.commit.groupid", "action-log-group01")
  .option("failOnDataLoss", "false")                              // do not fail the query when offsets are no longer available
  .load()
Tip: setting this option feels like a workaround rather than a real fix, but a better approach has not been found yet.
If anyone knows a better way, pointers are welcome.
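One direction worth noting: failOnDataLoss=false only suppresses the error, so records that have already aged out of Kafka are silently skipped rather than recovered. Since the root cause is that the committed offset (196 here) fell outside the topic's retention, a hedged alternative, assuming the cluster has disk to spare, is to lengthen the retention of the news topic so the streaming job has more time to catch up (the value is illustrative; newer Kafka versions use --bootstrap-server instead of --zookeeper):
kafka-configs.sh --zookeeper zxy:2181 --alter --entity-type topics --entity-name news --add-config retention.ms=604800000    # keep data for 7 days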