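Preface
Recently I ran into another problem in production: after consuming for a while, the job would suddenly stop consuming, without reporting any error. After watching it happen a few times, I noticed that it mostly stalled during the morning and evening rush hours, when the data volume peaks. My guess was that too much data was arriving at once for the job to keep up. I had not yet had time to think through a proper fix, so I decided to increase the parallelism as a stopgap. That is where the story begins: after I increased the parallelism, Flink failed right at startup, the exception shown gave no useful hint, and all I could see was that one task had failed.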
1. Troubleshooting
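At first I simply changed the parallelism back to its original value, and the error went away. But then I realized this was pointless, because it did not solve my actual problem. So I had to look for an error message somewhere, and I started by locating the subtask marked as failed.
Then I went to check the log of the TaskManager that had run it.
In cluster mode there is usually more than one TaskManager, so you first have to find the one that ran the failed subtask. If you only have a single TaskManager, you can simply open its log, download it, and read it.
Once I had found the right log, I downloaded it and searched for "exception":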
SlidingEventTimeWindows(86400000, 3600000), EventTimeTrigger, CountAverageFunction, LogResultWindowFunction) (338/600)#0 (32c211205b71930916d89b21c0be3058) switched from RUNNING to FAILED with failure cause: java.io.IOException: Insufficient number of network buffers: required 2, but only 0 available. The total number of network buffers is currently set to 131072 of 32768 bytes each. You can increase this number by setting the configuration keys 'taskmanager.memory.network.fraction', 'taskmanager.memory.network.min', and 'taskmanager.memory.network.max'.
at org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.tryRedistributeBuffers(NetworkBufferPool.java:457)
at org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.requestMemorySegments(NetworkBufferPool.java:187)
at org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.requestMemorySegments(NetworkBufferPool.java:60)
at org.apache.flink.runtime.io.network.partition.consumer.BufferManager.requestExclusiveBuffers(BufferManager.java:142)
at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.setup(RemoteInputChannel.java:160)
at org.apache.flink.runtime.io.network.partition.consumer.RemoteRecoveredInputChannel.toInputChannelInternal(RemoteRecoveredInputChannel.java:77)
at org.apache.flink.runtime.io.network.partition.consumer.RecoveredInputChannel.toInputChannel(RecoveredInputChannel.java:106)
at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.convertRecoveredInputChannels(SingleInputGate.java:315)
at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:298)
at org.apache.flink.runtime.taskmanager.InputGateWithMetrics.requestPartitions(InputGateWithMetrics.java:127)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:353)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:809)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:761)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.lang.Thread.run(Thread.java:748)
Well, the answer is written right there in the error message.
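To make the numbers in the error message concrete, here is a rough back-of-the-envelope sketch. It takes the buffer count and buffer size from the log above, and assumes Flink's documented defaults of 2 exclusive buffers per input channel (which matches the "required 2" in the message) and 8 floating buffers per input gate. It also assumes the upstream operator runs at parallelism 600 and that every input channel is remote, so treat the result as an upper-bound estimate, not as what Flink actually allocates.

```python
# Rough estimate of network buffer usage on the input side of a shuffle.
# Values marked "from the log" come from the error message; everything else
# is an assumption for illustration.

def network_memory_bytes(num_buffers: int, buffer_size: int) -> int:
    """Total network memory represented by the buffer pool."""
    return num_buffers * buffer_size


def input_buffers_per_subtask(upstream_parallelism: int,
                              buffers_per_channel: int = 2,
                              floating_per_gate: int = 8) -> int:
    """Buffers one subtask needs for a single all-to-all input gate,
    assuming every upstream channel is remote."""
    return upstream_parallelism * buffers_per_channel + floating_per_gate


TOTAL_BUFFERS = 131072   # from the log
BUFFER_SIZE = 32768      # from the log, in bytes
PARALLELISM = 600        # from "(338/600)"; assumed to apply upstream too

total_gib = network_memory_bytes(TOTAL_BUFFERS, BUFFER_SIZE) / 2**30
per_subtask = input_buffers_per_subtask(PARALLELISM)
subtasks_that_fit = TOTAL_BUFFERS // per_subtask

print(f"network memory per TaskManager: {total_gib} GiB")
print(f"input buffers per subtask:      {per_subtask}")
print(f"subtasks that fit in the pool:  {subtasks_that_fit}")
```

Under these assumptions the pool is already 4 GiB, and each subtask of the window operator wants about 1208 buffers for its input alone. Raising the parallelism therefore increases buffer demand roughly quadratically, which would explain why the job only started failing after the change.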
2. Solution
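So what are we waiting for? Just increase the three parameters named in the error message (taskmanager.memory.network.fraction, taskmanager.memory.network.min, and taskmanager.memory.network.max) in the yml configuration, and that's it.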
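As a sketch of what that change might look like, the fragment below sets the three keys in the Flink configuration file (typically flink-conf.yaml; the file name is an assumption here). The values are purely illustrative and are not the ones used in this incident. Flink derives network memory as the given fraction of total Flink memory and then clamps it to the min/max range, so the TaskManager's overall memory has to be large enough to accommodate whatever you configure.

```yaml
# Illustrative values only; size these for your own cluster.
# Network memory = fraction of total Flink memory, clamped to [min, max].
taskmanager.memory.network.fraction: 0.2
taskmanager.memory.network.min: 1gb
taskmanager.memory.network.max: 8gb
```

If the derived value is already capped by the max setting, raising only the fraction has no effect, so it is worth checking which of the three bounds is currently the limiting one.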