com.alibaba.nacos.api.exception.NacosException: Request nacos server failed:)
Client service exception log:
[2023-04-05 19:33:09.205] [ERROR] [com.alibaba.nacos.client.remote.worker] c.a.n.c.r.c.g.GrpcClient.printIfErrorEnabled [LoggerUtils.java:99] - Server check fail, please check server **.*.*.*** ,port 9848 is available , error ={}
java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 133573 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@68226c97[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@7add5b19, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@a1c659e, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@2517bae6}}}}}]]
at com.alibaba.nacos.shaded.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:508)
at com.alibaba.nacos.common.remote.client.grpc.GrpcClient.serverCheck(GrpcClient.java:192)
at com.alibaba.nacos.common.remote.client.grpc.GrpcClient.connectToServer(GrpcClient.java:302)
at com.alibaba.nacos.common.remote.client.RpcClient.reconnect(RpcClient.java:521)
at com.alibaba.nacos.common.remote.client.RpcClient.lambda$start$2(RpcClient.java:369)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2023-04-05 19:33:09.206] [INFO ] [com.alibaba.nacos.client.remote.worker] c.a.n.c.r.client.printIfInfoEnabled [LoggerUtils.java:60] - [d6f69698-6284-4758-9812-163aea9a1cd6] Fail to connect server, after trying 5 times, last try server is {serverIp = '**.*.*.***', server main port = 8848}, error = unknown
[2023-04-05 19:33:10.828] [ERROR] [com.alibaba.nacos.client.naming.updater.1] c.a.n.c.r.client.printIfErrorEnabled [LoggerUtils.java:99] - Send request fail, request = ServiceQueryRequest{headers={app=unknown}, requestId='null'}, retryTimes = 0, errorMessage = Client not connected, current status:UNHEALTHY
[2023-04-05 19:33:10.928] [ERROR] [com.alibaba.nacos.client.naming.updater.1] c.a.n.c.r.client.printIfErrorEnabled [LoggerUtils.java:99] - Send request fail, request = ServiceQueryRequest{headers={app=unknown}, requestId='null'}, retryTimes = 1, errorMessage = Client not connected, current status:UNHEALTHY
[2023-04-05 19:33:11.029] [ERROR] [com.alibaba.nacos.client.naming.updater.1] c.a.n.c.r.client.printIfErrorEnabled [LoggerUtils.java:99] - Send request fail, request = ServiceQueryRequest{headers={app=unknown}, requestId='null'}, retryTimes = 2, errorMessage = Client not connected, current status:UNHEALTHY
[2023-04-05 19:33:11.029] [WARN ] [com.alibaba.nacos.client.naming.updater.1] c.a.n.client.naming.run [ServiceInfoUpdateService.java:198] - [NA] failed to update serviceName: UAT_GROUP@@dfcg-contract
com.alibaba.nacos.api.exception.NacosException: Request nacos server failed:
[NA] failed to update serviceName: UAT_GROUP@@dfcg-contract
com.alibaba.nacos.api.exception.NacosException: Request nacos server failed:
at com.alibaba.nacos.client.naming.remote.gprc.NamingGrpcClientProxy.requestToServer(NamingGrpcClientProxy.java:288)
at com.alibaba.nacos.client.naming.remote.gprc.NamingGrpcClientProxy.queryInstancesOfService(NamingGrpcClientProxy.java:168)
at com.alibaba.nacos.client.naming.remote.NamingClientProxyDelegate.queryInstancesOfService(NamingClientProxyDelegate.java:110)
at com.alibaba.nacos.client.naming.core.ServiceInfoUpdateService$UpdateTask.run(ServiceInfoUpdateService.java:185)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.alibaba.nacos.api.exception.NacosException: Client not connected, current status:UNHEALTHY
at com.alibaba.nacos.common.remote.client.RpcClient.request(RpcClient.java:651)
at com.alibaba.nacos.common.remote.client.RpcClient.request(RpcClient.java:631)
at com.alibaba.nacos.client.naming.remote.gprc.NamingGrpcClientProxy.requestToServer(NamingGrpcClientProxy.java:278)
... 10 common frames omitted
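The client log mentions two ports: the main port 8848 and port 9848, which is the gRPC port that Nacos 2.x clients derive by adding 1000 to the main port. A minimal sketch for checking that both ports are reachable from the client machine is shown here. The host value is a placeholder (the real server IP is masked in the log), and the helper names grpc_port and is_port_open are made up for this illustration.

```python
import socket

# Nacos 2.x clients reach the server's gRPC endpoint at main port + 1000.
GRPC_PORT_OFFSET = 1000


def grpc_port(main_port: int) -> int:
    """Return the gRPC port a Nacos 2.x client derives from the main port."""
    return main_port + GRPC_PORT_OFFSET


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    host = "127.0.0.1"  # placeholder: replace with the real Nacos server IP
    main_port = 8848
    for port in (main_port, grpc_port(main_port)):
        state = "open" if is_port_open(host, port) else "unreachable"
        print(f"{host}:{port} is {state}")
```

If both ports are reachable and the timeouts still occur, a firewall or network problem becomes less likely and the server-side log is the next place to look.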
Nacos server exception log:
[devuser@dev-cg-platform logs]$ grep 'Thread starvation or clock leap detected' nacos.log.2023-04-05.0
2023-04-05 19:33:42,633 WARN HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m8s776ms847μs171ns).
[devuser@dev-cg-platform logs]$ grep 'Thread starvation or clock leap detected' nacos.log.2023-04-04.0
2023-04-04 06:38:02,534 WARN HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m46s556ms115μs835ns).
[devuser@dev-cg-platform logs]$ grep 'Thread starvation or clock leap detected' nacos.log.2023-04-03.0
2023-04-03 13:00:45,036 WARN HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=45s336ms970μs889ns).
[devuser@dev-cg-platform logs]$ grep 'Thread starvation or clock leap detected' nacos.log.2023-04-02.0
2023-04-02 23:30:59,155 WARN HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m20s824ms191μs759ns).
[devuser@dev-cg-platform logs]$ grep 'Thread starvation or clock leap detected' nacos.log.2023-04-01.0
2023-04-01 20:07:07,371 WARN HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m12s612ms153μs480ns).
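At first I did not pay attention to a single day's exceptions. After checking several consecutive days, however, I found a common pattern: whenever the nacos log shows Thread starvation or clock leap detected, the client services report connection timeouts at around the same time.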
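The correlation described above can be checked mechanically. The sketch below parses the housekeeper delta from a server warning line, derives the window during which the server may have been stalled (from the warning timestamp minus the delta up to the warning timestamp), and tests whether a client error timestamp falls inside it. It assumes both machines have synchronized clocks in the same time zone. The delta also includes the normal interval between two housekeeper runs, so the window is an upper bound rather than the exact stall. All function names are made up for this illustration.

```python
import re
from datetime import datetime, timedelta

# Nanoseconds per unit used in the "housekeeper delta=..." string.
UNIT_NS = {
    "d": 86_400 * 10**9, "h": 3_600 * 10**9, "m": 60 * 10**9, "s": 10**9,
    "ms": 10**6, "\u03bcs": 10**3, "\u00b5s": 10**3, "ns": 1,
}
# Two-letter units must be tried before the one-letter ones.
DELTA_PART_RE = re.compile("(\\d+)(ms|\u03bcs|\u00b5s|ns|d|h|m|s)")
WARNING_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) .*housekeeper delta=([^)]+)\)"
)
CLIENT_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.(\d{3})\]")


def parse_delta(text: str) -> timedelta:
    """Convert a string such as '1m8s776ms847μs171ns' to a timedelta."""
    total_ns = sum(int(n) * UNIT_NS[u] for n, u in DELTA_PART_RE.findall(text))
    return timedelta(microseconds=total_ns // 1000)


def _timestamp(date_part: str, millis: str) -> datetime:
    base = datetime.strptime(date_part, "%Y-%m-%d %H:%M:%S")
    return base + timedelta(milliseconds=int(millis))


def stall_window(server_line: str):
    """Return (start, end) of the possible stall, or None if no match."""
    m = WARNING_RE.match(server_line)
    if m is None:
        return None
    end = _timestamp(m.group(1), m.group(2))
    return end - parse_delta(m.group(3)), end


def client_time(client_line: str):
    """Return the timestamp of a client log line, or None if no match."""
    m = CLIENT_RE.match(client_line)
    return _timestamp(m.group(1), m.group(2)) if m else None


if __name__ == "__main__":
    server = ("2023-04-05 19:33:42,633 WARN HikariPool-1 - Thread starvation "
              "or clock leap detected (housekeeper delta=1m8s776ms847μs171ns).")
    client = "[2023-04-05 19:33:09.205] [ERROR] Server check fail"
    start, end = stall_window(server)
    print("possible stall:", start, "to", end)
    print("client error inside window:", start <= client_time(client) <= end)
```

Running the same check over several days of logs gives a quick way to confirm or reject the pattern instead of comparing timestamps by eye.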
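Cause of the problem:
Nacos is a platform for service discovery and configuration management, and it runs a number of periodic background tasks. The warning itself is logged by HikariCP, the database connection pool used by the Nacos server (hence the HikariPool-1 prefix): its "housekeeper" is a periodic maintenance task, and when that task runs much later than scheduled it reports Thread starvation or clock leap detected together with the observed delay (housekeeper delta). A delta between 45 seconds and almost 2 minutes, as in the logs above, means that the Nacos process was blocked or starved for a long time (for example by a long GC pause, a lack of CPU or memory, or a suspended virtual machine) or that the system clock jumped. While the process is stalled, Nacos cannot work normally and client requests time out.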
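How a periodic task can detect that it was starved can be sketched as follows: on every run it compares the current time with the time of the previous run. The sketch is modeled loosely on what the HikariCP housekeeper is understood to do. The 30-second period, the 1.5x threshold and the 128 ms tolerance are assumptions for illustration, not values taken from the logs above, and classify_tick is a hypothetical name.

```python
# Assumed values for illustration; check the HikariCP source for the real ones.
HOUSEKEEPING_PERIOD_MS = 30_000
RETROGRADE_TOLERANCE_MS = 128


def classify_tick(previous_ms: int, now_ms: int,
                  period_ms: int = HOUSEKEEPING_PERIOD_MS) -> str:
    """Classify one run of a periodic task from two clock readings.

    previous_ms is the clock reading at the previous run and now_ms the
    reading at the current run, both in milliseconds.
    """
    if now_ms + RETROGRADE_TOLERANCE_MS < previous_ms + period_ms:
        # The task appears to run too early: the clock went backwards.
        return "retrograde clock change"
    if now_ms - previous_ms > (3 * period_ms) // 2:
        # The task ran far too late: the thread was starved of CPU time
        # (GC pause, overloaded host, frozen VM) or the clock jumped forward.
        return "thread starvation or clock leap"
    return "ok"


if __name__ == "__main__":
    # Deltas taken from the server log above, in milliseconds.
    for delta_ms in (30_000, 45_336, 68_776, 106_556):
        print(delta_ms, "->", classify_tick(0, delta_ms))
```

Under these assumptions, the smallest delta in the logs (45s336ms) lies just above the 45-second threshold, which is consistent with a 30-second period. The check only observes the delay; it says nothing about which thread or resource caused it.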
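Solutions:
Increase the number of housekeeper threads: edit the Nacos configuration file (such as application.properties) to raise the number of housekeeper threads so that they finish their tasks sooner and spend less time blocked.
Adjust the housekeeper interval: tune the execution interval of the housekeeper to the actual workload so that its tasks run more efficiently and occupy the thread for less time.
Check system resources: if resources such as CPU or memory are insufficient, the housekeeper thread can be blocked or delayed for a long time. Monitor resource usage and add capacity in time to avoid the problem.
Upgrade Nacos: newer Nacos releases usually fix known issues, so if Thread starvation or clock leap detected keeps appearing, try upgrading Nacos and see whether the problem goes away.
In short, the key to avoiding Thread starvation or clock leap detected is to make sure the housekeeper thread can do its work efficiently and that system resources are sufficient.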
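Configuring the housekeeper
The housekeeper is the thread that runs periodic tasks such as deleting expired configuration data and cleaning up invalid caches. By default there is only one housekeeper thread, so when the system holds a lot of configuration data this thread may block and Nacos performance degrades.
To address this, the number of housekeeper threads and their execution interval can be set in the Nacos configuration file (such as application.properties). The property names below are given as in my notes; check them against the application.properties shipped with your Nacos version before relying on them. The steps are:
Set the number of housekeeper threads: the property nacos.core.thread.count specifies the number of housekeeper threads. Setting it to 10, for example, starts 10 housekeeper threads for the periodic tasks. The default is 1, that is, a single housekeeper thread.
# Set the number of housekeeper threads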
nacos.core.thread.count=10
Set the housekeeper execution interval:
The property nacos.core.task.maxTime specifies the interval at which the housekeeper runs its tasks. Setting it to 30 minutes, for example, runs the periodic tasks every 30 minutes. The default is 15 minutes.
# Set the housekeeper execution interval
nacos.core.task.maxTime=30m
Besides these two properties, other properties can be used to adjust the housekeeper's behavior, for example:
nacos.cmdb.dumpTaskInterval: the interval, in seconds, at which data is dumped from the external CMDB.
nacos.core.task.check.maxDelta: the maximum allowed difference between the system time and NTP time, used to detect whether the system time has been changed.
nacos.core.task.delete.task.interval: the interval at which the delete task runs.
In short, configuring the housekeeper can help improve the performance and stability of Nacos and avoid thread blocking and performance degradation.
nacos.cmdb.dumpTaskInterval configuration
# Interval for dumping data from the external CMDB (in seconds):
nacos.cmdb.dumpTaskInterval=3600
nacos.core.task.check.maxDelta configuration
# Set the maximum allowed difference between the system time and NTP time
nacos.core.task.check.maxDelta=600000
nacos.core.task.delete.task.interval configuration
# Set the interval for periodically cleaning up expired data
nacos.core.task.delete.task.interval=2h