Background
Customers frequently report communication failures: the MySQL error log of a service fills up with messages like the one below, and clients see intermittent "Got an error reading communication packet" errors. This article looks at why the error occurs and how to resolve it.
Aborted connection 1055898 to db: 'xxx' user: 'yyy' host: 'xxx.xxx.xxx.xxx' (Got timeout reading communication packets)
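Before digging into causes, it helps to know which users, hosts, and reasons dominate the log. The following is a minimal sketch that assumes the log lines follow the "Aborted connection" format shown above; the function name `summarize_aborted` and the sample list are illustrative, not part of MySQL.

```python
import re
from collections import Counter

# Pattern for the "Aborted connection" message format shown above.
ABORTED_RE = re.compile(
    r"Aborted connection (?P<id>\d+) to db: '(?P<db>[^']*)' "
    r"user: '(?P<user>[^']*)' host: '(?P<host>[^']*)' "
    r"\((?P<reason>[^)]*)\)"
)

def summarize_aborted(lines):
    """Count aborted connections per (user, host, reason)."""
    counts = Counter()
    for line in lines:
        match = ABORTED_RE.search(line)
        if match:
            counts[(match["user"], match["host"], match["reason"])] += 1
    return counts

# Illustrative input; in practice, iterate over the error log file.
sample = [
    "Aborted connection 1055898 to db: 'xxx' user: 'yyy' host: 'xxx.xxx.xxx.xxx' (Got timeout reading communication packets)",
    "[Warning] Aborted connection 305628 to db: 'db' user: 'dbuser' host: 'hostname' (Got an error reading communication packets)",
    "[Note] an unrelated line",
]

for key, n in summarize_aborted(sample).most_common():
    print(n, key)
```

Grouping by reason separates "Got timeout reading" from "Got an error reading", which tend to point to different causes (idle timeouts versus broken transfers).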
Official explanation
Here is what the official documentation says:
Aborted_connects:
If a client is unable even to connect, the server increments the Aborted_connects status variable.
- A client attempts to access a database but has no privileges for it.
- A client uses an incorrect password.
- A connection packet does not contain the right information.
- It takes more than connect_timeout seconds to obtain a connect packet.
# The limit is controlled by the connect_timeout system variable (the MySQL default is 10 seconds); unless the network is in extremely bad shape, this timeout is rarely hit.
Aborted_clients:
If a client successfully connects but later disconnects improperly or is terminated, the server increments the Aborted_clients status variable.
- The client program did not call mysql_close() before exiting…
- The client had been sleeping more than wait_timeout or interactive_timeout seconds without issuing any requests to the server.
- The client program ended abruptly in the middle of a data transfer.
Aborted_connects OR Aborted_clients:
Other reasons for problems with aborted connections or aborted clients:
- The max_allowed_packet variable value is too small or queries require more memory than you have allocated for mysqld.
- Use of Ethernet protocol with Linux, both half and full duplex. Some Linux Ethernet drivers have this bug.
- A problem with the thread library that causes interrupts on reads.
- Badly configured TCP/IP.
- Faulty Ethernets, hubs, switches, cables, and so forth. This can be diagnosed properly only by replacing hardware.
https://dev.mysql.com/doc/refman/5.7/en/communication-errors.html
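The two counters point to different phases of a connection, so comparing two snapshots of them narrows the search. The sketch below assumes the snapshots were built from `SHOW GLOBAL STATUS LIKE 'Aborted%'` and already converted to integers; the function name `classify_aborts` and the sample numbers are hypothetical.

```python
def classify_aborts(before, after):
    """Compare two snapshots of the Aborted_* status counters.

    Returns the two deltas plus hints on where to look first.
    """
    delta_connects = after["Aborted_connects"] - before["Aborted_connects"]
    delta_clients = after["Aborted_clients"] - before["Aborted_clients"]
    hints = []
    if delta_connects > 0:
        # Failures before the session was established.
        hints.append("connect phase: privileges, passwords, connect_timeout")
    if delta_clients > 0:
        # Failures after a successful connect.
        hints.append("established sessions: unclosed connections, "
                     "wait_timeout, max_allowed_packet")
    return delta_connects, delta_clients, hints

# Hypothetical snapshots taken some minutes apart.
before = {"Aborted_connects": 12, "Aborted_clients": 340}
after = {"Aborted_connects": 12, "Aborted_clients": 395}
print(classify_aborts(before, after))
```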
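For an "Aborted connection" message, this points to one of the following:
- The client connected normally but the connection ended abnormally (for example, the program did not close it properly).
- The client slept longer than wait_timeout or interactive_timeout, so MySQL closed the connection forcibly.
- The client was interrupted abnormally, or a query exceeded max_allowed_packet.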
Temporary configuration fix
Set the maximum packet size:
set global max_allowed_packet = 1024*1024*1024;
Check the packet size:
mysql> show variables like '%max_allowed_packet%';
+--------------------------+------------+
| Variable_name            | Value      |
+--------------------------+------------+
| max_allowed_packet       | 16777216   |
| slave_max_allowed_packet | 1073741824 |
+--------------------------+------------+
Permanent configuration fix
[mysqld]
max_allowed_packet=256M
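The settings above mix raw byte counts (1024*1024*1024, 16777216) with suffixed values (256M), which makes them awkward to compare. The sketch below converts both forms to bytes and does a rough size check on a payload; the helper names `parse_size` and `fits_in_packet` are hypothetical, and the check ignores protocol overhead, so treat it as an approximation.

```python
# Binary multipliers for the K/M/G suffixes accepted in option files.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Convert a size such as '256M' or '16777216' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)

def fits_in_packet(payload_bytes, max_allowed_packet):
    """Rough check: does a payload of this size fit in one packet?"""
    return payload_bytes <= max_allowed_packet

limit = parse_size("256M")
print(limit)
# A 300 MiB blob would not fit under the 256M setting.
print(fits_in_packet(300 * 1024 ** 2, limit))
```

Note that 1024*1024*1024 equals 1073741824, the same value reported for slave_max_allowed_packet in the output above.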
Of course, other causes are possible. Frankly, aborted connections are hard to diagnose, and the network or a firewall may also be involved.
Consider the following points:
- If many connections stay in the Sleep state for a long time, the application is not closing its database connections correctly and promptly. Closing connections properly in the application is strongly recommended; otherwise you depend on MySQL's wait_timeout to close them.
- Check the value of max_allowed_packet and make sure it is reasonable, so that clients do not receive "packet too large" messages. An unsuitable value causes connections to be aborted.
- Watch the number of TCP connections in the TIME_WAIT state. If netstat shows a large number of them, the application side should review how it closes connections.
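The first point, closing connections in the application, can be sketched with `contextlib.closing` from the Python standard library. `FakeConnection` is a hypothetical stand-in for a real driver's connection object; the only assumption is that the real object exposes a `close()` method.

```python
from contextlib import closing

class FakeConnection:
    """Hypothetical stand-in for a driver connection with close()."""

    def __init__(self):
        self.closed = False

    def query(self, sql):
        if self.closed:
            raise RuntimeError("connection already closed")
        return f"ran: {sql}"

    def close(self):
        self.closed = True

def run_query(connect, sql):
    # closing() calls conn.close() on exit, even if query() raises,
    # so the server never has to wait for wait_timeout to reclaim it.
    with closing(connect()) as conn:
        return conn.query(sql)

opened = []

def connect():
    conn = FakeConnection()
    opened.append(conn)  # kept only so the example can inspect it
    return conn

print(run_query(connect, "SELECT 1"))
print(opened[0].closed)
```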
Handling TIME_WAIT
netstat -ano|grep TIME_WAIT
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:55586 TIME_WAIT timewait (32.97/0/0)
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:55367 TIME_WAIT timewait (27.82/0/0)
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:55776 TIME_WAIT timewait (37.09/0/0)
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:56505 TIME_WAIT timewait (54.61/0/0)
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:55553 TIME_WAIT timewait (31.94/0/0)
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:56643 TIME_WAIT timewait (57.73/0/0)
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:55221 TIME_WAIT timewait (23.70/0/0)
tcp 0 0 xxx.xxx.xxx.xxx:10054 xxx.xxx.xxx.xxx:55920 TIME_WAIT timewait (41.18/0/0)
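With long output it is easier to count TIME_WAIT sockets per local endpoint than to read them line by line. The sketch below assumes whitespace-separated columns in the order shown above (protocol, Recv-Q, Send-Q, local address, foreign address, state); the addresses in the sample are made up.

```python
from collections import Counter

def count_time_wait(netstat_output):
    """Count TIME_WAIT sockets per local address:port."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) >= 6 and fields[5] == "TIME_WAIT":
            counts[fields[3]] += 1
    return counts

# Made-up sample in the same layout as the netstat output above.
sample = """\
tcp 0 0 10.0.0.1:10054 10.0.0.2:55586 TIME_WAIT timewait (32.97/0/0)
tcp 0 0 10.0.0.1:10054 10.0.0.2:55367 TIME_WAIT timewait (27.82/0/0)
tcp 0 0 10.0.0.1:3306 10.0.0.3:41000 ESTABLISHED keepalive (7200/0/0)
"""

for endpoint, n in count_time_wait(sample).most_common():
    print(endpoint, n)
```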
Network problem analysis
- Check whether the DNS configuration introduces latency.
- Check whether skip_name_resolve is configured. With this parameter set, MySQL verifies hosts by IP address instead of host name, and the host column in the grant tables must contain an IP address or localhost.
- Increase net_read_timeout and net_write_timeout, then observe whether the error still occurs.
- net_read_timeout rarely causes errors unless the network is in very poor condition.
MySQL timeout settings:
mysql> show variables like '%timeout%';
+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| connect_timeout             | 10       |
| interactive_timeout         | 1800     |
| lock_wait_timeout           | 31536000 |
| net_read_timeout            | 30       |
| net_write_timeout           | 60       |
| wait_timeout                | 1800     |
+-----------------------------+----------+
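Given the wait_timeout of 1800 seconds in the table above, one way to spot connections the application forgot to close is to look for Sleep sessions that are approaching that limit. The sketch below assumes rows shaped like SHOW PROCESSLIST output (Id, Command, Time); the function name, the 0.8 threshold, and the sample rows are all hypothetical.

```python
def sessions_near_timeout(processlist, wait_timeout, threshold=0.8):
    """Return ids of Sleep sessions idle for at least
    threshold * wait_timeout seconds."""
    limit = wait_timeout * threshold
    return [
        row["Id"]
        for row in processlist
        if row["Command"] == "Sleep" and row["Time"] >= limit
    ]

# Hypothetical rows; Time is the number of seconds in the current state.
rows = [
    {"Id": 11, "Command": "Sleep", "Time": 1700},
    {"Id": 12, "Command": "Query", "Time": 1750},
    {"Id": 13, "Command": "Sleep", "Time": 30},
]

print(sessions_near_timeout(rows, wait_timeout=1800))
```

Sessions flagged this way are candidates for a later "Aborted connection" entry once the server closes them.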
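- A connection is reported as aborted because it was not closed properly.
- Apart from the timeouts above, the server normally does not abort a connection by itself; when aborts appear, suspect a network problem between client and server before suspecting the server.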
Tools such as tcpdump and netstat -s can help examine the network side.
log_warnings: silencing the warning without fixing the cause
mysql> show global variables like '%log_warning%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_warnings  | 2     |
+---------------+-------+
1 row in set (0.00 sec)
If log_warnings is greater than 1, MySQL writes messages like the following to the error log:
[Warning] Aborted connection 305628 to db: 'db' user: 'dbuser' host: 'hostname' (Got an error reading communication packets)
[Warning] Aborted connection 305627 to db: 'db' user: 'dbuser' host: 'hostname' (Got an error reading communication packets)
You can change the value of log_warnings:
set global log_warnings=1;
A change made this way is lost after a restart. To make it permanent, set log_warnings = 1 in the configuration file mysql.cnf.