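Preface
Recently our game server process suddenly vanished on two occasions. The log context around each incident held nothing useful: the log simply stops dead at a certain point. The code at that point is simple, which rules out a crash caused by faulty logic. That reminded me of an earlier case where several processes were deployed on the same server and one was killed by the kernel for using too much memory, so I guessed the same thing had happened this time.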
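How to check
A web search turned up two ways to check: filter the system log directly, or use the dmesg command. Note that the commands below were tested on CentOS 7.
Searching the system log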
[root@demo]# grep "Out of memory" /var/log/messages
Apr 4 10:32:30 hk-dev kernel: Out of memory: Kill process 2434 (Game9) score 212 or sacrifice child
Apr 4 10:33:53 hk-dev kernel: Out of memory: Kill process 2476 (git) score 381 or sacrifice child
Apr 4 10:33:53 hk-dev kernel: Out of memory: Kill process 2777 (git) score 381 or sacrifice child
Searching with the dmesg command
[root@demo]# dmesg -T | grep "Out of memory"
[Mon Aug 30 12:06:25 2021] Out of memory: Kill process 22437 (git) score 400 or sacrifice child
[Wed Sep 22 20:23:52 2021] Out of memory: Kill process 29780 (Game6) score 161 or sacrifice child
[Wed Mar 29 15:54:31 2023] Out of memory: Kill process 29093 (git) score 388 or sacrifice child
[Tue Apr 4 10:24:05 2023] Out of memory: Kill process 2434 (Game9) score 212 or sacrifice child
[Tue Apr 4 10:25:29 2023] Out of memory: Kill process 2476 (git) score 381 or sacrifice child
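When the log holds many of these entries, it can help to count kills per process name. The following is a minimal sketch, not a standard tool: the function name summarize_oom_kills is hypothetical, and it assumes the exact message wording shown in the output above ("Out of memory: Kill process PID (NAME) score N"). Other kernel versions may word the message differently, in which case the pattern would need adjusting.

```shell
# Hypothetical helper (not part of any standard tool).
# Reads kernel log lines on stdin and prints "NAME COUNT" for every
# process name found in an "Out of memory: Kill process" line.
# Assumes the message wording shown above and names without spaces.
summarize_oom_kills() {
    # Keep only matching lines and reduce each one to the process name.
    sed -n 's/.*Out of memory: Kill process [0-9][0-9]* (\(.*\)) score [0-9][0-9]*.*/\1/p' |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'   # uniq -c prints "COUNT NAME"; swap the columns
}
```

It works with either source, for example `grep "Out of memory" /var/log/messages | summarize_oom_kills` or `dmesg -T | summarize_oom_kills`.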
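Why the process was killed
The Linux kernel has a mechanism called the OOM killer, short for Out Of Memory killer, and the name says it all. When the system runs critically short of memory, it picks a process, typically the one consuming the most memory, and kills it so that the machine does not run out of memory entirely.
How the kernel detects the shortage, then selects and kills a process, can be followed in the kernel source file linux/mm/oom_kill.c (I confirmed on 2023-04-04 23:24:07 that this file exists). When the system runs short of memory, out_of_memory() is triggered and calls select_bad_process() to choose a process to kill. The choice relies on oom_badness(), whose algorithm and reasoning are blunt but honest: find the process that uses the most memory.
The source code is as follows: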
/**
 * oom_badness - heuristic function to determine which candidate task to kill
 * @p: task struct of which task we should calculate
 * @totalpages: total present RAM allowed for page allocation
 *
 * The heuristic for determining which task to kill is made to be as simple and
 * predictable as possible. The goal is to return the highest value for the
 * task consuming the most memory to avoid subsequent oom failures.
 */
long oom_badness(struct task_struct *p, unsigned long totalpages)
{
    long points;
    long adj;

    if (oom_unkillable_task(p))
        return LONG_MIN;

    p = find_lock_task_mm(p);
    if (!p)
        return LONG_MIN;

    /*
     * Do not even consider tasks which are explicitly marked oom
     * unkillable or have been already oom reaped or the are in
     * the middle of vfork
     */
    adj = (long)p->signal->oom_score_adj;
    if (adj == OOM_SCORE_ADJ_MIN ||
            test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
            in_vfork(p)) {
        task_unlock(p);
        return LONG_MIN;
    }

    /*
     * The baseline for the badness score is the proportion of RAM that each
     * task's rss, pagetable and swap space use.
     */
    points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
        mm_pgtables_bytes(p->mm) / PAGE_SIZE;
    task_unlock(p);

    /* Normalize to oom_score_adj units */
    adj *= totalpages / 1000;
    points += adj;

    return points;
}
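To see how the last few lines of the function play out, here is a toy recalculation in shell arithmetic. It is only a sketch: the function name oom_points and every input number are made up for illustration, and it mirrors just the arithmetic (pages in use plus oom_score_adj scaled by totalpages / 1000), not the locking or the unkillable checks.

```shell
# Toy re-computation of the score arithmetic from the listing above.
# Hypothetical helper; all values are in pages and chosen for illustration.
# Usage: oom_points RSS_PAGES SWAP_PAGES PGTABLE_PAGES OOM_SCORE_ADJ TOTALPAGES
oom_points() {
    rss_pages=$1
    swap_pages=$2
    pgtable_pages=$3
    adj=$4
    totalpages=$5
    # Same order of operations as the C code: the integer division
    # totalpages / 1000 happens first, then the result is scaled by adj.
    echo $(( rss_pages + swap_pages + pgtable_pages + adj * (totalpages / 1000) ))
}
```

Assuming a machine with 1048576 pages (4 GiB at a 4 KiB page size), `oom_points 200000 0 500 0 1048576` prints 200500, while the same process with an oom_score_adj of -500, `oom_points 200000 0 500 -500 1048576`, prints -323500. A negative adjustment can therefore push a large process well below smaller ones in the ranking.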
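Summary
- If a process that has been running for a while suddenly disappears, the kernel may have bumped it off in a fit of jealousy
- Processes killed for running out of memory can be found directly in the system log
grep "Out of memory" /var/log/messages
- They can also be found with a dedicated command
dmesg -T | grep "Out of memory"
- I just looked at Linus's linux repository: commits were still landing yesterday. The updates truly never end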
A whole lifetime, and only a few happy days
The road forks in two; take whichever side you like
Not afraid, not afraid, just not afraid; I am young
Strong wind, heavy rain, blazing sun; I dare to fight for it