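How RescueParty sends a phone into Recovery
System version: Android 12
1. Case 1: a preinstalled third-party app crashes repeatedly and the device ends up in Recovery
1.1 What the available logs show:
This reboot into Recovery was caused by the app "com.rjio.slc" crashing repeatedly.
1.2 The evidence: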
W/RescueParty( 1090): Attempting rescue level FACTORY_RESET
D/PackageManager( 1090): Finished rescue level FACTORY_RESET for package com.rjio.slc // the device enters FACTORY_RESET: it reboots and asks the user whether to wipe user data
I/uncrypt ( 8502): received command: [--prompt_and_wipe_data
I/uncrypt ( 8502): --reason=RescueParty
I/uncrypt ( 8502): --locale=en_IN
I/uncrypt ( 8502): ] (59)
W/uncrypt ( 8502): [libfs_mgr]Warning: unknown flag: optional
D/AAL ( 862): 04-27 01:50:49.329 BL= 853,ESS= 256, 04-27 01:50:49.340 BL= 852,ESS= 256,
I/uncrypt ( 8502): received 0, exiting now
I/RecoverySystemService( 1090): uncrypt setup bcb successfully finished.
V/ShutdownCheckPoints( 1090): Binder shutdown checkpoint recorded with pid=1090
I/MtkSystemServiceFactoryImpl( 1090): Start : MTK Shutdown Thread
V/ShutdownCheckPoints( 1090): System server shutdown checkpoint recorded
// The exception thrown by the app:
E/AndroidRuntime( 8507): FATAL EXCEPTION: Thread-2
E/AndroidRuntime( 8507): Process: com.rjio.slc, PID: 8507
E/AndroidRuntime( 8507): java.lang.NullPointerException: Attempt to invoke virtual method 'int java.lang.String.length()' on a null object reference
E/AndroidRuntime( 8507): at com.rjio.slc.util.CommonUtility.getSalt(Unknown Source:8)
E/AndroidRuntime( 8507): at com.rjio.slc.util.CommonUtility.getHash(Unknown Source:17)
E/AndroidRuntime( 8507): at com.rjio.slc.util.CommonUtility.getDeviceInfoJsonParams(Unknown Source:96)
E/AndroidRuntime( 8507): at com.rjio.slc.util.CommonUtility.sendDeviceInfo(Unknown Source:2)
E/AndroidRuntime( 8507): at com.rjio.slc.util.NetworkCallThread$InternalThread.run(Unknown Source:228)
I/am_crash( 1090): [8507,0,com.rjio.slc,545799757,java.lang.NullPointerException,Attempt to invoke virtual method 'int java.lang.String.length()' on a null object reference,NULL,8]
W/ActivityManager( 1090): Process com.rjio.slc has crashed too many times, killing! Reason: crashed quickly
I/am_process_crashed_too_much( 1090): [0,com.rjio.slc,1000]
I/sysui_multi_action( 1090): [757,316,758,4,759,-1]
I/ShutdownThread( 1090): Shutting down activity manager...
Limited as they are, the logs confirm that the reboot was triggered by a third-party app crashing repeatedly.
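The two log lines that matter most in this case can be pulled out of a logcat dump mechanically. The following is a minimal sketch, assuming the dump is available as a list of text lines; `RescueLogScan` and its method names are hypothetical helpers written for this article, not part of AOSP.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of AOSP) that scans logcat lines for RescueParty evidence. */
final class RescueLogScan {
    // Line printed by RescueParty right before it executes a rescue level.
    private static final Pattern LEVEL =
            Pattern.compile("RescueParty.*Attempting rescue level (\\w+)");
    // Line printed by ActivityManager for a process that keeps crashing.
    private static final Pattern CRASH_LOOP =
            Pattern.compile("Process (\\S+) has crashed too many times");

    /** Returns capture group 1 of the first line that matches the pattern. */
    private static Optional<String> firstGroup(Pattern pattern, List<String> lines) {
        for (String line : lines) {
            Matcher m = pattern.matcher(line);
            if (m.find()) {
                return Optional.of(m.group(1));
            }
        }
        return Optional.empty();
    }

    static Optional<String> rescueLevel(List<String> lines) {
        return firstGroup(LEVEL, lines);
    }

    static Optional<String> crashLoopingProcess(List<String> lines) {
        return firstGroup(CRASH_LOOP, lines);
    }
}
```

Fed the log excerpt from section 1.2, `rescueLevel` yields FACTORY_RESET and `crashLoopingProcess` yields com.rjio.slc, which is the pairing that identifies this kind of failure.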
1.3 The call stack that leads to the reboot:
RescuePartyObserver.execute(…)
---executeRescueLevel(…) // matching log: Slog.w(TAG, "Attempting rescue level " + levelToString(level));
------executeRescueLevelInternal()
---------case LEVEL_FACTORY_RESET:
------------RecoverySystem.rebootPromptAndWipeUserData(context, TAG) // this triggers the reboot
---------------bootCommand(…) // matching log: --prompt_and_wipe_data
------------------rs.rebootRecoveryWithCommand(command.toString()) // asks the recovery service to reboot the system
The rest of this article looks at how that happens.
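2. How repeated app crashes lead to a system reboot
2.1 Question: how does the system detect that an app keeps crashing, and how does that trigger the reboot?
The key class is RescuePartyObserver, which is defined inside RescueParty and implements the PackageHealthObserver interface. It is registered with PackageWatchdog so that it receives package failure events and reacts to them. How it reacts is covered in section 3.
2.1.1 The RescuePartyObserver class
The PackageHealthObserver interface looks like this: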
/** Register instances of this interface to receive notifications on package failure. */
public interface PackageHealthObserver {
/**
* Called when the health check for {@code versionedPackage} fails.
*/
@PackageHealthObserverImpact int onHealthCheckFailed(
@Nullable VersionedPackage versionedPackage,
@FailureReasons int failureReason,
int mitigationCount);
/**
* Executes the mitigation for {@link #onHealthCheckFailed}.
*/
boolean execute(@Nullable VersionedPackage versionedPackage,
@FailureReasons int failureReason, int mitigationCount);
/**
* Called when the system server boots several times within the time window defined by {@link #mBootThreshold}.
*/
default @PackageHealthObserverImpact int onBootLoop(int mitigationCount) {
return PackageHealthObserverImpact.USER_IMPACT_NONE;
}
/**
* Executes the mitigation for {@link #onBootLoop}.
*/
default boolean executeBootLoopMitigation(int mitigationCount) {
return false;
}
/**
* The identifier of the observer must not change across device updates; otherwise the watchdog may drop the observed packages stored under the old name.
*/
String getName();
/**
* If this is set, the observer is not pruned even when it is not explicitly monitoring any package.
*/
default boolean isPersistent() {
return false;
}
/**
* Returns {@code true} if this observer wishes to observe the given package, {@code false} otherwise.
*/
default boolean mayObservePackage(String packageName) {
return false;
}
}
2.1.2 How RescuePartyObserver gets started:
During the startBootstrapServices phase of SystemServer startup, RescueParty.registerHealthObserver(mSystemContext) is called:
//frameworks/base/services/core/java/com/android/server/RescueParty.java
public static void registerHealthObserver(Context context) {
PackageWatchdog.getInstance(context).registerHealthObserver(
RescuePartyObserver.getInstance(context));
}
PackageWatchdog monitors the health of the packages on the system. The RescuePartyObserver instance is stored in its mAllObservers member. That member holds more than just RescuePartyObserver, and since these are observers, PackageWatchdog calls them back when something goes wrong.
//frameworks/base/services/core/java/com/android/server/PackageWatchdog.java
public void registerHealthObserver(PackageHealthObserver observer) {
synchronized (mLock) {
ObserverInternal internalObserver = mAllObservers.get(observer.getName());
if (internalObserver != null) {
internalObserver.registeredObserver = observer;
} else {
internalObserver = new ObserverInternal(observer.getName(), new ArrayList<>());
internalObserver.registeredObserver = observer;
mAllObservers.put(observer.getName(), internalObserver);
syncState("added new observer");
}
}
}
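The registration logic above keys observers by name: registering again under an existing name swaps the live instance but keeps the stored entry. A minimal sketch of that behavior, using stand-in types rather than the framework classes (`ObserverRegistrySketch`, `Named` and the name "rescue" are all hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for the name-keyed registration in PackageWatchdog; not the AOSP code. */
final class ObserverRegistrySketch {
    interface Observer {
        String getName();
    }

    /** Trivial observer used only for illustration. */
    static final class Named implements Observer {
        private final String name;
        Named(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    /** Mirrors the role of ObserverInternal: persistent state plus the live instance. */
    private static final class Entry {
        Observer registeredObserver;
    }

    private final Map<String, Entry> allObservers = new LinkedHashMap<>();

    synchronized void register(Observer observer) {
        Entry entry = allObservers.get(observer.getName());
        if (entry == null) {
            // First registration under this name: create the entry.
            entry = new Entry();
            allObservers.put(observer.getName(), entry);
        }
        // Either way, the newest instance becomes the one that is called back.
        entry.registeredObserver = observer;
    }

    synchronized int size() {
        return allObservers.size();
    }

    synchronized Observer get(String name) {
        Entry entry = allObservers.get(name);
        return entry == null ? null : entry.registeredObserver;
    }
}
```

This is also why the interface asks for a stable getName(): the name is the key under which the watchdog finds previously stored state.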
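Inside PackageWatchdog, the following methods either call into these observers to handle package failures or maintain the observer state:
1. onPackageFailure // called when a process fails because of a crash, an ANR, or an explicit health check
2. handleFailureImmediately // for native crashes and explicit health check failures, calls each observer directly to mitigate the error, bypassing the failure threshold logic
3. noteBoot // called when system_server starts; if system_server is found to be in a boot loop, queries every observer and executes the mitigation with the least user impact
4. writeNow // writes package information to file during shutdown by calling saveToFile; this currently adds roughly 7 ms to the shutdown thread
5. onHealthCheckPassed // called when a health check passes
6. onSupportedPackages // called during boot to notify when packages are ready on the device so that binding can start (TODO: how is this checked?)
7. getPackagesPendingHealthChecksLocked
8. getNextStateSyncMillisLocked // returns the duration, in milliseconds, until the next sync of the watchdog state
9. pruneObserversLocked // subtracts {@code elapsedMs} milliseconds from every duration on the monitored packages
10. loadFromFile
11. saveToFile // persists all observers to /data/system/package-watchdog.xml
12. dump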
2.1.2 The onPackageFailure() method
2.1.2.1 onPackageFailure() caller 1
When ActivityManagerService startup reaches the PHASE_THIRD_PARTY_APPS_CAN_START phase in onBootPhase(int phase), it calls mService.mPackageWatchdog.onPackagesReady(), which in turn calls registerConnectivityModuleHealthListener().
The function looks like this:
private void registerConnectivityModuleHealthListener() {
// TODO: have an internal method to trigger a rollback by reporting high severity errors,
// and rely on ActivityManager to inform the watchdog of severe network stack crashes
// instead of having this listener in parallel.
mConnectivityModuleConnector.registerHealthListener(
packageName -> {
final VersionedPackage pkg = getVersionedPackage(packageName);
if (pkg == null) {
Slog.wtf(TAG, "NetworkStack failed but could not find its package");
return;
}
final List<VersionedPackage> pkgList = Collections.singletonList(pkg);
onPackageFailure(pkgList, FAILURE_REASON_EXPLICIT_HEALTH_CHECK);
});
}
Here onPackageFailure is wrapped in a ConnectivityModuleHealthListener and registered with ConnectivityModuleConnector, a networking class. The intent appears to be that a failure in a networking package calls back into onPackageFailure so that it can be handled.
We leave the networking side aside for now.
2.1.2.2 onPackageFailure() caller 2
- RollbackPackageHealthObserver
- RollbackManagerServiceImpl
- RollbackManagerService
These classes belong to the rollback mechanism.
Rollback: Android 10.0 added a feature that lets an app be rolled back to its previous version when the user is unhappy with an update. A rollback can also be performed when a newly installed APK has problems that make it unusable. We skip the details for now.
The path that reaches onPackageFailure is:
SystemServer.startCoreServices()
---mSystemServiceManager.startService(ROLLBACK_MANAGER_SERVICE_CLASS)
------RollbackManagerService.onBootPhase(mCurrentPhase)
---------onBootCompleted()
------------RollbackPackageHealthObserver.onBootCompletedAsync()
---------------onBootCompleted()
------------------PackageWatchdog.getInstance(mContext).scheduleCheckAndMitigateNativeCrashes() // enters PackageWatchdog
---------------------mShortTaskHandler.post(()->checkAndMitigateNativeCrashes()) // mShortTaskHandler presumably posts the call to the SystemServer main thread
The code:
private void checkAndMitigateNativeCrashes() {
mNumberOfNativeCrashPollsRemaining--;
// Check if native watchdog reported a crash
if ("1".equals(SystemProperties.get("sys.init.updatable_crashing"))) {
// We rollback everything available when crash is unattributable
onPackageFailure(Collections.EMPTY_LIST, FAILURE_REASON_NATIVE_CRASH);
// we stop polling after an attempt to execute rollback, regardless of whether the
// attempt succeeds or not
} else {
if (mNumberOfNativeCrashPollsRemaining > 0) {
mShortTaskHandler.postDelayed(() -> checkAndMitigateNativeCrashes(),
NATIVE_CRASH_POLLING_INTERVAL_MILLIS);
}
}
}
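2.2 Starting from Thread's uncaught exception interface
2.2.1 The Thread.UncaughtExceptionHandler interface:
Its documentation says:
Interface for handlers invoked when a Thread abruptly terminates due to an uncaught exception.
1. When a thread is about to terminate due to an uncaught exception, the Java virtual machine queries the thread for its UncaughtExceptionHandler using getUncaughtExceptionHandler() and invokes the handler's uncaughtException method, passing the thread and the exception as arguments.
2. If a thread has not had its UncaughtExceptionHandler explicitly set, its ThreadGroup object acts as its UncaughtExceptionHandler. If the ThreadGroup object has no special requirements for dealing with the exception, it can forward the invocation to the default uncaught exception handler.
Thread holds two UncaughtExceptionHandler fields:
- Thread.UncaughtExceptionHandler uncaughtExceptionHandler: non-static. It is set per thread and belongs to that thread only, so its scope is small.
- static Thread.UncaughtExceptionHandler defaultUncaughtExceptionHandler: static. Every exception that is thrown and not caught, on any thread, passes through it. This is the handler installed when the process is forked, and its scope is the whole process.
ThreadGroup also implements the interface, because exception handling is handed to the ThreadGroup when the thread has no UncaughtExceptionHandler of its own.
The method to look at is uncaughtException. Note that a ThreadGroup can contain threads as well as other thread groups, so thread groups form a tree.
It works as follows:
- If the thread group has a parent group, the parent's uncaughtException() is called.
- Otherwise, if a default UncaughtExceptionHandler is set, its uncaughtException() is called.
- Otherwise, if the exception is not a ThreadDeath instance, the stack trace is printed to System.err.
// ThreadGroup.java
// The JVM calls this method when an uncaught exception occurs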
public void uncaughtException(Thread t, Throwable e) {
if (parent != null) {
parent.uncaughtException(t, e);
} else {
Thread.UncaughtExceptionHandler ueh =
Thread.getDefaultUncaughtExceptionHandler();
if (ueh != null) {
ueh.uncaughtException(t, e);
} else if (!(e instanceof ThreadDeath)) {
System.err.print("Exception in thread \""
+ t.getName() + "\" ");
e.printStackTrace(System.err);
}
}
}
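The dispatch order described above can be observed on a plain JVM. In this small sketch (the class name `UncaughtDispatchDemo` is made up for illustration), one thread has its own handler and one does not; the second falls through ThreadGroup to the process-wide default handler.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Demonstrates per-thread versus default uncaught exception handlers. */
final class UncaughtDispatchDemo {
    /** Crashes two threads and returns which handler saw each crash, in order. */
    static List<String> run() {
        List<String> seen = new CopyOnWriteArrayList<>();
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        // Process-wide fallback, comparable to the handler installed at fork time.
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> seen.add("default:" + t.getName()));
        try {
            Thread own = new Thread(UncaughtDispatchDemo::crash, "own");
            // A per-thread handler takes precedence over the default one.
            own.setUncaughtExceptionHandler((t, e) -> seen.add("per-thread:" + t.getName()));
            own.start();
            own.join();

            // No per-thread handler: ThreadGroup.uncaughtException forwards to the default.
            Thread plain = new Thread(UncaughtDispatchDemo::crash, "plain");
            plain.start();
            plain.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        } finally {
            // Restore whatever handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return seen;
    }

    private static void crash() {
        throw new IllegalStateException("boom");
    }
}
```

The handler runs on the dying thread before it terminates, so by the time join() returns the corresponding entry is already recorded.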
Any process that installs an UncaughtExceptionHandler and implements uncaughtException has its own way of dealing with exceptions. Searching the source tree for "implements Thread.UncaughtExceptionHandler" lists them.
2.2.2 How the system handles the exception:
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
There are two handler classes:
- LoggingHandler: records the exception.
  - If the system process system_server crashed, it logs "*** FATAL EXCEPTION IN SYSTEM PROCESS: XXX".
  - On an out-of-memory error, it logs "OutOfMemoryError IN SYSTEM PROCESS: Already dump hprof!".
  - For any other process, it logs "FATAL EXCEPTION: XXX" followed by the process name and PID.
- KillApplicationHandler: handles uncaught exceptions that kill the app.
  - 1. Calls LoggingHandler first, so the flow and the log output are the same as above.
  - 2. Stops tracing for the current process.
  - 3. Calls the AMS function handleApplicationCrash(), which also shows the crash dialog.
    - 3.1 Checks whether it is system_server that crashed.
    - 3.2 Calls handleApplicationCrashInner(...), which handles the crash decoupled from the AM-managed cleanup lifecycle.
      - 3.2.1 Looks the process up in PMS, mainly to tell whether the crash happened in an incrementally installed package; if so, Slog.e() prints "App crashed on incremental package".
      - 3.2.2 Writes the crash to the event log with EventLogTags.writeAmCrash(...).
      - 3.2.3 Adds the error to DropBox: addErrorToDropBox(...) writes the crash information under /data/system/dropbox.
      - 3.2.4 Calls AppErrors.crashApplication(...) for the remaining flow.
        - 3.2.4.1 mPackageWatchdog.onPackageFailure() (described below).
        - 3.2.4.2 mService.mProcessList.noteAppKill():
          - 3.2.4.2.1 Puts the process into mDyingProcesses so that it is killed shortly.
          - 3.2.4.2.2 mAppExitInfoTracker.scheduleNoteAppKill(..) sends KillHandler.MSG_APP_KILL to record information about the process before it is killed.
        - 3.2.4.3 Sends the ActivityManagerService.SHOW_ERROR_UI_MSG message, which shows the crash dialog and lets the user choose what to do.
  - 4. Finally calls Process.killProcess(Process.myPid()) to kill the process.
  - 5. Exits with System.exit(10).
Code:
//android/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
private static class LoggingHandler implements Thread.UncaughtExceptionHandler{..} // the default handler that logs uncaught exceptions
// Handles application death from an uncaught exception. The framework catches these for the main threads, so this only matters for threads created by applications. Before this method runs, the given instance of {@link LoggingHandler} should already have logged details (and if not, it is run first).
private static class KillApplicationHandler implements
Thread.UncaughtExceptionHandler{...}
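The relationship between the two handlers (log first, then report and terminate) can be sketched without any Android classes. Everything below is a simplified stand-in: the `terminate` callback replaces Process.killProcess() and System.exit(10) so that the sketch can run in a test, and the AMS reporting step is reduced to a comment.

```java
/** Simplified stand-in for the LoggingHandler / KillApplicationHandler pairing. */
final class CrashHandlerChainSketch {
    static final class LoggingHandler implements Thread.UncaughtExceptionHandler {
        volatile boolean triggered = false;
        final StringBuilder log = new StringBuilder();

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            triggered = true;
            log.append("FATAL EXCEPTION: ").append(t.getName()).append('\n');
        }
    }

    static final class KillHandler implements Thread.UncaughtExceptionHandler {
        private final LoggingHandler loggingHandler;
        private final Runnable terminate; // hypothetical hook standing in for kill + exit

        KillHandler(LoggingHandler loggingHandler, Runnable terminate) {
            this.loggingHandler = loggingHandler;
            this.terminate = terminate;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                // Make sure the crash is logged exactly once before anything else.
                if (!loggingHandler.triggered) {
                    loggingHandler.uncaughtException(t, e);
                }
                // The real handler reports the crash to ActivityManager at this point.
            } finally {
                // Termination happens even if logging or reporting throws.
                terminate.run();
            }
        }
    }
}
```

Putting termination in a finally block reflects the point made above: whatever happens while reporting, the crashed process does not survive.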
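3. How PackageWatchdog.onPackageFailure reacts
3.1 Overview
When step 3.2.4.1 above calls PackageWatchdog.onPackageFailure(), the logic is:
- 1. Immediate handling: when failureReason is FAILURE_REASON_NATIVE_CRASH or FAILURE_REASON_EXPLICIT_HEALTH_CHECK, handleFailureImmediately is called right away.
  - It takes only the first package in packages and hands that failing package to each PackageHealthObserver (native crashes and explicit health check failures normally take this path).
- 2. Normal handling: every package in packages is visited in turn and handed to a PackageHealthObserver, as follows:
  - 2.1 Find the observer with the least user impact (currently RollbackPackageHealthObserver, for package rollback, and RescueParty). This first requires the package's mitigation count, which records how many times the observers have mitigated crashes or other failures of that package. Each time an observer mitigates a failure of the package, the current time is appended to an array, and the mitigation count is the number of entries in it. The system tolerates only so many failures of an APK: once the count passes a certain value, the observers escalate to harsher actions such as rollback or reboot. It helps to read section 3.2 (judging the user impact of a package failure) at this point and then come back.
  - 2.2 We now have the observer with the lowest user impact together with its impact value, and that observer is used to mitigate the package failure:
    - 2.2.1 Increment the package's mitigation count.
    - 2.2.2 Fetch the updated mitigation count and execute the mitigation; see section 3.3 (mitigation).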
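The mitigation count described in step 2.1 is a list of timestamps. The sketch below models it as a sliding window; the class name, the explicit `nowMs` parameters and the idea that old entries expire after a fixed window are assumptions made for illustration (the text above only states that timestamps are stored and counted).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of a per-package mitigation counter backed by timestamps. */
final class MitigationCounterSketch {
    private final long windowMs;                          // assumed expiry window
    private final Deque<Long> calls = new ArrayDeque<>(); // one timestamp per mitigation

    MitigationCounterSketch(long windowMs) {
        this.windowMs = windowMs;
    }

    /** Records that an observer just mitigated a failure of the package. */
    void noteMitigationCall(long nowMs) {
        calls.addLast(nowMs);
    }

    /** Drops timestamps older than the window and returns how many remain. */
    int getMitigationCount(long nowMs) {
        while (!calls.isEmpty() && nowMs - calls.peekFirst() > windowMs) {
            calls.removeFirst();
        }
        return calls.size();
    }
}
```

With a counter like this, a package that crashes five times in quick succession climbs the whole escalation ladder, while one that crashes rarely keeps dropping back to a low count.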
//frameworks/base/services/core/java/com/android/server/PackageWatchdog.java
public void onPackageFailure(List<VersionedPackage> packages,
@FailureReasons int failureReason) {
//...
mLongTaskHandler.post(() -> {
synchronized (mLock) {
if (mAllObservers.isEmpty()) {
return;
}
boolean requiresImmediateAction = (failureReason == FAILURE_REASON_NATIVE_CRASH
|| failureReason == FAILURE_REASON_EXPLICIT_HEALTH_CHECK);
if (requiresImmediateAction) {
handleFailureImmediately(packages, failureReason);
} else {
for (int pIndex = 0; pIndex < packages.size(); pIndex++) {
VersionedPackage versionedPackage = packages.get(pIndex);
// Visit each failing package
PackageHealthObserver currentObserverToNotify = null;
int currentObserverImpact = Integer.MAX_VALUE;
MonitoredPackage currentMonitoredPackage = null;
// Find the observer with the least user impact.
// Each observer handles a package failure differently, and a reboot or factory reset should be avoided whenever possible
for (int oIndex = 0; oIndex < mAllObservers.size(); oIndex++) {
ObserverInternal observer = mAllObservers.valueAt(oIndex);
PackageHealthObserver registeredObserver = observer.registeredObserver;
if (registeredObserver != null
&& observer.onPackageFailureLocked(
versionedPackage.getPackageName())) {
MonitoredPackage p = observer.getMonitoredPackage(
versionedPackage.getPackageName());
int mitigationCount = 1;
if (p != null) {
// Read the package's mitigation count
mitigationCount = p.getMitigationCountLocked() + 1;
}
// The observer's onHealthCheckFailed() returns a user impact value for this failure, based on mitigationCount
int impact = registeredObserver.onHealthCheckFailed(
versionedPackage, failureReason, mitigationCount); // what the two observers do is shown in section 3.2
if (impact != PackageHealthObserverImpact.USER_IMPACT_NONE
&& impact < currentObserverImpact) {
// The usual "keep the minimum" step of a linear search
currentObserverToNotify = registeredObserver;
currentObserverImpact = impact;
currentMonitoredPackage = p;
}
}
}
// Execute action with least user impact
if (currentObserverToNotify != null) {
int mitigationCount = 1;
if (currentMonitoredPackage != null) {
// Increment the package's mitigation count
currentMonitoredPackage.noteMitigationCallLocked();
// Read the updated mitigation count
mitigationCount =
currentMonitoredPackage.getMitigationCountLocked();
}
// Execute the mitigation
currentObserverToNotify.execute(versionedPackage,
failureReason, mitigationCount);
}
}
}
}
});
}
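The selection loop in onPackageFailure boils down to "lowest impact that is not NONE wins". A minimal sketch with stand-in types follows; the numeric impact values and the precomputed `impact` field are assumptions for illustration, whereas the real code asks each observer through onHealthCheckFailed().

```java
import java.util.List;

/** Stand-in for the least-impact observer selection; not the AOSP code. */
final class LeastImpactSketch {
    // Assumed numeric values; only their relative order matters here.
    static final int USER_IMPACT_NONE = 0;
    static final int USER_IMPACT_LOW = 1;
    static final int USER_IMPACT_MEDIUM = 3;
    static final int USER_IMPACT_HIGH = 5;

    /** An observer together with the impact it reported for the failing package. */
    static final class Observer {
        final String name;
        final int impact;

        Observer(String name, int impact) {
            this.name = name;
            this.impact = impact;
        }
    }

    /** Returns the observer with the lowest impact other than NONE, or null if none applies. */
    static Observer pickLeastImpact(List<Observer> observers) {
        Observer best = null;
        int bestImpact = Integer.MAX_VALUE;
        for (Observer o : observers) {
            if (o.impact != USER_IMPACT_NONE && o.impact < bestImpact) {
                best = o;
                bestImpact = o.impact;
            }
        }
        return best;
    }
}
```

One consequence is worth spelling out: if the rollback observer reports NONE because the package has no rollback available (presumably the situation of the preinstalled app in section 1), RescueParty is the only candidate and is chosen even when its impact is HIGH.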
3.2 Judging the user impact of a package failure: PackageHealthObserver.onHealthCheckFailed
How the two observers handle it:
- RollbackPackageHealthObserver (rollback)
- RescueParty.RescuePartyObserver
3.2.1 RollbackPackageHealthObserver.java
- The impact on the user is at most medium, USER_IMPACT_MEDIUM:
  - For a native crash with any rollback available, it returns PackageHealthObserverImpact.USER_IMPACT_MEDIUM.
  - For other failures:
    - a rollback is available for the package: PackageHealthObserverImpact.USER_IMPACT_MEDIUM
    - no rollback is available: PackageHealthObserverImpact.USER_IMPACT_NONE
The rollback mechanism itself is skipped for now.
@Override
public int onHealthCheckFailed(@Nullable VersionedPackage failedPackage,
@FailureReasons int failureReason, int mitigationCount) {
// For native crashes, we will roll back any available rollbacks
if (failureReason == PackageWatchdog.FAILURE_REASON_NATIVE_CRASH
&& !mContext.getSystemService(RollbackManager.class)
.getAvailableRollbacks().isEmpty()) {
return PackageHealthObserverImpact.USER_IMPACT_MEDIUM;
}
if (getAvailableRollback(failedPackage) == null) {
// Don't handle the notification, no rollbacks available for the package
return PackageHealthObserverImpact.USER_IMPACT_NONE;
} else {
// Rollback is available, we may get a callback into #execute
return PackageHealthObserverImpact.USER_IMPACT_MEDIUM;
}
}
3.2.2 RescueParty.RescuePartyObserver
- When the reason is an app crash or an ANR, mapRescueLevelToUserImpact is called to derive the user impact (UserImpact) from the rescue level (RescueLevel).
  - getRescueLevel maps the mitigation count to a rescue level:
    - mitigationCount == 1: LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS
    - mitigationCount == 2: LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES
    - mitigationCount == 3: LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS
    - mitigationCount == 4: the lower of LEVEL_WARM_REBOOT and the maximum rescue level
    - mitigationCount >= 5: the lower of LEVEL_FACTORY_RESET and the maximum rescue level
    - Note: the maximum rescue level is decided by the persist.device_config.configuration.disable_rescue_party_factory_reset property:
      - true: the maximum is LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS
      - false or unset: the maximum is LEVEL_FACTORY_RESET
  - mapRescueLevelToUserImpact maps that rescue level to a user impact:
    - Low impact, USER_IMPACT_LOW:
      - LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS
      - LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES
    - High impact, USER_IMPACT_HIGH:
      - LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS
      - LEVEL_WARM_REBOOT
      - LEVEL_FACTORY_RESET
@Override
public int onHealthCheckFailed(@Nullable VersionedPackage failedPackage,
@FailureReasons int failureReason, int mitigationCount) {
if (!isDisabled() && (failureReason == PackageWatchdog.FAILURE_REASON_APP_CRASH
|| failureReason == PackageWatchdog.FAILURE_REASON_APP_NOT_RESPONDING)) {
return mapRescueLevelToUserImpact(getRescueLevel(mitigationCount));
} else {
return PackageHealthObserverImpact.USER_IMPACT_NONE;
}
}
//-----------
private static int getRescueLevel(int mitigationCount) {
if (mitigationCount == 1) {
return LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS;
} else if (mitigationCount == 2) {
return LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES;
} else if (mitigationCount == 3) {
return LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS;
} else if (mitigationCount == 4) {
return Math.min(getMaxRescueLevel(), LEVEL_WARM_REBOOT);
} else if (mitigationCount >= 5) {
return Math.min(getMaxRescueLevel(), LEVEL_FACTORY_RESET);
} else {
Slog.w(TAG, "Expected positive mitigation count, was " + mitigationCount);
return LEVEL_NONE;
}
}
//-----------
private static int mapRescueLevelToUserImpact(int rescueLevel) {
switch(rescueLevel) {
case LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS:
case LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES:
return PackageHealthObserverImpact.USER_IMPACT_LOW;
case LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS:
case LEVEL_WARM_REBOOT:
case LEVEL_FACTORY_RESET:
return PackageHealthObserverImpact.USER_IMPACT_HIGH;
default:
return PackageHealthObserverImpact.USER_IMPACT_NONE;
}
}
In short, the onHealthCheckFailed methods of the two observers decide how much a package failure affects the user:
- In the rollback path the impact is at most medium, USER_IMPACT_MEDIUM.
- In the RescueParty path the impact is the highest and readily reaches USER_IMPACT_HIGH.
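To see the whole escalation ladder at a glance, including the effect of the disable_rescue_party_factory_reset property, the mapping can be replayed in a small standalone program. This is a sketch, not the framework code: the numeric values of LEVEL_WARM_REBOOT and LEVEL_FACTORY_RESET are assumed (only their ordering follows from the Math.min calls), and the property is modeled as a plain boolean parameter.

```java
/** Standalone replay of the mitigation count to rescue level mapping. */
final class RescueLadderSketch {
    static final int LEVEL_NONE = 0;
    static final int LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS = 1;
    static final int LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES = 2;
    static final int LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS = 3;
    static final int LEVEL_WARM_REBOOT = 4;   // assumed numeric value
    static final int LEVEL_FACTORY_RESET = 5; // assumed numeric value

    /** Models the property check: true caps the ladder below the reboot levels. */
    static int maxRescueLevel(boolean factoryResetDisabled) {
        return factoryResetDisabled ? LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS : LEVEL_FACTORY_RESET;
    }

    static int rescueLevel(int mitigationCount, boolean factoryResetDisabled) {
        int max = maxRescueLevel(factoryResetDisabled);
        if (mitigationCount <= 0) {
            return LEVEL_NONE;
        }
        if (mitigationCount <= 3) {
            return mitigationCount; // counts 1..3 map one to one onto the settings resets
        }
        if (mitigationCount == 4) {
            return Math.min(max, LEVEL_WARM_REBOOT);
        }
        return Math.min(max, LEVEL_FACTORY_RESET);
    }

    /** Levels from TRUSTED_DEFAULTS upward are the ones classified as high impact. */
    static boolean isHighImpact(int level) {
        return level >= LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS;
    }

    public static void main(String[] args) {
        for (boolean disabled : new boolean[] {false, true}) {
            for (int count = 1; count <= 5; count++) {
                int level = rescueLevel(count, disabled);
                System.out.println("factoryResetDisabled=" + disabled + " count=" + count
                        + " level=" + level + " highImpact=" + isHighImpact(level));
            }
        }
    }
}
```

The case from section 1 corresponds to the fifth mitigation with the property unset, which is the only combination that reaches LEVEL_FACTORY_RESET.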
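3.3 Mitigation: PackageHealthObserver.execute()
- RollbackPackageHealthObserver.execute()
- RescuePartyObserver.execute()
3.3.1 RollbackPackageHealthObserver
- For a native crash, every available rollback is applied, which ends up in RollbackManagerService. That service is not explored here (TODO).
- For other failures, the rollback available for the package is applied. This also goes through that service and is skipped for now.
// RollbackPackageHealthObserver.java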
@Override
public boolean execute(@Nullable VersionedPackage failedPackage,
@FailureReasons int rollbackReason, int mitigationCount) {
if (rollbackReason == PackageWatchdog.FAILURE_REASON_NATIVE_CRASH) {
mHandler.post(() -> rollbackAll());
return true;
}
RollbackInfo rollback = getAvailableRollback(failedPackage);
//...
mHandler.post(() -> rollbackPackage(rollback, failedPackage, rollbackReason));
// Assume rollback executed successfully
return true;
}
3.3.2 RescuePartyObserver.execute()
- getRescueLevel derives the rescue level from the mitigation count, and execution ends up in executeRescueLevelInternal. The call chain is:
execute
---executeRescueLevel
------executeRescueLevelInternal
The detailed flow:
@Override
public boolean execute(@Nullable VersionedPackage failedPackage,
@FailureReasons int failureReason, int mitigationCount) {
//...
if (failureReason == PackageWatchdog.FAILURE_REASON_APP_CRASH
|| failureReason == PackageWatchdog.FAILURE_REASON_APP_NOT_RESPONDING) {
final int level = getRescueLevel(mitigationCount);
executeRescueLevel(mContext,
failedPackage == null ? null : failedPackage.getPackageName(), level);
return true;
} else {
return false;
}
}
//----------
private static void executeRescueLevel(Context context, @Nullable String failedPackage,
int level) {
Slog.w(TAG, "Attempting rescue level " + levelToString(level));
try {
executeRescueLevelInternal(context, level, failedPackage);
//...
}
//---------
private static void executeRescueLevelInternal(Context context, int level, @Nullable
String failedPackage) throws Exception {
FrameworkStatsLog.write(FrameworkStatsLog.RESCUE_PARTY_RESET_REPORTED, level);
// Try our best to reset all settings possible, and once finished
// rethrow any exception that we encountered
Exception res = null;
Runnable runnable;
Thread thread;
switch (level) {
case LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS:
//... unrelated exception handling omitted
resetAllSettingsIfNecessary(context, Settings.RESET_MODE_UNTRUSTED_DEFAULTS,
level);
resetDeviceConfig(context, /*isScoped=*/true, failedPackage);
break;
case LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES:
resetAllSettingsIfNecessary(context, Settings.RESET_MODE_UNTRUSTED_CHANGES,
level);
resetDeviceConfig(context, /*isScoped=*/true, failedPackage);
break;
case LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS:
resetAllSettingsIfNecessary(context, Settings.RESET_MODE_TRUSTED_DEFAULTS,
level);
resetDeviceConfig(context, /*isScoped=*/false, failedPackage);
break;
case LEVEL_WARM_REBOOT:
// Request the reboot from a separate thread to avoid deadlock on PackageWatchdog
// when device shutting down.
SystemProperties.set(PROP_ATTEMPTING_REBOOT, "true");
runnable = () -> {
try {
PowerManager pm = context.getSystemService(PowerManager.class);
if (pm != null) {
pm.reboot(TAG);
}
//... exception handling omitted
};
thread = new Thread(runnable);
thread.start();
break;
case LEVEL_FACTORY_RESET:
SystemProperties.set(PROP_ATTEMPTING_FACTORY_RESET, "true");
runnable = new Runnable() {
@Override
public void run() {
//...
RecoverySystem.rebootPromptAndWipeUserData(context, TAG);
//.. exception handling
};
thread = new Thread(runnable);
thread.start();
break;
//... exception handling omitted
}
The code above takes roughly two directions, which are covered in section 3.4 and section 5.
3.4 How executeRescueLevelInternal handles each level
3.4.1 The three LEVEL_RESET_SETTINGS_XXX levels
- 1: LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS
- 2: LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES
- 3: LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS
As the LEVEL_RESET_SETTINGS_ prefix suggests, these levels reset values in the Settings database.
Two flows are called:
- resetAllSettingsIfNecessary(): resets the settings values related to the failing package
- resetDeviceConfig(): resets device configuration
In other words, settings values added by the failing package are reset, and device configuration changed by it is restored to the defaults (list of properties: TODO).
3.4.1.1 The call chain below resetAllSettingsIfNecessary:
RescueParty.resetAllSettingsIfNecessary()
--->Settings.Global.resetToDefaultsAsUser(…)
------>SettingsProvider.resetGlobalSetting() and resetSecureSetting()
--------->getResetModeEnforcingPermission // chosen according to the rescue level
--------->SettingsProvider.mutateSecureSetting() or mutateGlobalSetting()
------------>SettingsRegistry.resetSettingsLocked(SETTINGS_TYPE_SECURE,…)
------------>branches on the three reset modes
--------------->notifyForSettingsChange(key, name) // the settings value is changed
In detail, the handling depends on the settings type passed in:
GlobalSetting
ends up in mutateGlobalSetting(), which resets the Global settings values related to the package.
SecureSetting
ends up in mutateSecureSetting(), which resets the Secure settings values related to the package.
mutateGlobalSetting() and mutateSecureSetting() both end in mSettingsRegistry.resetSettingsLocked(SETTINGS_TYPE_SECURE,..), which does the actual work. Inside, the function distinguishes the three modes (LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS and so on), but the goal is always to reset the Global or Secure values related to the failing package.
3.4.1.2 The call chain below resetDeviceConfig:
A first look shows that these calls work on namespaces. The individual namespace definitions are in DeviceConfig.java; they cover system-related values stored in the database, that is, device configuration parameters.
Call chain:
resetDeviceConfig
--->resetAllAffectedNamespaces(…) or performScopedReset(…)
------>DeviceConfig.resetToDefaults(DEVICE_CONFIG_RESET_MODE, namespace)
The flow is as simple as it looks.
3.4.2 LEVEL_WARM_REBOOT: the reboot flow
- Sets the PROP_ATTEMPTING_REBOOT property (sys.attempting_reboot) to true
- Calls PowerManager.reboot(), which enters the reboot flow
SystemProperties.set(PROP_ATTEMPTING_REBOOT, "true")
PowerManager pm = context.getSystemService(PowerManager.class);
pm.reboot(TAG);
3.4.3 LEVEL_FACTORY_RESET: reboot and wipe user data
RecoverySystem.rebootPromptAndWipeUserData(context, TAG);
Code path:
RecoverySystem.rebootPromptAndWipeUserData(…)
This path also ends in the pm.reboot() flow above.
What matters here is how rebootPromptAndWipeUserData leads to a factory reset.
It causes a reboot; the reboot flow is as follows.
4. The reboot triggered by PowerManagerService
Log markers:
Numbers 2, 5, 6 and 7 refer to the steps in section 4.2 (PMS reboot call flow).
2. Logs "Binder shutdown checkpoint recorded with pid=".
5. Logs "ShutdownThread Notifying thread to start shutdown longPressBehavior=" and shows the reboot dialog.
6. Does initialization such as acquiring wake locks.
7. The actual reboot:
1. Writes the reboot reason to sys.shutdown.requested; the leading digit, 0 or 1, says whether this is a reboot.
2. Dumps reboot-related log information. Log: Logging pre-reboot information...
3. Sends the shutdown broadcast Intent.ACTION_SHUTDOWN.
4. After the broadcast is handled, shuts down AMS with am.shutdown(MAX_BROADCAST_TIME). Log: Shutting down activity manager...
5. Shuts down other services: PMS, radios.
- Logs: "Rebooting, reason:" and "Performing low-level shutdown..."
4.2 PMS reboot call flow
The call flow:
1.PowerManagerService.reboot(boolean confirm /* show the reboot dialog? */, @Nullable String reason /* reboot reason */, boolean wait /* return only after the reboot completes? */)
--->2.ShutdownCheckPoints.recordCheckPoint(Binder.getCallingPid(), reason) // notes the calling process and stores reason in ShutdownCheckPoints.mCheckPoints
--->3.shutdownOrRebootInternal(HALT_MODE_REBOOT, confirm, reason, wait)
------>4.ShutdownThread.reboot(getUiContext(), reason, confirm)
--------->5.shutdownInner(context, confirm)
------------>6.beginShutdownSequence()
--------------->7.ShutdownThread.run()
------------------>8.rebootOrShutdown(mContext, mReboot, mReason)
--------------------->9.PowerManagerService.lowLevelReboot(reason)
------------------------>SystemProperties.set("sys.powerctl", "reboot," + reason)
--------------------->10.PowerManagerService.lowLevelShutdown(reason)
------------------------>SystemProperties.set("sys.powerctl", "shutdown," + reason)
The code:
public void run() {
//...
String reason = (mReboot ? "1" : "0") + (mReason != null ? mReason : "");
SystemProperties.set(SHUTDOWN_ACTION_PROPERTY, reason);
//...
if (mRebootSafeMode) {
SystemProperties.set(REBOOT_SAFEMODE_PROPERTY, "1");
}
//...
Slog.i(TAG, "Logging pre-reboot information...");
PreRebootLogger.log(mContext);
//...
Log.i(TAG, "Sending shutdown broadcast...");
// Send the broadcast
mActionDone = false;
Intent intent = new Intent(Intent.ACTION_SHUTDOWN);
intent.addFlags(Intent.FLAG_RECEIVER_FOREGROUND | Intent.FLAG_RECEIVER_REGISTERED_ONLY);
mContext.sendOrderedBroadcastAsUser(intent,
UserHandle.ALL, null, br, mHandler, 0, null, null);
//... handle a timeout of the shutdown broadcast
//... shut down AMS
am.shutdown(MAX_BROADCAST_TIME);
//... shut down PMS
pm.shutdown();
//...
uncrypt();
//...
// Remaining work will be done by init, including vold shutdown
rebootOrShutdown(mContext, mReboot, mReason);
}
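Two property values are composed along this path: the one written to sys.shutdown.requested in run(), and the one written to sys.powerctl in step 9 or 10 of the call flow. The sketch below only reproduces the string concatenation shown above; it ignores any special-casing the real lowLevelReboot() applies to particular reasons, and the class name is hypothetical.

```java
/** String composition only; a simplified model of the two property values. */
final class ShutdownPropsSketch {
    /** Value for sys.shutdown.requested: "1" or "0" followed by the reason. */
    static String shutdownRequested(boolean reboot, String reason) {
        return (reboot ? "1" : "0") + (reason != null ? reason : "");
    }

    /** Value for sys.powerctl: "reboot," or "shutdown," followed by the reason. */
    static String powerctl(boolean reboot, String reason) {
        return (reboot ? "reboot," : "shutdown,") + (reason != null ? reason : "");
    }
}
```

For a reboot with the reason "recovery" (used here purely as an example value), the two strings are "1recovery" and "reboot,recovery".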
4.2.1 How the property service handles sys.powerctl:
handle_property_set_fd() property_service.cpp
--->HandlePropertySet()
------>PropertySet()
--------->PropertyChanged() init.cpp
------------>ShutdownState.TriggerShutdown()
--------------->SecondStageMain
------------------>HandlePowerctlMessage()
--------------------->DoReboot() reboot.cpp
------------------------>RebootSystem()
DoReboot() does the following:
- Logs the reboot: Reboot start, reason: XXX, reboot_target: XXXX
- Handles the shutdown timeout. Log: Shutdown timeout: XXms
- Stops services, with some exceptions
- Turns off the backlight and cleans up activities
- Shutdown animation: this mainly brings the boot animation back up. Running adb shell setprop service.bootanim.exit 0 and adb shell setprop service.bootanim.progress 0, followed by adb shell setprop ctl.start bootanim, starts the boot animation, and adb shell setprop ctl.stop bootanim stops it; the principle here is similar.
- StopServicesAndLogViolations stops the services and handles the timeout
static void DoReboot(unsigned int cmd, const std::string& reason, const std::string& reboot_target,
bool run_fsck) {
Timer t;
LOG(INFO) << "Reboot start, reason: " << reason << ", reboot_target: " << reboot_target;
bool is_thermal_shutdown = cmd == ANDROID_RB_THERMOFF;
//... shutdown timeout handling
//..
// Stop services
const std::set<std::string> to_starts{"watchdogd"};
std::set<std::string> stop_first;
for (const auto& s : ServiceList::GetInstance()) {
if (kDebuggingServices.count(s->name())) {
// keep debugging tools until non critical ones are all gone.
s->SetShutdownCritical();
} else if (to_starts.count(s->name())) {
if (auto result = s->Start(); !result.ok()) {
LOG(ERROR) << "Could not start shutdown 'to_start' service '" << s->name()
<< "': " << result.error();
}
s->SetShutdownCritical();
} else if (s->IsShutdownCritical()) {
// Start shutdown critical service if not started.
if (auto result = s->Start(); !result.ok()) {
LOG(ERROR) << "Could not start shutdown critical service '" << s->name()
<< "': " << result.error();
}
} else {
stop_first.insert(s->name());
}
}
// Turn off the backlight and clean up activities
if (cmd == ANDROID_RB_POWEROFF || is_thermal_shutdown) {
TurnOffBacklight();
}
// The shutdown animation
Service* boot_anim = ServiceList::GetInstance().FindService("bootanim");
Service* surface_flinger = ServiceList::GetInstance().FindService("surfaceflinger");
if (boot_anim != nullptr && surface_flinger != nullptr && surface_flinger->IsRunning()) {
bool do_shutdown_animation = GetBoolProperty("ro.init.shutdown_animation", false);
if (do_shutdown_animation) {
SetProperty("service.bootanim.exit", "0");
SetProperty("service.bootanim.progress", "0");
// Could be in the middle of animation. Stop and start so that it can pick
// up the right mode.
boot_anim->Stop();
}
for (const auto& service : ServiceList::GetInstance()) {
if (service->classnames().count("animation") == 0) {
continue;
}
// start all animation classes if stopped.
if (do_shutdown_animation) {
service->Start();
}
service->SetShutdownCritical(); // will not check animation class separately
}
if (do_shutdown_animation) {
boot_anim->Start();
surface_flinger->SetShutdownCritical();
boot_anim->SetShutdownCritical();
}
}
// Shutdown steps
// 1. terminate all services except shutdown critical ones. wait for delay to finish
if (shutdown_timeout > 0ms) {
StopServicesAndLogViolations(stop_first, shutdown_timeout / 2, true /* SIGTERM */);
}
// Send SIGKILL to ones that didn't terminate cleanly.
StopServicesAndLogViolations(stop_first, 0ms, false /* SIGKILL */);
SubcontextTerminate();
// Reap subcontext pids.
ReapAnyOutstandingChildren();
// 3. send volume abort_fuse and volume shutdown to vold
Service* vold_service = ServiceList::GetInstance().FindService("vold");
if (vold_service != nullptr && vold_service->IsRunning()) {
// Manually abort FUSE connections, since the FUSE daemon is already dead
// at this point, and unmounting it might hang.
CallVdc("volume", "abort_fuse");
CallVdc("volume", "shutdown");
vold_service->Stop();
} else {
LOG(INFO) << "vold not running, skipping vold shutdown";
}
// logcat stopped here
StopServices(kDebuggingServices, 0ms, false /* SIGKILL */);
// 4. sync, try umount, and optionally run fsck for user shutdown
{
Timer sync_timer;
LOG(INFO) << "sync() before umount...";
sync();
LOG(INFO) << "sync() before umount took" << sync_timer;
}
// 5. drop caches and disable zram backing device, if exist
KillZramBackingDevice();
LOG(INFO) << "Ready to unmount apexes. So far shutdown sequence took " << t;
// 6. unmount active apexes, otherwise they might prevent clean unmount of /data.
if (auto ret = UnmountAllApexes(); !ret.ok()) {
LOG(ERROR) << ret.error();
}
UmountStat stat =
TryUmountAndFsck(cmd, run_fsck, shutdown_timeout - t.duration(), &reboot_semaphore);
// Follow what linux shutdown is doing: one more sync with little bit delay
{
Timer sync_timer;
LOG(INFO) << "sync() after umount...";
sync();
LOG(INFO) << "sync() after umount took" << sync_timer;
}
if (!is_thermal_shutdown) std::this_thread::sleep_for(100ms);
LogShutdownTime(stat, &t);
// Send signal to terminate reboot monitor thread.
reboot_monitor_run = false;
sem_post(&reboot_semaphore);
// Reboot regardless of umount status. If umount fails, fsck after reboot will fix it.
RebootSystem(cmd, reboot_target);
abort();
}
5. rebootPromptAndWipeUserData()
This call takes the device into Recovery mode. The code flow is as follows:
---RecoverySystem.rebootPromptAndWipeUserData(context, TAG)
------1. (if vold reports that checkpointing is active) vold.abortChanges("rescueparty", false)
---------VoldNativeService.abortChanges()
------------Checkpoint.cp_abortChanges()
---------------abort_metadata_file()
---------------android_reboot(ANDROID_RB_RESTART2, 0, message.c_str())
------------------android_reboot.android_reboot() // sets sys.powerctl to reboot, which completes the restart
------or 2. (otherwise) RecoverySystem.bootCommand(context, null, "--prompt_and_wipe_data", reasonArg, localeArg)
---------RecoverySystemService.rebootRecoveryWithCommand()
------------setupOrClearBcb(true, command)
---------------mInjector.systemPropertiesSet("ctl.start", "setup-bcb") // ctl.start setup-bcb asks init to start the service and create its socket
---------------socket.sendCommand() // sends the command assembled from "--prompt_and_wipe_data", reasonArg and localeArg to the uncrypt service over the socket
// bootable/recovery/uncrypt/uncrypt.rc
service setup-bcb /system/bin/uncrypt --setup-bcb
    class main
    socket uncrypt stream 600 system system
    disabled
    oneshot
Here we can see that RecoverySystemService in system_server and the uncrypt process talk over a socket. The payload is the string --prompt_and_wipe_data plus the other arguments, and the setup-bcb service is responsible for handling it.
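The exchange can be pictured as a length-prefixed message. The sketch below is an illustration under assumptions, not the framework code: BuildCommand and EncodeFrame are hypothetical helper names, and it assumes that every argument is terminated by "\n" and that the payload is preceded by a 4-byte big-endian length. Those assumptions are at least consistent with the byte count that uncrypt prints in the case-one log, "(59)".

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: joins the non-empty arguments, ending each with '\n'.
std::string BuildCommand(const std::vector<std::string>& args) {
    std::string command;
    for (const auto& arg : args) {
        if (arg.empty()) continue;  // an absent reasonArg/localeArg is skipped
        command += arg;
        command += '\n';
    }
    return command;
}

// Hypothetical helper: prefixes the payload with its length as 4 big-endian
// bytes (assumed wire format).
std::vector<uint8_t> EncodeFrame(const std::string& payload) {
    const uint32_t n = static_cast<uint32_t>(payload.size());
    std::vector<uint8_t> frame = {
        static_cast<uint8_t>(n >> 24), static_cast<uint8_t>(n >> 16),
        static_cast<uint8_t>(n >> 8), static_cast<uint8_t>(n)};
    frame.insert(frame.end(), payload.begin(), payload.end());
    return frame;
}
```

Calling BuildCommand with "--prompt_and_wipe_data", "--reason=RescueParty" and "--locale=en_IN" yields a 59-byte payload, which is the size reported in the uncrypt log line of case one.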
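5.1 uncrypt.cpp
For a detailed walkthrough of uncrypt.cpp, see the link in the references.
Here we only look at part of what it does.
uncrypt.cpp provides three functions: SETUP_BCB, CLEAR_BCB and UNCRYPT. Since we want to understand how the device enters Recovery mode, SETUP_BCB is the one that matters.
So what is the BCB? It is introduced below.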
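5.1.1 Bootloader Control Block (BCB)
Official comment: a block stored in flash that recovery and the bootloader use to talk to each other.
BCB stands for Bootloader Control Block. On a normal boot, if the user does not press any key combination, the bootloader reads the boot control block (BCB) from the misc partition. The BCB is a struct that holds the boot command, and depending on that command the system enters one of three boot modes.
The comments on the bootloader_message struct explain the details: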
struct bootloader_message {
    char command[32];
    char status[32];
    char recovery[768];
    char stage[32];
    char reserved[1184];
};
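- The command field stores the boot command. Its possible values are:
- boot-recovery: the system boots into Recovery mode.
- update-radio or update-hboot: the system boots into firmware update mode; the update itself is performed by the bootloader.
- NULL (empty): the system boots into the main system, i.e. a normal boot.
- The status field stores the result of an update. When the update finishes, Recovery or the bootloader writes the result into this field.
- The recovery field holds the startup arguments for the recovery module, typically including the path of the update package. Its layout is: the first line is the string "recovery", and the following lines hold arguments such as "--update_package=/mnt/sdcard/update.zip". The arguments are therefore separated by "\n".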
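To make the field layout concrete, here is a minimal sketch that fills an in-memory copy of the struct for a boot into recovery. BootloaderMessage, CopyField and FillForRecovery are names made up for this illustration; the sketch only builds the struct in memory and does not touch the misc partition.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Local copy of the layout quoted above, for illustration only.
struct BootloaderMessage {
    char command[32];
    char status[32];
    char recovery[768];
    char stage[32];
    char reserved[1184];
};
static_assert(sizeof(BootloaderMessage) == 2048, "the five fields add up to 2 KiB");

// Hypothetical helper: copies src into a fixed-size field, truncating if
// needed and always leaving the field NUL-terminated.
template <size_t N>
void CopyField(char (&dst)[N], const std::string& src) {
    std::memset(dst, 0, N);
    std::memcpy(dst, src.data(), src.size() < N - 1 ? src.size() : N - 1);
}

// Fills command and recovery as the field description suggests:
// "boot-recovery" as the command, "recovery" on the first line of the
// recovery field, then one argument per line.
void FillForRecovery(BootloaderMessage* msg, const std::vector<std::string>& options) {
    CopyField(msg->command, "boot-recovery");
    std::string recovery = "recovery\n";
    for (const auto& opt : options) {
        if (opt.empty()) continue;
        recovery += opt;
        if (opt.back() != '\n') recovery += '\n';
    }
    CopyField(msg->recovery, recovery);
}
```

The static_assert documents a useful sanity check: 32 + 32 + 768 + 32 + 1184 bytes is exactly 2048 bytes.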
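This touches on how the bootloader starts the system, so here is a simplified summary:
When the phone boots without any key pressed, the bootloader reads the BCB, and the command stored in it decides which state the system boots into. Counting the case where command is empty, there are three modes:
First: command boot-recovery, the device enters Recovery mode.
Second: command update-radio or update-hboot, the device enters firmware update mode.
Third: command empty (null), the device follows the normal boot flow.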
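The three cases map naturally onto a small decision function. This is only a sketch of the rule described above: BootMode and SelectBootMode are hypothetical names, real bootloaders are vendor-specific, and the fallback for unknown commands is an assumption.

```cpp
#include <cstring>

enum class BootMode { kNormal, kRecovery, kFirmwareUpdate };

// Hypothetical decision function that mirrors the three cases in the text.
BootMode SelectBootMode(const char* command) {
    // Empty command: normal boot into the main system.
    if (command == nullptr || command[0] == '\0') return BootMode::kNormal;
    if (std::strcmp(command, "boot-recovery") == 0) return BootMode::kRecovery;
    if (std::strcmp(command, "update-radio") == 0 ||
        std::strcmp(command, "update-hboot") == 0) {
        return BootMode::kFirmwareUpdate;
    }
    // Assumption: anything unrecognized is treated like a normal boot.
    return BootMode::kNormal;
}
```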
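5.2 SETUP_BCB flow
uncrypt.cpp
---uncrypt.main()
------setup_bcb(socket_fd)
---------write_bootloader_message(options, &err)
---------write_wipe_package(wipe_package, &err)
The last two write calls store the assembled command and recovery information in the BCB. After the reboot, once init is running, it starts the recovery process (service). Recovery then calls get_bootloader_message() to read the BCB and decides what to do, for example whether to install an update package or whether to wipe the data and cache partitions. When it is done it calls finish_recovery(), and only then does the rest of the boot flow continue.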
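On the reading side, the recovery field has to be split back into individual arguments. The following is a rough sketch of that step, assuming the "recovery" first line and "\n" separators described in 5.1.1; ParseRecoveryField and HasArg are hypothetical helpers, not functions from the recovery source.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the arguments stored after the "recovery"
// line, or an empty list if the field does not start with that line.
std::vector<std::string> ParseRecoveryField(const std::string& field) {
    std::vector<std::string> args;
    std::istringstream lines(field);
    std::string line;
    if (!std::getline(lines, line) || line != "recovery") return args;
    while (std::getline(lines, line)) {
        if (!line.empty()) args.push_back(line);  // skip blank lines
    }
    return args;
}

// Hypothetical helper: true if the exact argument is present.
bool HasArg(const std::vector<std::string>& args, const std::string& wanted) {
    for (const auto& arg : args) {
        if (arg == wanted) return true;
    }
    return false;
}
```

With the RescueParty command from case one, HasArg(args, "--prompt_and_wipe_data") is what would tell such a parser to show the wipe prompt.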
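6. Conclusion
A reboot into recovery initiated by RescueParty is triggered by either of the following conditions:
- system_server restarts more than 5 times within 5 minutes.
- A persistent system app crashes more than 5 times within 30 seconds.
This kind of failure prints REASON:Rescueparty at the bottom of the Recovery screen.
To locate the root cause you need the logs captured before the device entered recovery. The problem lies in the upper Android layers (framework or apps).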
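One way to reason about thresholds of the form "more than N times within T" is a sliding-window counter. The class below is purely illustrative and is not how RescueParty is implemented; CrashWindow and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Illustrative only: counts events that fall inside a sliding time window.
class CrashWindow {
  public:
    CrashWindow(int64_t window_ms, size_t threshold)
        : window_ms_(window_ms), threshold_(threshold) {}

    // Records a crash at now_ms and returns true once more than `threshold`
    // crashes lie within the last `window_ms` milliseconds.
    bool Record(int64_t now_ms) {
        times_.push_back(now_ms);
        // Drop crashes that are older than the window.
        while (!times_.empty() && now_ms - times_.front() > window_ms_) {
            times_.pop_front();
        }
        return times_.size() > threshold_;
    }

  private:
    int64_t window_ms_;
    size_t threshold_;
    std::deque<int64_t> times_;
};
```

With a 30 s window and a threshold of 5, crashes arriving every 2 s trip the check on the sixth crash, while crashes 10 s apart never do.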
6.1 How to confirm
Search the log for the following strings:
am_process_crashed_too_much
PackageManager: Finished rescue level FACTORY_RESET
uncrypt :
ShutdownThread:
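The search itself is easy to automate. The sketch below checks a set of log lines for the markers listed above; kMarkers and FindMarkers are hypothetical names, and a plain substring match is assumed to be good enough for triage.

```cpp
#include <array>
#include <string>
#include <vector>

// Markers taken from the list above (tag punctuation dropped so that a
// substring match works regardless of logcat column padding).
const std::array<const char*, 4> kMarkers = {
    "am_process_crashed_too_much",
    "Finished rescue level FACTORY_RESET",
    "uncrypt",
    "ShutdownThread"};

// Hypothetical helper: returns the markers found in at least one log line,
// in the order they appear in kMarkers.
std::vector<std::string> FindMarkers(const std::vector<std::string>& log_lines) {
    std::vector<std::string> found;
    for (const char* marker : kMarkers) {
        for (const auto& line : log_lines) {
            if (line.find(marker) != std::string::npos) {
                found.emplace_back(marker);
                break;  // one hit per marker is enough
            }
        }
    }
    return found;
}
```

If all four markers are reported for the same time range, the reboot was very likely started by RescueParty, and the crashing package named in the FACTORY_RESET line is the place to look next.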
7. Other examples
SystemUI crashes repeatedly with fatal exceptions, which causes a reboot into recovery.
06-10 14:07:20.994 976 1459 W ActivityManager: Process com.android.systemui has crashed too many times, killing! Reason: crashed quickly
06-10 14:07:20.994 976 1459 I am_process_crashed_too_much: [0,com.android.systemui,10143]
06-10 14:07:21.113 976 1043 I rescue_success: 5
06-10 14:07:21.118 976 1043 D PackageManager: Finished rescue level FACTORY_RESET for package com.android.systemui
06-10 14:07:21.119 976 1043 I pm_critical_info: Finished rescue level FACTORY_RESET for package com.android.systemui
06-10 14:07:22.142 5080 5080 I uncrypt : received command: [--prompt_and_wipe_data
06-10 14:07:22.142 5080 5080 I uncrypt : --reason=RescueParty
06-10 14:07:22.142 5080 5080 I uncrypt : --locale=zh_CN_#Hans
06-10 14:07:22.142 5080 5080 I uncrypt : ] (65)
06-10 14:07:22.143 5194 5194 E c.silent.reboo: Not starting debugger since process cannot load the jdwp agent.
06-10 14:07:22.146 5080 5080 I uncrypt : received 0, exiting now
06-10 14:07:22.150 976 5074 I RecoverySystemService: uncrypt setup bcb successfully finished.
06-10 14:07:22.163 976 5074 V ShutdownCheckPoints: Binder shutdown checkpoint recorded with pid=976
06-10 14:07:22.163 976 5074 D PowerManagerService: reboot the device , UID : 1000 , PID : 976 , reason : recovery , confirm = false , wait = true
06-10 14:07:22.169 976 1035 D ShutdownThread: reboot reason : recovery
06-10 14:07:22.171 976 1035 D ShutdownThread: Notifying thread to start shutdown longPressBehavior=1
06-10 14:07:22.176 976 1035 D ShutdownThread: Attempting to use SysUI shutdown UI
06-10 14:07:22.177 976 1035 D ShutdownThread: SysUI is unavailable
06-10 14:07:22.261 976 5218 I ShutdownThread: Logging pre-reboot information...
06-10 14:07:22.263 976 5218 I ShutdownThread: Sending shutdown broadcast...
06-10 14:07:22.853 976 5218 I ShutdownThread: Shutting down Bluetooth
06-10 14:07:22.855 976 5218 I ShutdownThread: Shutting down activity manager...
06-10 14:07:23.468 976 5218 I ShutdownThread: Shutting down package manager...
------------------- Scrolling further back in the log shows the fatal exception
06-10 14:07:20.960 4787 4787 D AndroidRuntime: Shutting down VM
06-10 14:07:20.962 4787 4787 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:20.962 4787 4787 E AndroidRuntime: Process: com.android.systemui, PID: 4787
06-10 14:07:20.962 4787 4787 E AndroidRuntime: java.lang.NoClassDefFoundError: Failed resolution of: Lokhttp3/OkHttpClient$Builder;
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.squareup.picasso.OkHttp3Downloader.<init>(OkHttp3Downloader.java:71)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.squareup.picasso.OkHttp3Downloader.<init>(OkHttp3Downloader.java:50)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.squareup.picasso.OkHttp3Downloader.<init>(OkHttp3Downloader.java:40)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.squareup.picasso.Picasso$Builder.build(Picasso.java:848)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.squareup.picasso.Picasso.get(Picasso.java:683)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.statusbar.NotificationMediaManager.finishUpdateMediaMetaData(NotificationMediaManager.java:910)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.statusbar.NotificationMediaManager.updateMediaMetaData(NotificationMediaManager.java:832)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.statusbar.NotificationMediaManager.screenTurnedOff(NotificationMediaManager.java:1395)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.statusbar.phone.StatusBar$16.onReceive(StatusBar.java:3432)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.statusbar.phone.StatusBar.makeStatusBarView(StatusBar.java:1622)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.statusbar.phone.StatusBar.createAndAddWindows(StatusBar.java:3169)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.statusbar.phone.StatusBar.start(StatusBar.java:1099)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.SystemUIApplication.startServicesIfNeeded(SystemUIApplication.java:228)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.SystemUIApplication.startServicesIfNeeded(SystemUIApplication.java:167)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.systemui.keyguard.KeyguardService.onCreate(KeyguardService.java:120)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at android.app.ActivityThread.handleCreateService(ActivityThread.java:4547)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at android.app.ActivityThread.access$1700(ActivityThread.java:273)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2119)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:106)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at android.os.Looper.loopOnce(Looper.java:201)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at android.os.Looper.loop(Looper.java:288)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:7981)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:553)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1004)
06-10 14:07:20.962 4787 4787 E AndroidRuntime: Caused by: java.lang.ClassNotFoundException: okhttp3.OkHttpClient$Builder
06-10 14:07:20.962 4787 4787 E AndroidRuntime: ... 25 more
com.android.systemui hit a fatal exception every two to three seconds; the nine crashes listed below all fall within 20 s:
06-10 14:07:01.110 1208 1208 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:04.100 3510 3510 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:06.447 3717 3717 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:08.578 3845 3845 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:11.035 3935 3935 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:13.319 4127 4127 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:15.668 4334 4334 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:18.194 4584 4584 E AndroidRuntime: FATAL EXCEPTION: main
06-10 14:07:20.962 4787 4787 E AndroidRuntime: FATAL EXCEPTION: main
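The spacing of these crashes can be checked by converting the logcat time of day to milliseconds. ParseTimeOfDayMs below is a hypothetical helper written for this article; it assumes the "HH:MM:SS.mmm" format with exactly three fractional digits, as in the lines above, and ignores the date.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: converts "HH:MM:SS.mmm" to milliseconds since
// midnight, or returns -1 if the text does not have that shape.
long ParseTimeOfDayMs(const std::string& text) {
    int h = 0, m = 0, s = 0, ms = 0;
    if (std::sscanf(text.c_str(), "%d:%d:%d.%d", &h, &m, &s, &ms) != 4) {
        return -1;
    }
    return ((h * 60L + m) * 60L + s) * 1000L + ms;
}
```

For the first and last entries, ParseTimeOfDayMs("14:07:20.962") - ParseTimeOfDayMs("14:07:01.110") gives 19852 ms, so the nine crashes are spread over just under 20 s, about one every 2.5 s.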
References
BCB
uncrypt flow