Table of Contents
- 1. Background
- 2. Crash Handler Registration on the App Side
- 2.1 commonInit()
- 2.2 The KillApplicationHandler Class
- 2.2.1 ensureLogging()
- 2.2.2 ApplicationErrorReport
- 3. Crash Handling on the AMS Side
- 3.1 AMS.handleApplicationCrash
- 3.1.1 AMS.handleApplicationCrashInner()
- 3.2 addErrorToDropBox()
- 3.3 AppErrors.crashApplication()
- 3.3.1 AppErrors.crashApplicationInner()
- 3.4 makeAppCrashingLocked()
- 3.4.1 ProcessRecord.startAppProblemLocked()
- 3.4.2 WindowProcessController.stopFreezingActivities()
- 3.4.3 handleAppCrashLocked()
- 3.5 Summary
- 4. mUiHandler Sends SHOW_ERROR_UI_MSG
- 4.1 handleShowAppErrorUi()
- 4.2 Process.killProcess()
- 5. Binder Death Notification
- 5.1 AMS.attachApplicationLocked()
- 5.2 AppDeathRecipient.binderDied()
- 5.2.1 appDiedLocked()
- 5.2.2 handleAppDiedLocked()
- 5.3 cleanUpApplicationRecordLocked()
- 5.4 atms.handleAppDied()
- Summary
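1. Background
When a native crash occurs on Android, the system prints the crash to the log and generates a tombstone_xxx file, then notifies AMS over a socket, which brings the crash into the Java crash handling flow. Likewise, when a Java crash occurs, the system catches it and enters the same flow.
The Java crash handling flow is therefore the common path for both kinds of crash. The previous article analyzed the native crash flow; this one walks through the Java crash flow.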
2. Crash Handler Registration on the App Side
Both system processes and app processes pass through this code when they start.
2.1 commonInit()
RuntimeInit.java
@UnsupportedAppUsage
protected static final void commonInit() {
    if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");
    LoggingHandler loggingHandler = new LoggingHandler();
    RuntimeHooks.setUncaughtExceptionPreHandler(loggingHandler);
    // Register the default handler for uncaught exceptions
    Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
}
This registers KillApplicationHandler, the handler that kills the app process, as the default handler for uncaught exceptions.
2.2 The KillApplicationHandler Class
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
private final LoggingHandler mLoggingHandler;
/**
* Create a new KillApplicationHandler that follows the given LoggingHandler.
* If {@link #uncaughtException(Thread, Throwable) uncaughtException} is called
* on the created instance without {@code loggingHandler} having been triggered,
* {@link LoggingHandler#uncaughtException(Thread, Throwable)
* loggingHandler.uncaughtException} will be called first.
*
* @param loggingHandler the {@link LoggingHandler} expected to have run before
* this instance's {@link #uncaughtException(Thread, Throwable) uncaughtException}
* is being called.
*/
public KillApplicationHandler(LoggingHandler loggingHandler) {
this.mLoggingHandler = Objects.requireNonNull(loggingHandler);
}
@Override
public void uncaughtException(Thread t, Throwable e) {
try {
// Print the crash to the log
ensureLogging(t, e);
// Don't re-enter -- avoid infinite loops if crash-reporting crashes.
if (mCrashing) return;
mCrashing = true;
// Try to end profiling. If a profiler is running at this point, and we kill the
// process (below), the in-memory buffer will be lost. So try to stop, which will
// flush the buffer. (This makes method trace profiling useful to debug crashes.)
if (ActivityThread.currentActivityThread() != null) {
ActivityThread.currentActivityThread().stopProfiling();
}
// Bring up crash dialog, wait for it to be dismissed
// Report the crash to AMS, which shows the crash dialog
ActivityManager.getService().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
} catch (Throwable t2) {
if (t2 instanceof DeadObjectException) {
// System process is dead; ignore
} else {
try {
Clog_e(TAG, "Error reporting crash", t2);
} catch (Throwable t3) {
// Even Clog_e() fails! Oh well.
}
}
} finally {
// Try everything to make sure this process goes away.
// Finally, kill the process
Process.killProcess(Process.myPid());
System.exit(10);
}
}
private void ensureLogging(Thread t, Throwable e) {
if (!mLoggingHandler.mTriggered) {
try {
mLoggingHandler.uncaughtException(t, e);
} catch (Throwable loggingThrowable) {
// Ignored.
}
}
}
}
Responsibilities:
- Print the crash to the log
- Call AMS's handleApplicationCrash() method
- Kill the app process in the finally block
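The three responsibilities can be tried out in plain Java. The following is a minimal sketch, not the framework code: CrashHandlerSketch, LoggingStep, KillStep and EVENTS are hypothetical names, the handler is installed on a single worker thread instead of as the process default, and the "kill" step only records an event so the JVM stays alive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for LoggingHandler and KillApplicationHandler. */
final class CrashHandlerSketch {
    /** Collected instead of printed so the flow can be inspected afterwards. */
    static final List<String> EVENTS = new ArrayList<>();

    static final class LoggingStep implements Thread.UncaughtExceptionHandler {
        volatile boolean triggered = false;

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            triggered = true;
            EVENTS.add("log: FATAL EXCEPTION: " + t.getName());
        }
    }

    static final class KillStep implements Thread.UncaughtExceptionHandler {
        private final LoggingStep logging;
        private boolean crashing = false;

        KillStep(LoggingStep logging) {
            this.logging = logging;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                // Same idea as ensureLogging(): log only if nobody has logged yet.
                if (!logging.triggered) logging.uncaughtException(t, e);
                if (crashing) return; // do not re-enter
                crashing = true;
                EVENTS.add("report: " + e.getClass().getName());
            } finally {
                // The real handler calls Process.killProcess() and System.exit(10) here.
                EVENTS.add("kill");
            }
        }
    }

    /** Crashes a worker thread that has the chained handler installed. */
    static List<String> runCrashingThread() {
        EVENTS.clear();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker-1");
        worker.setUncaughtExceptionHandler(new KillStep(new LoggingStep()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(runCrashingThread());
    }
}
```

The finally block is what guarantees the last step: even if reporting throws, the "kill" event is still recorded, which is the property the real handler relies on.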
2.2.1 ensureLogging()
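It calls LoggingHandler.uncaughtException() internally, unless LoggingHandler has already run (mTriggered is true). LoggingHandler also implements the Thread.UncaughtExceptionHandler interface and overrides uncaughtException().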
private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
public volatile boolean mTriggered = false;
@Override
public void uncaughtException(Thread t, Throwable e) {
mTriggered = true;
// Don't re-enter if KillApplicationHandler has already run
if (mCrashing) return;
// mApplicationObject is null for non-zygote java programs (e.g. "am")
// There are also apps running with the system UID. We don't want the
// first clause in either of these two cases, only for system_server.
if (mApplicationObject == null && (Process.SYSTEM_UID == Process.myUid())) {
Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
} else {
// Build the FATAL EXCEPTION message to print to the log
StringBuilder message = new StringBuilder();
//
// The "FATAL EXCEPTION" string is still used on Android even though
// apps can set a custom UncaughtExceptionHandler that renders uncaught
// exceptions non-fatal.
message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
final String processName = ActivityThread.currentProcessName();
if (processName != null) {
// Append the process name
message.append("Process: ").append(processName).append(", ");
}
// Process ID
message.append("PID: ").append(Process.myPid());
// Log the message together with the exception e
Clog_e(TAG, message.toString(), e);
}
}
}
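It builds a string that starts with FATAL EXCEPTION and prints it together with the exception.
You can therefore locate crash logs precisely by filtering the log for FATAL EXCEPTION.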
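As a rough illustration of that filtering, the sketch below rebuilds the header the same way LoggingHandler does and then picks matching lines out of a log. FatalExceptionFilter, buildHeader and grep are hypothetical names, and the sample log lines in main are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FatalExceptionFilter {
    static final String MARKER = "FATAL EXCEPTION";

    /** Builds the header in the same order as LoggingHandler; processName may be null. */
    static String buildHeader(String threadName, String processName, int pid) {
        StringBuilder message = new StringBuilder();
        message.append(MARKER).append(": ").append(threadName).append("\n");
        if (processName != null) {
            message.append("Process: ").append(processName).append(", ");
        }
        message.append("PID: ").append(pid);
        return message.toString();
    }

    /** Returns only the lines that contain the marker. */
    static List<String> grep(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains(MARKER)) hits.add(line);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up log lines, for illustration only.
        List<String> log = Arrays.asList(
                "D/Example: starting up",
                "E/AndroidRuntime: FATAL EXCEPTION: main",
                "E/AndroidRuntime: Process: com.example.app, PID: 1234");
        System.out.println(grep(log));
        System.out.println(buildHeader("main", "com.example.app", 1234));
    }
}
```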
2.2.2 ApplicationErrorReport
new ApplicationErrorReport.ParcelableCrashInfo(e) creates a crash-info object whose fields are parsed from the Throwable.
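To make "parsed from the Throwable" concrete, here is a simplified sketch of how such an object could be filled in. It is an assumption-laden illustration, not the framework implementation: CrashInfoSketch is a hypothetical class, it keeps only the three fields that crashApplicationInner() reads later (exceptionClassName, exceptionMessage, stackTrace), and it takes the class name and message from the root cause.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class CrashInfoSketch {
    final String exceptionClassName;
    final String exceptionMessage;
    final String stackTrace;

    CrashInfoSketch(Throwable tr) {
        // Render the full stack trace, including every "Caused by:" section.
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        tr.printStackTrace(pw);
        pw.flush();
        stackTrace = sw.toString();

        // Assumption of this sketch: describe the crash by its root cause.
        Throwable root = tr;
        while (root.getCause() != null) {
            root = root.getCause();
        }
        exceptionClassName = root.getClass().getName();
        exceptionMessage = root.getMessage();
    }

    public static void main(String[] args) {
        CrashInfoSketch info = new CrashInfoSketch(
                new RuntimeException("wrapper", new IllegalArgumentException("bad id")));
        System.out.println(info.exceptionClassName + ": " + info.exceptionMessage);
    }
}
```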
After the app side has printed the log, control moves to the AMS side.
3. Crash Handling on the AMS Side
3.1 AMS.handleApplicationCrash
public void handleApplicationCrash(IBinder app,
        ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
    // Look up the ProcessRecord of the crashed process
    ProcessRecord r = findAppProcess(app, "Crash");
    // app == null means the crash happened in system_server
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);
    handleApplicationCrashInner("crash", r, processName, crashInfo);
}
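RuntimeInit calls this method to report an app crash. Once it returns, the app process exits.
- Find the ProcessRecord for the crashed process; if app is null, the crash happened in the system_server process.
- Continue into handleApplicationCrashInner().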
3.1.1 AMS.handleApplicationCrashInner()
/* Native crash reporting uses this inner version because it needs to be somewhat
* decoupled from the AM-managed cleanup lifecycle
*/
void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
ApplicationErrorReport.CrashInfo crashInfo) {
// ...
final int relaunchReason = r == null ? RELAUNCH_REASON_NONE
: r.getWindowProcessController().computeRelaunchReason();
final String relaunchReasonString = relaunchReasonToString(relaunchReason);
if (crashInfo.crashTag == null) {
crashInfo.crashTag = relaunchReasonString;
} else {
crashInfo.crashTag = crashInfo.crashTag + " " + relaunchReasonString;
}
// 1. Write the crash information to DropBox
addErrorToDropBox(
eventType, r, processName, null, null, null, null, null, null, crashInfo);
// 2. Call mAppErrors.crashApplication()
mAppErrors.crashApplication(r, crashInfo);
}
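This method is not reached only for Java crashes. Native crashes also arrive here, through the socket service that AMS registered earlier (see the native crash article).
- Write the crash information to DropBox.
- Continue into mAppErrors.crashApplication().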
3.2 addErrorToDropBox()
Writes a description of a crash, WTF, or ANR to the drop box.
public void addErrorToDropBox(String eventType,
ProcessRecord process, String processName, String activityShortComponentName,
String parentShortComponentName, ProcessRecord parentProcess,
String subject, final String report, final File dataFile,
final ApplicationErrorReport.CrashInfo crashInfo) {
// Bail early if not published yet
if (ServiceManager.getService(Context.DROPBOX_SERVICE) == null) return;
// Get the DropBoxManager service
final DropBoxManager dbox = mContext.getSystemService(DropBoxManager.class);
// Exit early if the dropbox isn't configured to accept this report type.
// Build the dropbox tag that identifies the error type
final String dropboxTag = processClass(process) + "_" + eventType;
if (dbox == null || !dbox.isTagEnabled(dropboxTag)) return;
// Rate-limit how often we're willing to do the heavy lifting below to
// collect and record logs; currently 5 logs per 10 second period.
final long now = SystemClock.elapsedRealtime();
if (now - mWtfClusterStart > 10 * DateUtils.SECOND_IN_MILLIS) {
mWtfClusterStart = now;
mWtfClusterCount = 1;
} else {
if (mWtfClusterCount++ >= 5) return;
}
// Start assembling the report header
final StringBuilder sb = new StringBuilder(1024);
appendDropBoxProcessHeaders(process, processName, sb);
if (process != null) {
// Whether the process is in the foreground
sb.append("Foreground: ")
.append(process.isInterestingToUserLocked() ? "Yes" : "No")
.append("\n");
}
if (activityShortComponentName != null) {
sb.append("Activity: ").append(activityShortComponentName).append("\n");
}
if (parentShortComponentName != null) {
if (parentProcess != null && parentProcess.pid != process.pid) {
sb.append("Parent-Process: ").append(parentProcess.processName).append("\n");
}
if (!parentShortComponentName.equals(activityShortComponentName)) {
sb.append("Parent-Activity: ").append(parentShortComponentName).append("\n");
}
}
if (subject != null) {
sb.append("Subject: ").append(subject).append("\n");
}
sb.append("Build: ").append(Build.FINGERPRINT).append("\n");
if (Debug.isDebuggerConnected()) {
sb.append("Debugger: Connected\n");
}
if (crashInfo != null && crashInfo.crashTag != null && !crashInfo.crashTag.isEmpty()) {
sb.append("Crash-Tag: ").append(crashInfo.crashTag).append("\n");
}
sb.append("\n");
// Do the rest in a worker thread to avoid blocking the caller on I/O
// (After this point, we shouldn't access AMS internal data structures.)
// Dump the error report
Thread worker = new Thread("Error dump: " + dropboxTag) {
@Override
public void run() {
if (report != null) {
sb.append(report);
}
String setting = Settings.Global.ERROR_LOGCAT_PREFIX + dropboxTag;
int lines = Settings.Global.getInt(mContext.getContentResolver(), setting, 0);
int maxDataFileSize = DROPBOX_MAX_SIZE - sb.length()
- lines * RESERVED_BYTES_PER_LOGCAT_LINE;
if (dataFile != null && maxDataFileSize > 0) {
try {
sb.append(FileUtils.readTextFile(dataFile, maxDataFileSize,
"\n\n[[TRUNCATED]]"));
} catch (IOException e) {
Slog.e(TAG, "Error reading " + dataFile, e);
}
}
if (crashInfo != null && crashInfo.stackTrace != null) {
sb.append(crashInfo.stackTrace);
}
if (lines > 0) {
sb.append("\n");
// Merge several logcat streams, and take the last N lines
InputStreamReader input = null;
try {
java.lang.Process logcat = new ProcessBuilder(
"/system/bin/timeout", "-k", "15s", "10s",
"/system/bin/logcat", "-v", "threadtime", "-b", "events", "-b", "system",
"-b", "main", "-b", "crash", "-t", String.valueOf(lines))
.redirectErrorStream(true).start();
try { logcat.getOutputStream().close(); } catch (IOException e) {}
try { logcat.getErrorStream().close(); } catch (IOException e) {}
input = new InputStreamReader(logcat.getInputStream());
int num;
char[] buf = new char[8192];
while ((num = input.read(buf)) > 0) sb.append(buf, 0, num);
} catch (IOException e) {
Slog.e(TAG, "Error running logcat", e);
} finally {
if (input != null) try { input.close(); } catch (IOException e) {}
}
}
dbox.addText(dropboxTag, sb.toString());
}
};
if (process == null) {
// If process is null, we are being called from some internal code
// and may be about to die -- run this synchronously.
final int oldMask = StrictMode.allowThreadDiskWritesMask();
try {
// Run directly on the current thread
worker.run();
} finally {
StrictMode.setThreadPolicyMask(oldMask);
}
} else {
// Run on a new thread
worker.start();
}
}
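DropBox is provided by the DropBoxManager service, which the system_server process starts in startOtherServices. It records key system logs for debugging. It is registered with ServiceManager under the name dropbox, and its data is stored under /data/system/dropbox/.
DropBox can record the following error types:
- anr: a process stopped responding
- watchdog: a process triggered the watchdog
- crash: a Java crash in a process
- native_crash: a native crash in a process
- wtf: a serious error in a process
- lowmem: a process is low on memory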
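The rate limit at the top of addErrorToDropBox() (at most 5 reports per 10-second window) is easy to misread because of the post-increment. The sketch below isolates that logic under a few assumptions: ClusterRateLimiter is a hypothetical class, the clock is passed in as a parameter so the behaviour is deterministic, and an explicit started flag replaces the implicit "first call" behaviour of the original fields.

```java
final class ClusterRateLimiter {
    private final long windowMillis;
    private final int maxPerWindow;
    private boolean started = false;
    private long clusterStart;
    private int clusterCount;

    ClusterRateLimiter(long windowMillis, int maxPerWindow) {
        this.windowMillis = windowMillis;
        this.maxPerWindow = maxPerWindow;
    }

    /** Returns true if a report arriving at nowMillis may be written. */
    boolean tryAcquire(long nowMillis) {
        if (!started || nowMillis - clusterStart > windowMillis) {
            // A new window begins with this report.
            started = true;
            clusterStart = nowMillis;
            clusterCount = 1;
            return true;
        }
        // Same shape as "if (mWtfClusterCount++ >= 5) return;" in the source.
        return clusterCount++ < maxPerWindow;
    }

    public static void main(String[] args) {
        ClusterRateLimiter limiter = new ClusterRateLimiter(10_000L, 5);
        for (long t = 0; t <= 5; t++) {
            System.out.println("t=" + t + " allowed=" + limiter.tryAcquire(t));
        }
    }
}
```

Passing the timestamp in, rather than reading a clock inside the method, is a choice made for this sketch so that the window boundaries can be tested exactly.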
With the DropBox entry written, let's look at AppErrors.crashApplication():
3.3 AppErrors.crashApplication()
AppErrors.java
void crashApplication(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
final int callingPid = Binder.getCallingPid();
final int callingUid = Binder.getCallingUid();
final long origId = Binder.clearCallingIdentity();
try {
crashApplicationInner(r, crashInfo, callingPid, callingUid);
} finally {
Binder.restoreCallingIdentity(origId);
}
}
3.3.1 AppErrors.crashApplicationInner()
AppErrors.java
void crashApplicationInner(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo,
int callingPid, int callingUid) {
long timeMillis = System.currentTimeMillis();
String shortMsg = crashInfo.exceptionClassName;
String longMsg = crashInfo.exceptionMessage;
String stackTrace = crashInfo.stackTrace;
if (shortMsg != null && longMsg != null) {
longMsg = shortMsg + ": " + longMsg;
} else if (shortMsg != null) {
longMsg = shortMsg;
}
// ...
final int relaunchReason = r != null
? r.getWindowProcessController().computeRelaunchReason() : RELAUNCH_REASON_NONE;
AppErrorResult result = new AppErrorResult();
int taskId;
synchronized (mService) {
// ...
AppErrorDialog.Data data = new AppErrorDialog.Data();
data.result = result;
data.proc = r;
// If we can't identify the process or it's already exceeded its crash quota,
// quit right away without showing a crash dialog.
// Continue into makeAppCrashingLocked(); data must exist before this call
if (r == null || !makeAppCrashingLocked(r, shortMsg, longMsg, stackTrace, data)) {
return;
}
final Message msg = Message.obtain();
msg.what = ActivityManagerService.SHOW_ERROR_UI_MSG;
taskId = data.taskId;
msg.obj = data;
// Send the message that shows the crash dialog, then wait for the user's choice
mService.mUiHandler.sendMessage(msg);
}
// Block until the user's choice is available
int res = result.get();
Intent appErrorIntent = null;
MetricsLogger.action(mContext, MetricsProto.MetricsEvent.ACTION_APP_CRASH, res);
// Treat timeout and cancel as force quit
if (res == AppErrorDialog.TIMEOUT || res == AppErrorDialog.CANCEL) {
res = AppErrorDialog.FORCE_QUIT;
}
synchronized (mService) {
if (res == AppErrorDialog.MUTE) {
stopReportingCrashesLocked(r);
}
// Restart
if (res == AppErrorDialog.RESTART) {
mService.mProcessList.removeProcessLocked(r, false, true, "crash");
if (taskId != INVALID_TASK_ID) {
try {
// 1. Restart the crashed task from the recents list
mService.startActivityFromRecents(taskId,
ActivityOptions.makeBasic().toBundle());
} catch (IllegalArgumentException e) {
// Hmm...that didn't work. Task should either be in recents or associated
// with a stack.
Slog.e(TAG, "Could not restart taskId=" + taskId, e);
}
}
}
// Force quit
if (res == AppErrorDialog.FORCE_QUIT) {
long orig = Binder.clearCallingIdentity();
try {
// Kill it with fire!
// Finish the app's activities and remove the process
mService.mAtmInternal.onHandleAppCrash(r.getWindowProcessController());
if (!r.isPersistent()) {
mService.mProcessList.removeProcessLocked(r, false, false, "crash");
mService.mAtmInternal.resumeTopActivities(false /* scheduleIdle */);
}
} finally {
Binder.restoreCallingIdentity(orig);
}
}
// Show app info
if (res == AppErrorDialog.APP_INFO) {
appErrorIntent = new Intent(Settings.ACTION_APPLICATION_DETAILS_SETTINGS);
appErrorIntent.setData(Uri.parse("package:" + r.info.packageName));
appErrorIntent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
}
if (res == AppErrorDialog.FORCE_QUIT_AND_REPORT) {
appErrorIntent = createAppErrorIntentLocked(r, timeMillis, crashInfo);
}
if (r != null && !r.isolated && res != AppErrorDialog.RESTART) {
// XXX Can't keep track of crash time for isolated processes,
// since they don't have a persistent identity.
mProcessCrashTimes.put(r.info.processName, r.uid,
SystemClock.uptimeMillis());
}
}
if (appErrorIntent != null) {
try {
// 2. Start the system activity for the intent (app details or error report)
mContext.startActivityAsUser(appErrorIntent, new UserHandle(r.userId));
} catch (ActivityNotFoundException e) {
Slog.w(TAG, "bug report receiver dissappeared", e);
}
}
}
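Responsibilities:
- Call makeAppCrashingLocked().
- Send the SHOW_ERROR_UI_MSG message, which shows the crash dialog for the error, and wait for the user's choice.
- If the user chooses to restart, find the crashed task in the recents list and launch it again.
- If the user chooses to force quit, kill the app and enter the kill flow.
- If the user chooses app info, start the system intent that opens the application details page.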
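The branching on the dialog result can be summarised in a small sketch. It is only an illustration: CrashDialogDispatch and its DialogResult enum are hypothetical mirrors of the AppErrorDialog constants named in the code above, the returned strings merely describe the action, and the independent if statements of the original are folded into a switch because the result values are mutually exclusive.

```java
final class CrashDialogDispatch {
    /** Hypothetical mirror of the AppErrorDialog result codes used above. */
    enum DialogResult { FORCE_QUIT, FORCE_QUIT_AND_REPORT, RESTART, MUTE, TIMEOUT, CANCEL, APP_INFO }

    /** Timeout and cancel are treated as a force quit. */
    static DialogResult normalize(DialogResult res) {
        if (res == DialogResult.TIMEOUT || res == DialogResult.CANCEL) {
            return DialogResult.FORCE_QUIT;
        }
        return res;
    }

    /** Describes the follow-up action for a dialog result. */
    static String actionFor(DialogResult raw) {
        switch (normalize(raw)) {
            case RESTART:
                return "remove process, then restart the task from recents";
            case FORCE_QUIT:
                return "finish activities and remove process";
            case APP_INFO:
                return "open application details settings";
            case FORCE_QUIT_AND_REPORT:
                return "launch the error report intent";
            case MUTE:
                return "stop reporting crashes for this app";
            default:
                return "no follow-up action";
        }
    }

    public static void main(String[] args) {
        for (DialogResult r : DialogResult.values()) {
            System.out.println(r + " -> " + actionFor(r));
        }
    }
}
```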
Let's look at makeAppCrashingLocked() first:
3.4 makeAppCrashingLocked()
private boolean makeAppCrashingLocked(ProcessRecord app,
        String shortMsg, String longMsg, String stackTrace, AppErrorDialog.Data data) {
    app.setCrashing(true);
    // Wrap the crash information in a ProcessErrorStateInfo
    app.crashingReport = generateProcessError(app,
            ActivityManager.ProcessErrorStateInfo.CRASHED, null, shortMsg, longMsg, stackTrace);
    // Look up the error receiver for the current user; skip the broadcast being received
    app.startAppProblemLocked();
    // Stop freezing the screen
    app.getWindowProcessController().stopFreezingActivities();
    // Continue into handleAppCrashLocked()
    return handleAppCrashLocked(app, "force-crash" /*reason*/, shortMsg, longMsg, stackTrace,
            data);
}
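- Wrap the crash information in a ProcessErrorStateInfo
- Look up the error receiver for the current user and skip the broadcast the process is currently receiving
- Stop freezing the screen
- Continue into handleAppCrashLocked()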
3.4.1 ProcessRecord.startAppProblemLocked()
ProcessRecord.java
void startAppProblemLocked() {
// If this app is not running under the current user, then we can't give it a report button
// because that would require launching the report UI under a different user.
errorReportReceiver = null;
for (int userId : mService.mUserController.getCurrentProfileIds()) {
if (this.userId == userId) {
// Find the error receiver for the current user
errorReportReceiver = ApplicationErrorReport.getErrorReportReceiver(
mService.mContext, info.packageName, info.flags);
}
}
// Skip the broadcast this process is currently receiving
mService.skipCurrentReceiverLocked(this);
}
// ActivityManagerService.java
void skipCurrentReceiverLocked(ProcessRecord app) {
for (BroadcastQueue queue : mBroadcastQueues) {
queue.skipCurrentReceiverLocked(app);
}
}
private void skipReceiverLocked(BroadcastRecord r) {
logBroadcastReceiverDiscardLocked(r);
// Finish the current receiver so the queue can move on
finishReceiverLocked(r, r.resultCode, r.resultData,
r.resultExtras, r.resultAbort, false);
scheduleBroadcastsLocked();
}
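- Find the error receiver for the current user; this ultimately returns the component of the activity registered for Intent.ACTION_APP_ERROR.
- Skip the broadcast the process is currently receiving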
3.4.2 WindowProcessController.stopFreezingActivities()
WindowProcessController.java
public void stopFreezingActivities() {
synchronized (mAtm.mGlobalLock) {
int i = mActivities.size();
while (i > 0) {
i--;
// mActivities holds ActivityRecord objects
mActivities.get(i).stopFreezingScreenLocked(true);
}
}
}
ActivityRecord.stopFreezingScreenLocked()
ActivityRecord.java
public void stopFreezingScreenLocked(boolean force) {
if (force || frozenBeforeDestroy) {
frozenBeforeDestroy = false;
if (mAppWindowToken == null) {
return;
}
mAppWindowToken.stopFreezingScreen(true, force);
}
}
This ultimately reaches WMS's stopFreezingDisplayLocked(), which stops freezing the screen.
3.4.3 handleAppCrashLocked()
boolean handleAppCrashLocked(ProcessRecord app, String reason,
String shortMsg, String longMsg, String stackTrace, AppErrorDialog.Data data) {
final long now = SystemClock.uptimeMillis();
final boolean showBackground = Settings.Secure.getInt(mContext.getContentResolver(),
Settings.Secure.ANR_SHOW_BACKGROUND, 0) != 0;
final boolean procIsBoundForeground =
(app.getCurProcState() == ActivityManager.PROCESS_STATE_BOUND_FOREGROUND_SERVICE);
// Look up the previous crash times
Long crashTime;
Long crashTimePersistent;
boolean tryAgain = false;
if (!app.isolated) {
crashTime = mProcessCrashTimes.get(app.info.processName, app.uid);
crashTimePersistent = mProcessCrashTimesPersistent.get(app.info.processName, app.uid);
} else {
crashTime = crashTimePersistent = null;
}
// Bump up the crash count of any services currently running in the proc.
// Increment crashCount in each ServiceRecord
for (int i = app.services.size() - 1; i >= 0; i--) {
// Any services running in the application need to be placed
// back in the pending list.
ServiceRecord sr = app.services.valueAt(i);
// If the service was restarted a while ago, then reset crash count, else increment it.
if (now > sr.restartTime + ProcessList.MIN_CRASH_INTERVAL) {
sr.crashCount = 1;
} else {
sr.crashCount++;
}
// Allow restarting for started or bound foreground services that are crashing.
// This includes wallpapers.
if (sr.crashCount < mService.mConstants.BOUND_SERVICE_MAX_CRASH_RETRY
&& (sr.isForeground || procIsBoundForeground)) {
tryAgain = true;
}
}
// If the same process crashes twice within one minute, it is crashing too frequently
if (crashTime != null && now < crashTime + ProcessList.MIN_CRASH_INTERVAL) {
// The process crashed again very quickly. If it was a bound foreground service, let's
// try to restart again in a while, otherwise the process loses!
Slog.w(TAG, "Process " + app.info.processName
+ " has crashed too many times: killing!");
EventLog.writeEvent(EventLogTags.AM_PROCESS_CRASHED_TOO_MUCH,
app.userId, app.info.processName, app.uid);
// Call back into ATMS's onHandleAppCrash()
mService.mAtmInternal.onHandleAppCrash(app.getWindowProcessController());
if (!app.isPersistent()) {
// A non-persistent process is not restarted unless the user explicitly does so
// We don't want to start this process again until the user
// explicitly does so... but for persistent process, we really
// need to keep it running. If a persistent process is actually
// repeatedly crashing, then badness for everyone.
if (!app.isolated) {
// XXX We don't have a way to mark isolated processes
// as bad, since they don't have a peristent identity.
mBadProcesses.put(app.info.processName, app.uid,
new BadProcessInfo(now, shortMsg, longMsg, stackTrace));
mProcessCrashTimes.remove(app.info.processName, app.uid);
}
app.bad = true;
app.removed = true;
// Don't let services in this process be restarted and potentially
// annoy the user repeatedly. Unless it is persistent, since those
// processes run critical code.
// Remove the process and the services running in it
mService.mProcessList.removeProcessLocked(app, false, tryAgain, "crash");
// Resume the top activity
mService.mAtmInternal.resumeTopActivities(false /* scheduleIdle */);
if (!showBackground) {
return false;
}
}
mService.mAtmInternal.resumeTopActivities(false /* scheduleIdle */);
} else {
// Not a repeated crash within one minute
final int affectedTaskId = mService.mAtmInternal.finishTopCrashedActivities(
app.getWindowProcessController(), reason);
if (data != null) {
data.taskId = affectedTaskId;
}
if (data != null && crashTimePersistent != null
&& now < crashTimePersistent + ProcessList.MIN_CRASH_INTERVAL) {
data.repeating = true;
}
}
if (data != null && tryAgain) {
data.isRestartableForService = true;
}
// If the crashing process is what we consider to be the "home process" and it has been
// replaced by a third-party app, clear the package preferred activities from packages
// with a home activity running in the process to prevent a repeatedly crashing app
// from blocking the user to manually clear the list.
final WindowProcessController proc = app.getWindowProcessController();
final WindowProcessController homeProc = mService.mAtmInternal.getHomeProcess();
if (proc == homeProc && proc.hasActivities()
&& (((ProcessRecord) homeProc.mOwner).info.flags & FLAG_SYSTEM) == 0) {
proc.clearPackagePreferredForHomeActivities();
}
if (!app.isolated) {
// XXX Can't keep track of crash times for isolated processes,
// because they don't have a persistent identity.
mProcessCrashTimes.put(app.info.processName, app.uid, now);
mProcessCrashTimesPersistent.put(app.info.processName, app.uid, now);
}
// If the app has a crashHandler, let it handle the crash
if (app.crashHandler != null) mService.mHandler.post(app.crashHandler);
return true;
}
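Responsibilities:
- Record the crash time
- Increment crashCount in each ServiceRecord
- Check whether the process crashed twice within one minute; if so, it is treated as crashing too frequently:
- Call onHandleAppCrash()
- If the process is not persistent, do not restart it unless the user explicitly does so
- Remove the process and its services, without restarting them
- Resume the top activity
- If the crashes are not that close together, record the ID of the task affected by the crash
- If the app has a crashHandler, hand the crash to it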
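The "two crashes within one minute" rule can be sketched on its own. This is a simplified model, not the AppErrors code: CrashFrequencyTracker is a hypothetical class, a plain map keyed by process name and uid stands in for mProcessCrashTimes and mBadProcesses, the 60-second value is taken from the one-minute description above, and persistent or isolated processes are not modelled.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class CrashFrequencyTracker {
    /** Assumption: one minute, matching the description of MIN_CRASH_INTERVAL above. */
    static final long MIN_CRASH_INTERVAL_MS = 60_000L;

    private final Map<String, Long> lastCrashTime = new HashMap<>();
    private final Set<String> badProcesses = new HashSet<>();

    private static String key(String processName, int uid) {
        return processName + "/" + uid;
    }

    /** Records a crash and returns true if it followed the previous one too quickly. */
    boolean recordCrash(String processName, int uid, long nowMs) {
        String k = key(processName, uid);
        Long previous = lastCrashTime.get(k);
        if (previous != null && nowMs < previous + MIN_CRASH_INTERVAL_MS) {
            // Crashing too frequently: mark the process bad and forget its crash time.
            badProcesses.add(k);
            lastCrashTime.remove(k);
            return true;
        }
        lastCrashTime.put(k, nowMs);
        return false;
    }

    boolean isBad(String processName, int uid) {
        return badProcesses.contains(key(processName, uid));
    }

    public static void main(String[] args) {
        CrashFrequencyTracker tracker = new CrashFrequencyTracker();
        System.out.println(tracker.recordCrash("com.example.app", 10001, 0L));
        System.out.println(tracker.recordCrash("com.example.app", 10001, 30_000L));
        System.out.println(tracker.isBad("com.example.app", 10001));
    }
}
```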
ATMS.onHandleAppCrash()
ActivityTaskManagerService.java
@Override
public void onHandleAppCrash(WindowProcessController wpc) {
synchronized (mGlobalLock) {
mRootActivityContainer.handleAppCrash(wpc);
}
}
// RootActivityContainer.java
void handleAppCrash(WindowProcessController app) {
    // Iterate over every ActivityDisplay
    for (int displayNdx = mActivityDisplays.size() - 1; displayNdx >= 0; --displayNdx) {
        final ActivityDisplay display = mActivityDisplays.get(displayNdx);
        // Iterate over every ActivityStack managed by this ActivityDisplay
        for (int stackNdx = display.getChildCount() - 1; stackNdx >= 0; --stackNdx) {
            // Get the ActivityStack
            final ActivityStack stack = display.getChildAt(stackNdx);
            stack.handleAppCrash(app);
        }
    }
}
// ActivityStack.java
void handleAppCrash(WindowProcessController app) {
    // Iterate over the TaskRecords managed by this ActivityStack
    for (int taskNdx = mTaskHistory.size() - 1; taskNdx >= 0; --taskNdx) {
        // All ActivityRecords managed by this TaskRecord
        final ArrayList<ActivityRecord> activities = mTaskHistory.get(taskNdx).mActivities;
        // Visit each ActivityRecord
        for (int activityNdx = activities.size() - 1; activityNdx >= 0; --activityNdx) {
            final ActivityRecord r = activities.get(activityNdx);
            // If it belongs to the crashed process, finish the activity
            if (r.app == app) {
                // Force the destroy to skip right to removal.
                r.app = null;
                // Prepare the crash-close app transition
                getDisplay().mDisplayContent.prepareAppTransition(
                        TRANSIT_CRASHING_ACTIVITY_CLOSE, false /* alwaysKeepCurrent */);
                // Finish the activity immediately
                finishCurrentActivityLocked(r, FINISH_IMMEDIATELY, false,
                        "handleAppCrashedLocked");
            }
        }
    }
}
Responsibilities:
- Iterate over all ActivityDisplays
- For each display, iterate over all of its ActivityStacks
- For each stack, iterate over all of its TaskRecords
- For each task, iterate over all of its ActivityRecords and finish those that belong to the crashed process
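The four nested loops all run from the last index down to zero, which keeps the indices valid while entries are removed. The sketch below shows that pattern with plain lists. Every type in it (HierarchyWalkSketch, DisplayNode, StackNode, TaskNode, ActivityRec) is a hypothetical stand-in for the framework classes, and "finishing" an activity is reduced to removing it from its task.

```java
import java.util.ArrayList;
import java.util.List;

final class HierarchyWalkSketch {
    static final class ActivityRec {
        final String name;
        String ownerProcess; // null once detached from its process

        ActivityRec(String name, String ownerProcess) {
            this.name = name;
            this.ownerProcess = ownerProcess;
        }
    }

    static final class TaskNode { final List<ActivityRec> activities = new ArrayList<>(); }
    static final class StackNode { final List<TaskNode> tasks = new ArrayList<>(); }
    static final class DisplayNode { final List<StackNode> stacks = new ArrayList<>(); }

    /** Removes every activity owned by crashedProcess; returns their names in visit order. */
    static List<String> finishActivitiesOf(List<DisplayNode> displays, String crashedProcess) {
        List<String> finished = new ArrayList<>();
        for (int d = displays.size() - 1; d >= 0; --d) {
            List<StackNode> stacks = displays.get(d).stacks;
            for (int s = stacks.size() - 1; s >= 0; --s) {
                List<TaskNode> tasks = stacks.get(s).tasks;
                for (int t = tasks.size() - 1; t >= 0; --t) {
                    List<ActivityRec> activities = tasks.get(t).activities;
                    // Reverse order: removing index a does not shift the indices still to visit.
                    for (int a = activities.size() - 1; a >= 0; --a) {
                        ActivityRec r = activities.get(a);
                        if (crashedProcess.equals(r.ownerProcess)) {
                            r.ownerProcess = null;
                            activities.remove(a);
                            finished.add(r.name);
                        }
                    }
                }
            }
        }
        return finished;
    }

    /** One display, one stack, one task holding A (p1), B (p2), C (p1); crashes p1. */
    static List<String> demo() {
        TaskNode task = new TaskNode();
        task.activities.add(new ActivityRec("A", "p1"));
        task.activities.add(new ActivityRec("B", "p2"));
        task.activities.add(new ActivityRec("C", "p1"));
        StackNode stack = new StackNode();
        stack.tasks.add(task);
        DisplayNode display = new DisplayNode();
        display.stacks.add(stack);
        List<DisplayNode> displays = new ArrayList<>();
        displays.add(display);
        return finishActivitiesOf(displays, "p1");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```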
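3.5 Summary
After AMS receives a crash from an app, the flow is roughly:
- Write the crash information to a DropBox file through the DropBoxManager service. DropBox accepts error types such as crash, wtf, and anr.
- Skip the broadcast the crashed process is receiving, increment crashCount in its ServiceRecords, and finish all of its activities.
- Show the crash dialog and wait for the user's choice:
- If the user chooses to restart, find the crashed task in the recents list and launch it again.
- If the user chooses to force quit, kill the app and enter the kill flow.
- If the user chooses app info, start the system intent that opens the application details page.
Back in 3.3.1: once makeAppCrashingLocked() has finished, AMS sends SHOW_ERROR_UI_MSG through mUiHandler to show the dialog.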
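In crashApplicationInner() the calling thread sends the message and then blocks in result.get() until the UI side calls res.set(...). The article does not show AppErrorResult itself, so the following is only a guess at the general pattern: BlockingResultSketch is a hypothetical class that implements a one-shot blocking result with wait() and notifyAll().

```java
final class BlockingResultSketch {
    private boolean hasResult = false;
    private int result;

    /** Called by the thread that owns the dialog once a choice is known. */
    synchronized void set(int res) {
        hasResult = true;
        result = res;
        notifyAll();
    }

    /** Blocks the caller until set() has been called. */
    synchronized int get() {
        boolean interrupted = false;
        while (!hasResult) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        if (interrupted) Thread.currentThread().interrupt();
        return result;
    }

    /** Delivers answer from a second thread and waits for it on the calling thread. */
    static int askOnOtherThread(int answer) {
        BlockingResultSketch pending = new BlockingResultSketch();
        Thread ui = new Thread(() -> pending.set(answer), "ui-thread");
        ui.start();
        return pending.get();
    }

    public static void main(String[] args) {
        System.out.println(askOnOtherThread(1));
    }
}
```

The while loop around wait() matters: it protects against spurious wake-ups and also covers the case where set() runs before get() is entered.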
4. mUiHandler Sends SHOW_ERROR_UI_MSG
AMS.java
final class UiHandler extends Handler {
public UiHandler() {
super(com.android.server.UiThread.get().getLooper(), null, true);
}
@Override
public void handleMessage(Message msg) {
switch (msg.what) {
case SHOW_ERROR_UI_MSG: {
mAppErrors.handleShowAppErrorUi(msg);
ensureBootCompleted();
} break;
// ...
4.1 handleShowAppErrorUi()
AppErrors.java
void handleShowAppErrorUi(Message msg) {
AppErrorDialog.Data data = (AppErrorDialog.Data) msg.obj;
boolean showBackground = Settings.Secure.getInt(mContext.getContentResolver(),
Settings.Secure.ANR_SHOW_BACKGROUND, 0) != 0;
AppErrorDialog dialogToShow = null;
final String packageName;
final int userId;
synchronized (mService) {
// Get the process record
final ProcessRecord proc = data.proc;
final AppErrorResult res = data.result;
if (proc == null) {
Slog.e(TAG, "handleShowAppErrorUi: proc is null");
return;
}
packageName = proc.info.packageName;
userId = proc.userId;
// If a crash dialog is already showing, do not show another
if (proc.crashDialog != null) {
Slog.e(TAG, "App already has crash dialog: " + proc);
if (res != null) {
res.set(AppErrorDialog.ALREADY_SHOWING);
}
return;
}
boolean isBackground = (UserHandle.getAppId(proc.uid)
>= Process.FIRST_APPLICATION_UID
&& proc.pid != MY_PID);
for (int profileId : mService.mUserController.getCurrentProfileIds()) {
isBackground &= (userId != profileId);
}
if (isBackground && !showBackground) {
Slog.w(TAG, "Skipping crash dialog of " + proc + ": background");
if (res != null) {
res.set(AppErrorDialog.BACKGROUND_USER);
}
return;
}
final boolean showFirstCrash = Settings.Global.getInt(
mContext.getContentResolver(),
Settings.Global.SHOW_FIRST_CRASH_DIALOG, 0) != 0;
final boolean showFirstCrashDevOption = Settings.Secure.getIntForUser(
mContext.getContentResolver(),
Settings.Secure.SHOW_FIRST_CRASH_DIALOG_DEV_OPTION,
0,
mService.mUserController.getCurrentUserId()) != 0;
final boolean crashSilenced = mAppsNotReportingCrashes != null &&
mAppsNotReportingCrashes.contains(proc.info.packageName);
if ((mService.mAtmInternal.canShowErrorDialogs() || showBackground)
&& !crashSilenced
&& (showFirstCrash || showFirstCrashDevOption || data.repeating)) {
// Create the dialog; it dismisses itself after a 5-minute timeout
proc.crashDialog = dialogToShow = new AppErrorDialog(mContext, mService, data);
} else {
// The device is asleep, so just pretend that the user
// saw a crash dialog and hit "force quit".
if (res != null) {
res.set(AppErrorDialog.CANT_SHOW);
}
}
}
// If we've created a crash dialog, show it without the lock held
if (dialogToShow != null) {
Slog.i(TAG, "Showing crash dialog for package " + packageName + " u" + userId);
// Show the dialog
dialogToShow.show();
}
}
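The logic is simple: fetch the process information and show the error dialog. If the user makes no choice within 5 minutes, the dialog closes automatically.
- If the user chooses app info, the application's details are shown
- If the user chooses to close the app, the kill flow runs
- If the user makes no choice, the dialog closes automatically after 5 minutes.
In the first and third cases the kill flow has not run yet. Recall from 2.2, however, that the finally block always runs the process-kill logic: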
finally {
    // Try everything to make sure this process goes away.
    // Finally, kill the process
    Process.killProcess(Process.myPid());
    System.exit(10);
}
4.2 Process.killProcess()
public static final void killProcess(int pid) {
sendSignal(pid, SIGNAL_KILL);
}
public static final native void sendSignal(int pid, int signal);
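This sends SIGNAL_KILL to the given process. The details of killing a process will be analyzed separately.
At this point the app process has been killed, but the work is not finished. The system_server process has registered a Binder death recipient for it, so when the app process dies AMS is called back and must handle the Binder death notification.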
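5. Binder Death Notification
So when does AMS register the death notification?
Recall that during process creation ActivityThread calls AMS's attachApplication(), which in turn calls attachApplicationLocked(). The Binder death notification is registered there.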
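The register-then-notify pattern behind linkToDeath() can be modelled without Binder. The sketch below is a toy, single-process model: DeathLinkSketch, RemoteHandle and die() are hypothetical, and an IllegalStateException stands in for the failure that the real call reports when the remote side is already dead.

```java
import java.util.ArrayList;
import java.util.List;

final class DeathLinkSketch {
    interface DeathRecipient {
        void binderDied();
    }

    /** Toy stand-in for the remote object that represents the app process. */
    static final class RemoteHandle {
        private final List<DeathRecipient> recipients = new ArrayList<>();
        private boolean alive = true;

        void linkToDeath(DeathRecipient recipient) {
            if (!alive) throw new IllegalStateException("remote is already dead");
            recipients.add(recipient);
        }

        /** Simulates the death of the remote process; recipients are notified once. */
        void die() {
            if (!alive) return;
            alive = false;
            List<DeathRecipient> toNotify = new ArrayList<>(recipients);
            recipients.clear();
            for (DeathRecipient r : toNotify) {
                r.binderDied();
            }
        }
    }

    /** Links one recipient, kills the handle twice, and counts the callbacks. */
    static int countNotifications() {
        int[] count = {0};
        RemoteHandle appThread = new RemoteHandle();
        appThread.linkToDeath(() -> count[0]++);
        appThread.die();
        appThread.die();
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countNotifications());
    }
}
```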
5.1 AMS.attachApplicationLocked()
@GuardedBy("this")
private final boolean attachApplicationLocked(IApplicationThread thread,
int pid, int callingUid, long startSeq) {
//...
try {
AppDeathRecipient adr = new AppDeathRecipient(
app, pid, thread);
thread.asBinder().linkToDeath(adr, 0);
app.deathRecipient = adr;
} catch (RemoteException e) {
app.resetPackageList(mProcessStats);
mProcessList.startProcessLocked(app,
new HostingRecord("link fail", processName));
return false;
}
//...
}
When the Binder object dies, AppDeathRecipient's binderDied() method is called:
5.2 AppDeathRecipient.binderDied()
AMS.java
@Override
public void binderDied() {
if (DEBUG_ALL) Slog.v(
TAG, "Death received in " + this
+ " for thread " + mAppThread.asBinder());
synchronized(ActivityManagerService.this) {
appDiedLocked(mApp, mPid, mAppThread, true);
}
}
5.2.1 appDiedLocked()
@GuardedBy("this")
final void appDiedLocked(ProcessRecord app, int pid, IApplicationThread thread,
boolean fromBinderDied) {
// First check if this ProcessRecord is actually active for the pid.
synchronized (mPidsSelfLocked) {
ProcessRecord curProc = mPidsSelfLocked.get(pid);
if (curProc != app) {
Slog.w(TAG, "Spurious death for " + app + ", curProc for " + pid + ": " + curProc);
return;
}
}
BatteryStatsImpl stats = mBatteryStatsService.getActiveStatistics();
synchronized (stats) {
stats.noteProcessDiedLocked(app.info.uid, pid);
}
// If the app has not been marked as killed yet, kill it again
if (!app.killed) {
if (!fromBinderDied) {
killProcessQuiet(pid);
}
ProcessList.killProcessGroup(app.uid, pid);
app.killed = true;
}
// Clean up already done if the process has been re-started.
if (app.pid == pid && app.thread != null &&
app.thread.asBinder() == thread.asBinder()) {
boolean doLowMem = app.getActiveInstrumentation() == null;
boolean doOomAdj = doLowMem;
if (!app.killedByAm) {
reportUidInfoMessageLocked(TAG,
"Process " + app.processName + " (pid " + pid + ") has died: "
+ ProcessList.makeOomAdjString(app.setAdj, true) + " "
+ ProcessList.makeProcStateString(app.setProcState), app.info.uid);
mAllowLowerMemLevel = true;
} else {
// Note that we always want to do oom adj to update our state with the
// new number of procs.
mAllowLowerMemLevel = false;
doLowMem = false;
}
// Call handleAppDiedLocked()
handleAppDiedLocked(app, false, true);
if (doOomAdj) {
updateOomAdjLocked(OomAdjuster.OOM_ADJ_REASON_PROCESS_END);
}
if (doLowMem) {
doLowMemReportIfNeededLocked(app);
}
}
//...
}
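The first check in appDiedLocked() compares the record stored for the pid with the record that died, by identity. A death notification for a record that no longer owns the pid is ignored. The sketch below isolates that guard; PidTableSketch and ProcessRec are hypothetical, and a HashMap stands in for mPidsSelfLocked.

```java
import java.util.HashMap;
import java.util.Map;

final class PidTableSketch {
    static final class ProcessRec {
        final String name;

        ProcessRec(String name) {
            this.name = name;
        }
    }

    private final Map<Integer, ProcessRec> pids = new HashMap<>();

    void attach(int pid, ProcessRec rec) {
        pids.put(pid, rec);
    }

    /** Returns true if the death was handled, false if it was spurious. */
    boolean onDied(int pid, ProcessRec rec) {
        // Identity comparison, as in "curProc != app": the pid may now belong to another record.
        if (pids.get(pid) != rec) {
            return false;
        }
        pids.remove(pid);
        return true;
    }

    public static void main(String[] args) {
        PidTableSketch table = new PidTableSketch();
        ProcessRec old = new ProcessRec("com.example.app");
        ProcessRec fresh = new ProcessRec("com.example.app");
        table.attach(100, old);
        table.attach(100, fresh); // the pid now maps to a newer record
        System.out.println(table.onDied(100, old));
        System.out.println(table.onDied(100, fresh));
    }
}
```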
5.2.2 handleAppDiedLocked()
final void handleAppDiedLocked(ProcessRecord app,
boolean restarting, boolean allowRestart) {
int pid = app.pid;
// Clean up services, broadcast receivers, content providers, and related state
boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1,
false /*replacingPid*/);
if (!kept && !restarting) {
// Remove the ProcessRecord that represents the crashed process in AMS
removeLruProcessLocked(app);
if (pid > 0) {
ProcessList.remove(pid);
}
}
if (mProfileData.getProfileProc() == app) {
clearProfilerLocked();
}
// Continue into ATMS's handleAppDied()
mAtmInternal.handleAppDied(app.getWindowProcessController(), restarting, () -> {
Slog.w(TAG, "Crash of app " + app.processName
+ " running instrumentation " + app.getActiveInstrumentation().mClass);
Bundle info = new Bundle();
info.putString("shortMsg", "Process crashed.");
finishInstrumentationLocked(app, Activity.RESULT_CANCELED, info);
});
}
- Clean up services, broadcast receivers, content providers, and related state
- Remove the ProcessRecord of the crashed process
- Continue into ATMS's handleAppDied()
5.3 cleanUpApplicationRecordLocked()
This method cleans up all state associated with the crashed process.
final boolean cleanUpApplicationRecordLocked(ProcessRecord app,
boolean restarting, boolean allowRestart, int index, boolean replacingPid) {
if (index >= 0) {
removeLruProcessLocked(app);
ProcessList.remove(app.pid);
}
mProcessesToGc.remove(app);
mPendingPssProcesses.remove(app);
ProcessList.abortNextPssTime(app.procStateMemTracker);
// Close every dialog that is still open: crash, ANR, wait
// Dismiss any open dialogs.
if (app.crashDialog != null && !app.forceCrashReport) {
app.crashDialog.dismiss();
app.crashDialog = null;
}
if (app.anrDialog != null) {
app.anrDialog.dismiss();
app.anrDialog = null;
}
if (app.waitDialog != null) {
app.waitDialog.dismiss();
app.waitDialog = null;
}
app.setCrashing(false);
app.setNotResponding(false);
app.resetPackageList(mProcessStats);
app.unlinkDeathRecipient();
app.makeInactive(mProcessStats);
app.waitingToKill = null;
app.forcingToImportant = null;
updateProcessForegroundLocked(app, false, 0, false);
app.setHasForegroundActivities(false);
app.hasShownUi = false;
app.treatLikeActivity = false;
app.hasAboveClient = false;
app.setHasClientActivities(false);
// Clean up all service records
mServices.killServicesLocked(app, allowRestart);
boolean restart = false;
// Remove all published content providers.
for (int i = app.pubProviders.size() - 1; i >= 0; i--) {
ContentProviderRecord cpr = app.pubProviders.valueAt(i);
final boolean always = app.bad || !allowRestart;
boolean inLaunching = removeDyingProviderLocked(app, cpr, always);
if ((inLaunching || always) && cpr.hasConnectionOrHandle()) {
// We left the provider in the launching list, need to
// restart it.
restart = true;
}
cpr.provider = null;
cpr.setProcess(null);
}
app.pubProviders.clear();
// Take care of any launching providers waiting for this process.
if (cleanupAppInLaunchingProvidersLocked(app, false)) {
restart = true;
}
// Unregister from connected content providers.
if (!app.conProviders.isEmpty()) {
for (int i = app.conProviders.size() - 1; i >= 0; i--) {
ContentProviderConnection conn = app.conProviders.get(i);
conn.provider.connections.remove(conn);
stopAssociationLocked(app.uid, app.processName, conn.provider.uid,
conn.provider.appInfo.longVersionCode, conn.provider.name,
conn.provider.info.processName);
}
app.conProviders.clear();
}
// At this point there may be remaining entries in mLaunchingProviders
// where we were the only one waiting, so they are no longer of use.
// Look for these and clean up if found.
// XXX Commented out for now. Trying to figure out a way to reproduce
// the actual situation to identify what is actually going on.
if (false) {
for (int i = mLaunchingProviders.size() - 1; i >= 0; i--) {
ContentProviderRecord cpr = mLaunchingProviders.get(i);
if (cpr.connections.size() <= 0 && !cpr.hasExternalProcessHandles()) {
synchronized (cpr) {
cpr.launchingApp = null;
cpr.notifyAll();
}
}
}
}
// Clean up all broadcast receiver state
skipCurrentReceiverLocked(app);
// Unregister any receivers.
for (int i = app.receivers.size() - 1; i >= 0; i--) {
removeReceiverLocked(app.receivers.valueAt(i));
}
app.receivers.clear();
// Clean up the app's backup state:
// if the app is undergoing backup, tell the backup manager about it
final BackupRecord backupTarget = mBackupTargets.get(app.userId);
if (backupTarget != null && app.pid == backupTarget.app.pid) {
if (DEBUG_BACKUP || DEBUG_CLEANUP) Slog.d(TAG_CLEANUP, "App "
+ backupTarget.appInfo + " died during backup");
mHandler.post(new Runnable() {
@Override
public void run(){
try {
IBackupManager bm = IBackupManager.Stub.asInterface(
ServiceManager.getService(Context.BACKUP_SERVICE));
bm.agentDisconnectedForUser(app.userId, app.info.packageName);
} catch (RemoteException e) {
// can't happen; backup manager is local
}
}
});
}
for (int i = mPendingProcessChanges.size() - 1; i >= 0; i--) {
ProcessChangeItem item = mPendingProcessChanges.get(i);
if (app.pid > 0 && item.pid == app.pid) {
mPendingProcessChanges.remove(i);
mAvailProcessChanges.add(item);
}
}
mUiHandler.obtainMessage(DISPATCH_PROCESS_DIED_UI_MSG, app.pid, app.info.uid,
null).sendToTarget();
// If the caller is restarting this app, then leave it in its
// current lists and let the caller take care of it.
if (restarting) {
return false;
}
if (!app.isPersistent() || app.isolated) {
if (DEBUG_PROCESSES || DEBUG_CLEANUP) Slog.v(TAG_CLEANUP,
"Removing non-persistent process during cleanup: " + app);
if (!replacingPid) {
mProcessList.removeProcessNameLocked(app.processName, app.uid, app);
}
mAtmInternal.clearHeavyWeightProcessIfEquals(app.getWindowProcessController());
} else if (!app.removed) {
// This app is persistent, so we need to keep its record around.
// If it is not already on the pending app list, add it there
// and start a new process for it.
if (mPersistentStartingProcesses.indexOf(app) < 0) {
mPersistentStartingProcesses.add(app);
restart = true;
}
}
if ((DEBUG_PROCESSES || DEBUG_CLEANUP) && mProcessesOnHold.contains(app)) Slog.v(
TAG_CLEANUP, "Clean-up removing on hold: " + app);
mProcessesOnHold.remove(app);
mAtmInternal.onCleanUpApplicationRecord(app.getWindowProcessController());
if (restart && !app.isolated) {
// We have components that still need to be running in the
// process, so re-launch it.
if (index < 0) {
ProcessList.remove(app.pid);
}
mProcessList.addProcessNameLocked(app);
app.pendingStart = false;
mProcessList.startProcessLocked(app,
new HostingRecord("restart", app.processName));
return true;
} else if (app.pid > 0 && app.pid != MY_PID) {
// Goodbye!
mPidsSelfLocked.remove(app);
mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
mBatteryStatsService.noteProcessFinish(app.processName, app.info.uid);
if (app.isolated) {
mBatteryStatsService.removeIsolatedUid(app.uid, app.info.uid);
}
app.setPid(0);
}
return false;
}
Responsibilities:
Clean up every service, provider, and receiver record tied to the crashed process.
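Besides the cleanup itself, the tail of the method decides whether the dead process is relaunched. That decision can be condensed into a single predicate. This is a hedged sketch: `RestartDecisionSketch`, `keepAndRelaunch`, and `providerPending` are hypothetical names, and the model assumes a persistent process is not already queued in `mPersistentStartingProcesses`.

```java
/** Simplified model of the return value of cleanUpApplicationRecordLocked(); hypothetical names. */
final class RestartDecisionSketch {
    /**
     * @param restarting      the caller is already restarting the process
     * @param persistent      stand-in for app.isPersistent()
     * @param isolated        stand-in for app.isolated
     * @param removed         stand-in for app.removed
     * @param providerPending a provider or a waiting client still needs the process
     * @return true when the record is kept and the process is relaunched
     */
    static boolean keepAndRelaunch(boolean restarting, boolean persistent,
            boolean isolated, boolean removed, boolean providerPending) {
        if (restarting) {
            // The caller takes care of the process, so cleanup reports "not kept".
            return false;
        }
        boolean restart = providerPending;
        if (persistent && !isolated && !removed) {
            // Persistent apps are queued for a fresh start
            // (assumption: the app is not already in the pending list).
            restart = true;
        }
        // Isolated processes are never relaunched by this path.
        return restart && !isolated;
    }

    public static void main(String[] args) {
        System.out.println("ordinary app:   " + keepAndRelaunch(false, false, false, false, false));
        System.out.println("persistent app: " + keepAndRelaunch(false, true, false, false, false));
    }
}
```

In this model an ordinary app with nothing waiting on it is simply dropped, while a persistent app, or one whose content provider is still in use, gets a new process.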
5.4 atms.handleAppDied()
ActivityTaskManagerService.java
@HotPath(caller = HotPath.PROCESS_CHANGE)
@Override
public void handleAppDied(WindowProcessController wpc, boolean restarting,
Runnable finishInstrumentationCallback) {
synchronized (mGlobalLockWithoutBoost) {
// Remove this application's activities from active lists.
boolean hasVisibleActivities = mRootActivityContainer.handleAppDied(wpc);
wpc.clearRecentTasks();
wpc.clearActivities();
if (wpc.isInstrumenting()) {
finishInstrumentationCallback.run();
}
if (!restarting && hasVisibleActivities) {
mWindowManager.deferSurfaceLayout();
try {
if (!mRootActivityContainer.resumeFocusedStacksTopActivities()) {
// If there was nothing to resume, and we are not already restarting
// this process, but there is a visible activity that is hosted by the
// process...then make sure all visible activities are running, taking
// care of restarting this process.
// Make sure the top activity is brought back
mRootActivityContainer.ensureActivitiesVisible(null, 0,
!PRESERVE_WINDOWS);
}
} finally {
// Window-related: resume the deferred surface layout
mWindowManager.continueSurfaceLayout();
}
}
}
}
- Clean up activity-related state.
- Make sure the top activity is resumed.
- Update window state by resuming the deferred surface layout.
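The deferSurfaceLayout() / continueSurfaceLayout() pair wrapped in try/finally is a batching pattern: layout is postponed while activities are being resumed and resumed afterwards even if an exception is thrown. The sketch below models that idea with a simple depth counter. It is an assumption about how such a deferral can work, not a copy of the WindowManagerService internals, and `DeferredLayoutSketch` with all of its members is hypothetical.

```java
/** Counter-based model of a defer/continue layout pair; all names are hypothetical. */
final class DeferredLayoutSketch {
    private int deferDepth;        // how many callers currently defer layout
    private boolean layoutPending; // a layout was requested while deferred
    int layoutPasses;              // number of layout passes actually performed

    void defer() {
        deferDepth++;
    }

    void requestLayout() {
        if (deferDepth > 0) {
            layoutPending = true; // remember the request, run it later
        } else {
            layoutPasses++;
        }
    }

    void resume() {
        if (deferDepth == 0) {
            throw new IllegalStateException("resume() without defer()");
        }
        deferDepth--;
        if (deferDepth == 0 && layoutPending) {
            layoutPending = false;
            layoutPasses++; // one pass covers every request made while deferred
        }
    }

    /** Same shape as the try/finally block in handleAppDied(). */
    void runBatched(Runnable work) {
        defer();
        try {
            work.run();
        } finally {
            resume();
        }
    }

    public static void main(String[] args) {
        DeferredLayoutSketch wm = new DeferredLayoutSketch();
        wm.runBatched(() -> {
            wm.requestLayout();
            wm.requestLayout();
        });
        System.out.println("layout passes: " + wm.layoutPasses);
    }
}
```

The finally block matters here: without it, an exception during the resume step would leave layout deferred indefinitely.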
At this point the handling that follows the Binder death notification is essentially complete, and the app's whole Java crash flow comes to an end.
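Summary
After an app crashes, the system shows the crash dialog and the process sends a kill signal to itself. AMS then receives the Binder death notification for the app process, and the crash flow is truly finished only once the binderDied() path has run to completion.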
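The sequence in this summary (dialog, kill, then death notification) can be illustrated without any Android dependency. The following is a minimal simulation in plain Java: `FakeBinder` only imitates the linkToDeath / binderDied contract of a real Binder, and every class and method name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java simulation of a death-recipient callback; not the Android Binder API. */
final class DeathNotificationSketch {
    /** Stand-in for a death recipient such as AppDeathRecipient. */
    interface DeathRecipient {
        void binderDied();
    }

    /** Stand-in for the remote object owned by the app process. */
    static final class FakeBinder {
        private final List<DeathRecipient> recipients = new ArrayList<>();
        private boolean alive = true;

        void linkToDeath(DeathRecipient recipient) {
            if (!alive) {
                throw new IllegalStateException("remote process is already dead");
            }
            recipients.add(recipient);
        }

        boolean unlinkToDeath(DeathRecipient recipient) {
            return recipients.remove(recipient);
        }

        /** Simulates the process dying: every linked recipient is notified once. */
        void kill() {
            if (!alive) {
                return;
            }
            alive = false;
            List<DeathRecipient> toNotify = new ArrayList<>(recipients);
            recipients.clear();
            for (DeathRecipient recipient : toNotify) {
                recipient.binderDied();
            }
        }
    }

    /** Replays the order of events described in the summary. */
    static List<String> simulateCrash() {
        List<String> events = new ArrayList<>();
        FakeBinder appThread = new FakeBinder();
        // The system side registers for death notification when the app attaches.
        appThread.linkToDeath(() -> events.add("binderDied: clean up process record"));
        events.add("show crash dialog");
        events.add("kill process");
        appThread.kill();
        return events;
    }

    public static void main(String[] args) {
        for (String event : simulateCrash()) {
            System.out.println(event);
        }
    }
}
```

The cleanup entry is always the last event in this model, which is the point of the summary: the record is released by the death callback, not by the code that kills the process.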