Android's Watchdog - KitKat

    Android's watchdog implementation stayed the same throughout most versions of Android until KitKat, when it was modified to support variable timeouts rather than a single fixed one-minute timeout. According to commit e6f81cf1f69e0683f969238f921950befba8e6c3, this was done to support lengthy operations such as installing applications that are multiple gigabytes in size. The previous one-minute checks are still supported. The Watchdog thread implements this by posting runnables to different threads, each of which has a Looper associated with it. In the current implementation, these runnables are posted to the Foreground Thread (android.fg), Ui Thread (android.ui), Io Thread (android.io) and to MainThread, Window Manager, Activity Manager, Package Manager and Power Manager.

    The runnable (HandlerChecker) is a generic implementation and is not specific to the type of thread to which it is being posted (Io, Fg etc). It checks the status of the requested Monitors, if any, and the ability of the thread associated with the Handler to execute this runnable. Each HandlerChecker supports its own timeout, which ensures that the PackageManager's lengthy operations don't trigger a false positive. These runnables aren't appended to the end of the MessageQueue; instead they are posted at the front of the queue (Handler.postAtFrontOfQueue). The thread is considered healthy as long as it is able to execute the runnable and every requested Monitor returns from its callback (Monitor.monitor) within the specified timeout. There is also a small optimization: the runnable isn't posted to the thread at all if it has no Monitors to check and its looper is idling (Looper.isIdling()). The purpose of this optimization is to avoid a context switch and the extra processing of a runnable when the thread is already known to be healthy.
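
    The behaviour described above can be modelled in plain Java without any Android classes. The following is only a simplified sketch, not the framework source: the class name HandlerCheckerSketch, its nested Monitor interface and the method names are illustrative assumptions, the looper's idle state is passed in as a boolean instead of being read from Looper.isIdling(), and the caller is expected to do the actual posting (in the framework, via Handler.postAtFrontOfQueue) when scheduleCheck returns true.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, Android-free model of a per-thread checker. All names here are illustrative. */
final class HandlerCheckerSketch implements Runnable {

    /** Stand-in for the watchdog's Monitor callback; monitor() returns once the service responds. */
    interface Monitor {
        void monitor();
    }

    private final String name;
    private final long timeoutMillis;                 // per-checker timeout
    private final List<Monitor> monitors = new ArrayList<>();
    private boolean completed = true;                 // true when no check is outstanding
    private long startTimeMillis;

    HandlerCheckerSketch(String name, long timeoutMillis) {
        this.name = name;
        this.timeoutMillis = timeoutMillis;
    }

    String name() {
        return name;
    }

    synchronized void addMonitor(Monitor monitor) {
        monitors.add(monitor);
    }

    /**
     * Called by the watchdog thread on each wake-up.
     * Returns true if this runnable should now be posted at the front of the target thread's queue.
     */
    synchronized boolean scheduleCheck(long nowMillis, boolean looperIdle) {
        if (monitors.isEmpty() && looperIdle) {
            completed = true;     // idle thread with nothing to monitor: treat as healthy, skip the post
            return false;
        }
        if (!completed) {
            return false;         // an earlier check is still pending; do not post a second one
        }
        completed = false;
        startTimeMillis = nowMillis;
        return true;
    }

    /** Runs on the checked thread: call every monitor, then mark the check as done. */
    @Override
    public void run() {
        List<Monitor> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(monitors);
        }
        for (Monitor monitor : snapshot) {
            monitor.monitor();    // blocks here if the monitored service is stuck
        }
        synchronized (this) {
            completed = true;
        }
    }

    /** True once a pending check has been outstanding for at least this checker's timeout. */
    synchronized boolean isOverdue(long nowMillis) {
        return !completed && nowMillis - startTimeMillis >= timeoutMillis;
    }
}
```

    In this model a checker with no monitors on an idle thread is never posted and never becomes overdue, while a checker with a 10 minute timeout only becomes overdue once its pending check has been outstanding for the full 10 minutes.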

    The main watchdog thread still sleeps in the old 30-second intervals, and each wake-up triggers a check of all handlers. So if the PackageManager service doesn't respond for 10 minutes, the watchdog will have checked its status 20 times before it finally kills the framework process at the 10-minute mark (rather than after the earlier one-minute timeout).
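
    The count of 20 checks follows directly from the 30-second sleep. As a rough illustration, the loop below counts wake-ups for a thread that hangs at the moment its check is posted. The class and method names are hypothetical, and the model ignores scheduling jitter and everything else the real watchdog does on each wake-up.

```java
/** Toy model of the watchdog's sleep/check loop. Names are illustrative, not framework code. */
final class WatchdogLoopSketch {

    static final long CHECK_INTERVAL_MS = 30_000L;   // the 30 second sleep described in the text

    /** Number of wake-ups before a thread stuck since t = 0 has exceeded timeoutMs. */
    static int wakeUpsUntilKill(long timeoutMs) {
        long elapsedMs = 0;
        int wakeUps = 0;
        do {
            elapsedMs += CHECK_INTERVAL_MS;   // the watchdog sleeps for one interval
            wakeUps++;                        // then wakes up and evaluates the checker
        } while (elapsedMs < timeoutMs);      // not yet overdue: go back to sleep
        return wakeUps;
    }

    public static void main(String[] args) {
        System.out.println(wakeUpsUntilKill(10 * 60_000L));  // prints 20 (10 minute timeout)
        System.out.println(wakeUpsUntilKill(60_000L));       // prints 2  (1 minute timeout)
    }
}
```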

    OEMs could use this feature to check the status of other custom Handler threads in the framework process by adding them to the watchdog using the API Watchdog.getInstance().addThread(Handler thread, String name, long timeoutMillis). The only catch is that it can't be used for any timeout shorter than 30 seconds, because the Watchdog thread itself only checks at 30-second intervals. So a handler thread with a requested timeout of 10 seconds wouldn't actually trigger a watchdog kill after 10 seconds; it would time out only after 30 seconds.
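
    One way to reason about this catch is to round the requested timeout up to the next multiple of the check interval. The helper below is a hypothetical back-of-the-envelope calculation, not part of the Watchdog API, and it assumes the thread hangs right when its check is posted.

```java
/** Back-of-the-envelope helper; the class and method are hypothetical, not part of Watchdog. */
final class EffectiveTimeoutSketch {

    static final long CHECK_INTERVAL_MS = 30_000L;   // watchdog wake-up interval from the text

    /** Rounds a requested timeout up to the next multiple of the 30 second check interval. */
    static long effectiveTimeoutMs(long requestedMs) {
        if (requestedMs <= 0) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        long intervals = (requestedMs + CHECK_INTERVAL_MS - 1) / CHECK_INTERVAL_MS;  // ceiling division
        return intervals * CHECK_INTERVAL_MS;
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeoutMs(10_000L));    // prints 30000
        System.out.println(effectiveTimeoutMs(45_000L));    // prints 60000
        System.out.println(effectiveTimeoutMs(600_000L));   // prints 600000
    }
}
```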
