The Reboot Loops of Galaxy Nexus

Last night, while I was eating out with friends, my phone (a Samsung Galaxy Nexus) restarted itself during use. Since it had been doing that for a while, I didn’t think much of it. But on my way home I noticed the phone would restart after just a few minutes of use, and sometimes it even got stuck in a “reboot loop”, restarting before the boot sequence had completed.

After searching for similar reports, several attempts to reproduce the problem (on the assumption that a misbehaving app was to blame), and detours through safe mode and recovery mode, I noticed that a pinned forum post of Galaxy Nexus tips on xda-developers mentioned finding and posting LAST_KMSG (the Linux kernel message log from the previous boot) for troubleshooting. Since I already had the Android SDK installed, I pulled the message log using adb. Normal apps shouldn’t be able to trigger reboots, so the kernel log was likely to offer more clues as to where the problem was.

Caution: If you save the log (or a full device backup or bug report) in the same directory as adb and then update the Android SDK, the update may delete your saved log (or backup)! It’s good to back up, but save your backups somewhere else.
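As a rough sketch of the step above, the log can be fetched with adb and written somewhere outside the SDK directory. The device path /proc/last_kmsg is my assumption about where this kernel exposes LAST_KMSG (it may differ or need root on other devices), and the helper names and example directories are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumption: the previous boot's kernel log is readable at this device path.
LAST_KMSG_PATH = "/proc/last_kmsg"


def adb_command(device_path=LAST_KMSG_PATH):
    """Build the adb invocation that prints a file from the device."""
    return ["adb", "shell", "cat", device_path]


def is_inside(path, directory):
    """Return True if path is the directory itself or lies somewhere below it."""
    path, directory = Path(path).resolve(), Path(directory).resolve()
    return path == directory or directory in path.parents


def save_last_kmsg(dest, sdk_dir):
    """Fetch the log over adb and write it to dest, refusing paths inside sdk_dir."""
    if is_inside(dest, sdk_dir):
        raise ValueError(f"{dest} is inside the SDK directory {sdk_dir}")
    # Requires adb on PATH and a connected device with USB debugging enabled.
    result = subprocess.run(adb_command(), capture_output=True, check=True)
    Path(dest).write_bytes(result.stdout)
    return Path(dest)


# Example call (hypothetical locations, adjust to your own setup):
# save_last_kmsg(Path.home() / "last_kmsg.txt", Path.home() / "android-sdk")
```

The check against the SDK directory is only there to guard against the deletion described in the caution; it does nothing to verify that the log itself is complete.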

At first I was misled by the “No errors detected” message at the end, as well as the stack trace, but I soon understood the actual problem: the device could not write to its internal storage, so the file system could not maintain consistency, which caused the reboot:

[   77.674102] mmcblk0: timed out sending r/w cmd command, card status 0x900
[   77.674316] mmcblk0: not retrying timeout
[   77.674621] end_request: I/O error, dev mmcblk0, sector 2391056
[   77.675018] Aborting journal on device mmcblk0p12-8.
[   78.695343] journal commit I/O error
[   78.695648] journal commit I/O error
[   78.752777] EXT4-fs error (device mmcblk0p12): ext4_journal_start_sb:296: Detected aborted journal
[   78.753051] EXT4-fs (mmcblk0p12): Remounting filesystem read-only
[   78.754119] Kernel panic - not syncing: EXT4-fs panic from previous error
[   78.754119] 
[   78.754272] Backtrace: 
[   78.754425] [] (dump_backtrace+0x0/0x10c) from [] (dump_stack+0x18/0x1c)
[   78.754516]  r6:c060fe88 r5:00000000 r4:c084e480 r3:00000000
[   78.754852] [] (dump_stack+0x0/0x1c) from [] (panic+0x80/0x1b0)
[   78.754943] [] (panic+0x0/0x1b0) from [] (__ext4_abort+0xd8/0xec)
[   78.755035]  r3:88020840 r2:00000000 r1:c4e7fc2c r0:c0759830
[   78.755340]  r7:c4e7e000
[   78.755462] [] (__ext4_abort+0x4/0xec) from [] (ext4_journal_start_sb+0x84/0x128)
[   78.755584]  r6:00000003 r5:c715fc00 r4:c7229e00
[   78.755798] [] (ext4_journal_start_sb+0x0/0x128) from [] (ext4_setattr+0x2b4/0x43c)
[   78.755920]  r8:00000000 r7:0000a068 r6:0000a068 r5:c4e7fdd8 r4:c76e5670
[   78.756256] [] (ext4_setattr+0x0/0x43c) from [] (notify_change+0x11c/0x2c4)
[   78.756347] [] (notify_change+0x0/0x2c4) from [] (do_truncate+0x7c/0x98)
[   78.756469] [] (do_truncate+0x0/0x98) from [] (do_last.isra.31+0x528/0x6a0)
[   78.756622]  r5:c4ea71e0 r4:c4e7fed8
[   78.756805] [] (do_last.isra.31+0x0/0x6a0) from [] (path_openat+0xc0/0x3ac)
[   78.756896] [] (path_openat+0x0/0x3ac) from [] (do_filp_open+0x34/0x88)
[   78.757019] [] (do_filp_open+0x0/0x88) from [] (do_sys_open+0xe4/0x17c)
[   78.757080]  r7:00000001 r6:00000017 r5:00020242 r4:c2a2c000
[   78.757385] [] (do_sys_open+0x0/0x17c) from [] (sys_open+0x28/0x2c)
[   78.757476] [] (sys_open+0x0/0x2c) from [] (ret_fast_syscall+0x0/0x30)
[   78.757568] CPU1: stopping
[   78.757629] Backtrace: 
[   78.757812] [] (dump_backtrace+0x0/0x10c) from [] (dump_stack+0x18/0x1c)
[   78.757873]  r6:c004f090 r5:c65fa000 r4:c07f9550 r3:00000000
[   78.758209] [] (dump_stack+0x0/0x1c) from [] (do_IPI+0x1c8/0x1fc)
[   78.758300] [] (do_IPI+0x0/0x1fc) from [] (__irq_usr+0x48/0xe0)
[   78.758361] Exception stack(0xc65fbfb0 to 0xc65fbff8)
[   78.758483] bfa0:                                     40c32928 40c308d0 00000001 00000000
[   78.758544] bfc0: 407c31a8 40cf3ad8 407c31a8 40c308d0 00013392 7fffffff 00013393 00000001
[   78.758666] bfe0: 514bd078 beac3960 407509f4 407508c4 80000010 ffffffff
[   78.758758] Rebooting in 5 seconds..
[   83.039154] Restarting Linux version 3.0.31-g6fb96c9 ( (gcc version 4.6.x-google 20120106 (prerelease) (GCC) ) #1 SMP PREEMPT Thu Jun 28 11:02:39 PDT 2012
[   83.039184] 

No errors detected
Boot info:
Last reset was warm software reset (PRM_RSTST=0x2)
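
Since the “No errors detected” trailer is easy to misread, a small script can scan the timestamped kernel lines for the first signs of trouble instead. This is only a sketch: the marker strings are picked from the log above rather than from any official list, and the function name is my own.

```python
import re

# Matches timestamped kernel lines such as "[   77.674102] mmcblk0: ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")

# Substrings worth flagging; chosen from the log above, not an exhaustive list.
MARKERS = ("I/O error", "timed out", "Aborting journal", "EXT4-fs error", "Kernel panic")


def find_failures(log_text):
    """Return (timestamp, message) pairs for kernel lines containing a marker."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skips untimestamped trailer lines like "No errors detected"
        timestamp, message = float(match.group(1)), match.group(2)
        if any(marker in message for marker in MARKERS):
            hits.append((timestamp, message))
    return hits


SAMPLE = """\
[   77.674102] mmcblk0: timed out sending r/w cmd command, card status 0x900
[   77.674316] mmcblk0: not retrying timeout
[   77.674621] end_request: I/O error, dev mmcblk0, sector 2391056
[   77.675018] Aborting journal on device mmcblk0p12-8.
[   78.754119] Kernel panic - not syncing: EXT4-fs panic from previous error

No errors detected
"""

for timestamp, message in find_failures(SAMPLE):
    print(f"{timestamp:10.6f}  {message}")
```

Reading the hits in timestamp order puts the storage timeout first and the panic last, which matches the chain of events described above.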

(A “kernel panic” is the Linux equivalent of a Windows “Stop Error” or “Bluescreen”/“BSoD”: the operating system kernel has encountered a fault it cannot recover from. It’s usually caused by faulty hardware or drivers.)

Since I can’t replace the integrated flash storage myself, I sent the phone back to the carrier (Wind Mobile) today for repairs. The (in-warranty) repair should take up to three weeks. (Unlike Dell’s out-of-warranty repair, which returned my computer in only one week!) The store rented me an LG Optimus 2X for $10, which is actually a pretty cool phone.

This isn’t the first time I’ve had a problem with failing storage. Back when my Nexus One still had a working power button, its memory card became corrupted, leading to some interesting pictures. And my XPS laptop has gone through two failed hard drives.

Considering the underlying cause of the reboots, the general advice (and my first reflex) of resetting the device would likely bring only temporary relief without fixing the real problem.

This still leaves two questions. Even though I’ve only had the device for four months, apps can take a long time to respond, especially Google Search. I always thought this might be because, starting with Android 4, user-accessible data is stored together with system files and data, but could it have been a sign that the storage was failing? Also, would a file system check help at all in this case?