Android: backporting the VPN-related kernel code

Posted on 2010-08-09 | Category: Uncategorized

Android's kernel supports VPN in its 2.6.29 tree but not in 2.6.27, so today I backported it. These are my notes.

The latest code is at http://android.git.kernel.org/?p=kernel/common.git;a=summary

git://android.git.kernel.org/kernel/common.git

git checkout origin/android-goldfish-2.6.29 -b android-goldfish-2.6.29

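Only this branch has the latest changes.

These are the commits related to VPN (PPTP, L2TP). The list is current as of 2010-08-09; anything newer you will have to add yourself.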

c8706d4199cbbe86f370c55e9b84f94e79101a48
c50311620326bf4515e1e5aa4f85bbb816852701
7620ea508ae5fc623da4fc8ded8c8e10e65196b3
d89050258f0133ae56d586dd6d7345d473c9a216
f7f6469023c8c704157f9932a7639b70936d44b6

After applying them, open the kernel menuconfig and enable the relevant config options.

In my case these were:

CONFIG_NET_KEY

CONFIG_NET_IPIP

CONFIG_INET_ESP

CONFIG_INET_TUNNEL

CONFIG_TUN

CONFIG_PPP_DEFLATE

CONFIG_PPP_BSDCOMP

CONFIG_PPP_MPPE

CONFIG_PPPOE

CONFIG_PPPOL2TP

CONFIG_PPPOLAC

CONFIG_PPPOPNS

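Enabling these in menuconfig is enough (a few other options are selected automatically once these are on).

Here is my config file; if anything is missing, configure it yourself. http://thinksrc.com/media/agdrempibG9ncg0LEgVNZWRpYRiy0gkM/default_config?a=download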
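After a backport it is easy to forget one of the options, so here is a small sketch of a checker that scans the text of a .config for the options listed above. The helper names (config_enabled, count_missing) are made up for this example; the only thing assumed about the file is the usual "CONFIG_FOO=y" / "CONFIG_FOO=m" / "# CONFIG_FOO is not set" line format.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Returns true if config_text (the contents of a .config file) contains a
 * line "<option>=y" or "<option>=m". Hypothetical helper for illustration. */
static bool config_enabled(const char *config_text, const char *option)
{
    size_t len = strlen(option);
    const char *line = config_text;

    while (line && *line) {
        if (strncmp(line, option, len) == 0 && line[len] == '=' &&
            (line[len + 1] == 'y' || line[len + 1] == 'm'))
            return true;
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return false;
}

/* Prints every option that is not enabled and returns how many there were. */
static int count_missing(const char *config_text,
                         const char *const *options, size_t n)
{
    int missing = 0;

    for (size_t i = 0; i < n; i++) {
        if (!config_enabled(config_text, options[i])) {
            printf("missing: %s\n", options[i]);
            missing++;
        }
    }
    return missing;
}
```

Read your .config into a buffer, pass it in together with the option names from the list above, and a non-zero result means something still needs to be switched on in menuconfig.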

Basic test after file system hack

Posted on 2010-08-08 | Category: Uncategorized

This post just records a basic way to test your changes after hacking on a file system.

Dan Carpenter wrote:

On filesystem tests, you should always test them before submitting them. Everyone can create a small filesystem like this:
dd if=/dev/zero of=block bs=1M count=4000
mkfs.btrfs block
mkdir mnt
sudo mount -o loop -t btrfs block mnt/ 
Maybe untar a kernel on it or something...
Not sure.
But if you mess up a filesystem people get annoyed. :P

kchecker - check your kernel code

Posted on 2010-07-30 | Category: Uncategorized

kchecker is a static source code checker for the kernel. It can find buggy kmalloc usage and, as the runs below show, locking mistakes and possible memory leaks.

You can get the tool from http://repo.or.cz/w/smatch.git

git clone git://repo.or.cz/smatch.git

cd smatch
make -j4
sudo make install

After that, you can find kchecker under ./smatch_scripts/kchecker.
Let's try it.

x@x-desktop:$kchecker  --spammy  fs/ubifs/journal.c
  CHK     include/linux/version.h
make[1]: “include/asm-arm/mach-types.h” is latest
  CHK     include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-arm
  CALL    scripts/checksyscalls.sh
  CHECK   fs/ubifs/journal.c
fs/ubifs/journal.c +233 reserve_space(118) warn: inconsistent returns mutex:&wbuf->io_mutex: locked (136,202,215) unlocked (169,184,219,233)
  CC      fs/ubifs/journal.o

This run found an inconsistent use of a mutex. Let's try another file:

x@x-desktop$ kchecker --spammy drivers/video/mxc/mxc_ipuv3_fb.c

  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-arm
  CALL    scripts/checksyscalls.sh
  CHECK   drivers/video/mxc/mxc_ipuv3_fb.c
drivers/video/mxc/mxc_ipuv3_fb.c +875 mxcfb_ioctl(181) warn: possible memory leak of 'mem'
drivers/video/mxc/mxc_ipuv3_fb.c +992 mxcfb_ioctl(298) warn: 'sem:&mxc_fbi->alpha_flip_sem' is sometimes locked here and sometimes unlocked.
drivers/video/mxc/mxc_ipuv3_fb.c +1541 mxcfb_remove(5) warn: variable dereferenced before check 'fbi'
  CC      drivers/video/mxc/mxc_ipuv3_fb.o

This time it found a possible memory leak and some buggy lock usage.
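To make the "inconsistent returns" warning concrete, here is a minimal userspace sketch of the pattern it complains about and the usual fix. Everything in it is invented for illustration: demo_mutex is a toy stand-in for a kernel mutex (it only remembers whether it is held), and reserve_buggy/reserve_fixed are not the ubifs code.

```c
#include <assert.h>

/* Toy stand-in for a kernel mutex: just enough state to see whether a
 * function returned with the lock held. Not thread-safe. */
struct demo_mutex { int held; };

static void demo_lock(struct demo_mutex *m)   { assert(!m->held); m->held = 1; }
static void demo_unlock(struct demo_mutex *m) { assert(m->held);  m->held = 0; }

/* Buggy shape: the error path returns with the mutex still held, so the
 * function returns locked on one path and unlocked on the other. */
static int reserve_buggy(struct demo_mutex *m, int len)
{
    demo_lock(m);
    if (len < 0)
        return -1;              /* returns locked */
    demo_unlock(m);
    return 0;                   /* returns unlocked */
}

/* Fixed shape: every path leaves through one label that unlocks. */
static int reserve_fixed(struct demo_mutex *m, int len)
{
    int err = 0;

    demo_lock(m);
    if (len < 0) {
        err = -1;
        goto out;
    }
    /* ... the real work would go here ... */
out:
    demo_unlock(m);
    return err;
}
```

The single-exit "goto out" shape is the common kernel idiom for this; a checker like the one above flags the first function because its return paths disagree about the lock state.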

Have fun with this tool...

BTW, http://smatch.sourceforge.net/ shows how to run the check over the whole kernel at compile time.

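Atomic operations in Linux, and atomic operations on IA-32

Posted on 2010-07-22 | Category: Uncategorized

Atomic operations are usually implemented in one of two ways: by taking a lock, or by using the atomic-variable operations that the CPU provides in its instruction set.

In the Linux kernel you normally use the atomic_t family, such as atomic_set() and atomic_read(). Let's first look at how a few platforms implement them, and then note how atomic variables work on x86.

atomic_t is defined like this: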

typedef struct { volatile int counter; } atomic_t;

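The struct holds a single int counter marked volatile. On most platforms atomic_read() is implemented as
#define atomic_read(v)	((v)->counter)
and nothing more. That is enough because on x86, MIPS and the other architectures an aligned int-sized load is a single, indivisible memory access, and because the macro dereferences a pointer to a volatile field, the compiler really does emit a memory access.

Writes to an atomic variable, however, differ more between platforms.
For example,

ARM v6 and later: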

#define atomic_read(v)    ((v)->counter)
static inline void atomic_set(atomic_t *v, int i)
{
	unsigned long tmp;

	__asm__ __volatile__("@ atomic_set\n"
"1:	ldrex	%0, [%1]\n"
"	strex	%0, %2, [%1]\n"
"	teq	%0, #0\n"
"	bne	1b"
	: "=&r" (tmp)
	: "r" (&v->counter), "r" (i)
	: "cc");
}

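It writes memory with the ldrex/strex pair (exclusive load and exclusive store): if the exclusive store fails, it loads again and retries the store. The loop may not look efficient, but it simplifies the CPU design, and gains elsewhere make up for the cost.

All the remaining implementations define it as #define atomic_set(v,i) ((v)->counter = (i)).

This looks just like the read: a single plain assignment. To make it truly atomic, all that is needed is for the compiler to faithfully turn the access into a real memory read or write. That is what volatile is for: it stops the compiler from keeping the variable in a register as an optimization, so every access is compiled into a load, a store or some other memory instruction.

This works because on these architectures such memory reads and writes are guaranteed to be atomic. IA-32 has one caveat: only 4-byte-aligned accesses are guaranteed to be atomic. The compiler aligns an int for you, so accesses that are not 4-byte aligned need special care mainly when you write assembly by hand or lay out certain structures.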
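The ldrex/strex retry loop has a close userspace analogue in C11's compare-and-exchange, which may make the shape easier to see. This is only a sketch under that assumption: demo_atomic_t and the demo_* functions are names invented here, not kernel API, and a plain atomic_store() would be enough for the set; the loop is there purely to mirror the "load, try to store, retry on failure" structure.

```c
#include <stdatomic.h>

/* Userspace stand-in for atomic_t (hypothetical name). */
typedef struct { atomic_int counter; } demo_atomic_t;

static int demo_atomic_read(demo_atomic_t *v)
{
    return atomic_load(&v->counter);
}

/* Mirrors the ldrex/strex loop: read the current value, try to replace it,
 * and retry if someone else changed it (or the weak exchange failed). */
static void demo_atomic_set(demo_atomic_t *v, int i)
{
    int old = atomic_load(&v->counter);

    while (!atomic_compare_exchange_weak(&v->counter, &old, i))
        ;   /* 'old' now holds the fresh value; try again */
}

/* The same loop is what a read-modify-write such as "add and return" needs:
 * here the retry matters, because the new value depends on the old one. */
static int demo_atomic_add_return(demo_atomic_t *v, int delta)
{
    int old = atomic_load(&v->counter);

    while (!atomic_compare_exchange_weak(&v->counter, &old, old + delta))
        ;   /* 'old' was refreshed; old + delta is recomputed next round */
    return old + delta;
}
```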

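Below are my notes on how atomic operations are implemented on IA-32.

On IA-32 you can make an instruction atomic by putting the LOCK prefix in front of it (the processor asserts its LOCK# signal while the instruction runs). This is implemented by locking the bus, though, so it hurts system throughput badly. A few other operations are locked in the same way automatically:

* The XCHG instruction with a memory operand. (CMPXCHG, the well-known compare-and-swap, is not locked automatically and needs an explicit LOCK prefix.)
* Setting the B (busy) flag of a TSS descriptor, so that two processors cannot switch to the same task at the same time.
* Updating segment descriptors.
* Updating page directory and page table entries.
* Acknowledging interrupts, while the interrupt controller transfers the interrupt vector.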

Linux Driver DMA allocation - not using GFP_DMA

Posted on 2010-06-04 | Category: Uncategorized

Today I removed the DMA_ZONE from our Linux kernel. That frees 48 MiB of memory for the system and applications, and the system is much faster than before.

Why did we have a DMA_ZONE at all? I think it was because of how dma_alloc_coherent() was used to allocate contiguous memory: our old drivers passed GFP_DMA to it, which makes the allocation come from DMA_ZONE or fail, even if you pass GFP_DMA|GFP_KERNEL.

My change mostly replaces GFP_DMA with GFP_KERNEL, or simply removes GFP_DMA.

The DMA_ZONE is a legacy design. It was originally meant to support the 24-bit ISA bus and some broken PCI devices that cannot address the full 32-bit memory space, which is why DMA_ZONE always sits at low memory addresses. If your system has no such devices, just avoid DMA_ZONE.

References on DMA memory allocation:

[1] http://www.linuxjournal.com/article/6930?page=0,2

[2] LDD3 Chapter 15 , 4.3 section.

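Linux I18N notes

Posted on 2010-05-26 | Category: Uncategorized

Today I watched James Su's talk on I18N (internationalization) from the 2007 Google Developer Day in Beijing. It is on Youku: http://v.youku.com/v_show/id_XNTczNzcyOA==.html

It cleared up a few basic questions I had about internationalization. I am writing notes here because notes in my notebook always get lost.

UCS vs UNICODE

UCS and UNICODE are now the same encoding. They were originally drawn up by two separate organizations, which later realized the world did not need two character sets and worked out a merger, so the two are compatible.

UCS comes in forms such as UCS-2 (16-bit) and UCS-4 (32-bit).

UCS is really just a table of characters that says which character sits at which page, row and column, while UNICODE also specifies some other things. This link has some gossip about the history: http://blog.chinaunix.net/u1/49491/showart_2209875.html.

Now for external encoding versus internal encoding.

I never quite understood what an external encoding was. It is simply the character encoding used for storage (for example on disk) and transmission (for example over a network): UTF-8, ASCII, the nasty GB2312 and so on. In function names it is usually abbreviated mbs (multi byte string). External encodings are variable-length: a UTF-8 character may take 1 byte or as many as 4.

The internal encoding is how characters are stored in memory. Because external encodings are variable-length, they are awkward to program with. Say you want to move forward 3 characters: in ASCII that is 3 bytes, but for Chinese text it is not, so pointer arithmetic gets messy. Hence internal encodings, which are fixed-length; wchar_t and wint_t are what you use. The English abbreviation is wcs (wide char string).

One thing to note: wchar_t is only guaranteed to hold unicode when the macro __STDC_ISO_10646__ is defined.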
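To see why byte offsets and character offsets differ in an external encoding, here is a small sketch that counts and steps over characters in a UTF-8 string by hand. The function names are made up for this note, and the code assumes the input is valid UTF-8 (it does no validation); it relies only on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx.

```c
#include <stddef.h>
#include <string.h>

/* Counts code points in a valid UTF-8 string: every byte that is not a
 * continuation byte (10xxxxxx) starts a new character. */
static size_t utf8_count(const char *s)
{
    size_t n = 0;

    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Returns a pointer 'chars' code points past s, or to the terminating NUL
 * if the string is shorter than that. */
static const char *utf8_advance(const char *s, size_t chars)
{
    while (*s && chars > 0) {
        s++;                                    /* past the lead byte */
        while (((unsigned char)*s & 0xC0) == 0x80)
            s++;                                /* past continuation bytes */
        chars--;
    }
    return s;
}
```

For the two-character string "中文" (bytes e4 b8 ad e6 96 87) strlen() reports 6 while utf8_count() reports 2, which is exactly the mismatch that makes working directly on the external encoding error-prone.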

Linux has the notion of a locale, which controls which language is in use and consists of a set of variables. Type locale in a shell to see their values. Mine, for example:

kzj@t61:~$ locale
LANG=zh_CN.UTF-8
LANGUAGE=zh_CN.UTF-8
LC_CTYPE="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_PAPER="zh_CN.UTF-8"
LC_NAME="zh_CN.UTF-8"
LC_ADDRESS="zh_CN.UTF-8"
LC_TELEPHONE="zh_CN.UTF-8"
LC_MEASUREMENT="zh_CN.UTF-8"
LC_IDENTIFICATION="zh_CN.UTF-8"
LC_ALL=

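So my external encoding is currently zh_CN.UTF-8.

With two encodings you need to convert. When you read a string from a file into memory, you should convert it to the internal encoding before using it.

Linux offers two ways to do this.

The first is to convert with the mbstowcs(3) family of functions; the converted strings can then be used with wprintf and friends. This approach depends heavily on your locale: if the text you read is in a different encoding from your locale you get garbage, because these functions only convert from the locale's encoding to the internal encoding.

Before calling any of these locale-dependent functions you must call setlocale, otherwise things may go wrong.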
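As a sketch of the first approach, the wrapper below converts a multibyte string to wide characters with mbstowcs(3) and always NUL-terminates the result. The name to_wide is made up for this example; the conversion itself is done by the standard library according to whatever locale is in effect, which is why the program has to call setlocale first.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Converts mbs (encoded in the current locale's encoding) into out, which
 * has room for out_len wide characters including the terminating NUL.
 * Returns the number of wide characters written, not counting the NUL,
 * or (size_t)-1 if the input is invalid in the current locale. */
static size_t to_wide(const char *mbs, wchar_t *out, size_t out_len)
{
    size_t n;

    if (out_len == 0)
        return (size_t)-1;
    n = mbstowcs(out, mbs, out_len - 1);   /* leave room for the NUL */
    if (n == (size_t)-1)
        return n;
    out[n] = L'\0';
    return n;
}
```

In a real program you would call setlocale(LC_ALL, "") once at startup so that the LANG/LC_* settings shown above take effect; without it the process stays in the "C" locale and non-ASCII input will generally fail to convert.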

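The second is to convert with iconv(3). It does not depend on your locale and converts between whatever encodings you name; for example, you can use it to implement conversion between Simplified and Traditional Chinese.

Common mistakes:

1. Writing Chinese directly in the source code... Anything that is not ASCII should go through gettext for translation.

2. UCS-2 is not enough; prefer UCS-4. UCS-2 is 16 bits, so it has room for only some 20,000 Chinese characters, while the latest standard already defines more than 50,000. Alternatively, use UTF-8.

3. Operating directly on the external encoding. Remember how, back in the Windows 98 days, software would often show half a Chinese character? So when handling Chinese and similar text, always convert to the internal encoding first.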

Experiences with a Linux touchscreen panel driver

Posted on 2010-05-24 | Category: Uncategorized

"Writing a touchscreen panel driver with a good touch feel"

Recently I refactored our touch panel driver three times, each time with a clear goal.

Our product uses a resistive touchscreen panel (see [http://en.wikipedia.org/wiki/Touchscreen]). An ADC (analog-to-digital converter) converts the change in resistance into digital values, which then have to be calibrated by an algorithm such as the one in tslib ([http://tslib.berlios.de/]). After calibration the values are human-readable coordinates. The driver should report an event when the user touches the screen, and a touch-up event to notify the service layer when the user's finger leaves it.

In Linux all of this is done through the API in linux/input.h, such as:

input_report_abs();

input_event();

input_sync();

The first version of our driver used a kthread running an infinite while loop:

void kthread_func() {
        while (1) {
                poll_event(&x, &y, &touch);     //get an event from the ADC driver
                input_report_abs(ABS_X, x);     //report the coordinates of the touch
                input_report_abs(ABS_Y, y);
                input_event(EV_KEY, BTN_TOUCH, touch); //touch state: 1 means
                                                //touch down, 0 means touch up
                input_sync();                   //sync marks a complete event
                sleep(delay);
        }
}

This design is simple and clear, but it has some shortcomings.

1. The kthread is hard to control: another thread cannot easily stop it, it only stops by itself. That becomes a problem when you want to suspend/resume the driver.

2. There is a touchscreen use case it handles badly. If you move your finger quickly across the screen, one or two samples in the middle of the move may say the finger has left the screen. The driver should ignore these. Excellent products such as the iPad and iPhone have this feature, but in this implementation the touch-leave event gets reported.

In the next version of the driver we changed a few things.

1. We use a single-threaded workqueue instead of a kthread, so we can stop the work when the driver suspends.

2. The work function only reports the touch-down event and the coordinate changes; the touch-up event is sent by an hrtimer. Every time the work runs it re-arms the timer. If no more events come in, nobody re-arms the timer, so it expires and its callback sends the touch-up event.

The pseudocode is:

   timer_func()
   {
        input_report_abs(ABS_X, x);     //x,y coordinate
        input_report_abs(ABS_Y, y);
        input_report_abs(ABS_PRESSURE, 0); //pressure
        input_event(EV_KEY, BTN_TOUCH, 0); //touch up event
        input_sync();
   }

   workqueue_func()
   {

        poll_event(&x, &y, &touch);     //get an event from the ADC driver
        start_hrtimer(timer, timeout_of_touch_up); //(re)arm; normally 2-10
                                                   //times the delay
        if (touch) {
                input_report_abs(ABS_X, x);     //report the coordinates of the touch
                input_report_abs(ABS_Y, y);
                input_event(EV_KEY, BTN_TOUCH, 1); //touch down
                input_sync();                   //sync marks a complete event
        }
        queue_self_to_delayed_workqueue(delay);
    }

This implementation fixes both issues above.

1. With a delayed workqueue we can cancel the queued work to stop the loop, so it is easy to control.

2. If one or two touch-up samples show up while the finger is moving (a finger gesture), the timeout does not expire, so the move continues until the user really lifts the finger off the touchscreen.
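The filtering idea can be tried outside the kernel with a small simulation. This is only a model under stated assumptions, not our driver: one raw sample arrives every period_ms, the release timer is re-armed only by touch-down samples, and all names (touch_filter, TOUCH_DOWN and so on) are invented for the sketch.

```c
/* Events the filter would pass on to the input layer. */
enum touch_event { TOUCH_NONE, TOUCH_DOWN, TOUCH_UP };

struct touch_filter {
    int pressed;        /* state last reported upwards */
    int ms_since_down;  /* time since the last touch-down sample */
    int timeout_ms;     /* how long to wait before reporting touch up */
};

static void touch_filter_init(struct touch_filter *f, int timeout_ms)
{
    f->pressed = 0;
    f->ms_since_down = 0;
    f->timeout_ms = timeout_ms;
}

/* Feed one raw sample (touch = 1 or 0) taken period_ms after the previous
 * one. A touch-up is reported only after timeout_ms without any
 * touch-down sample, so brief dropouts during a move are swallowed. */
static enum touch_event touch_filter_sample(struct touch_filter *f,
                                            int touch, int period_ms)
{
    if (touch) {
        f->ms_since_down = 0;           /* re-arm the release timer */
        if (!f->pressed) {
            f->pressed = 1;
            return TOUCH_DOWN;
        }
        return TOUCH_NONE;
    }
    if (f->pressed) {
        f->ms_since_down += period_ms;
        if (f->ms_since_down >= f->timeout_ms) {
            f->pressed = 0;             /* the timer "expired" */
            return TOUCH_UP;
        }
    }
    return TOUCH_NONE;
}
```

With a 10 ms sample period and a 30 ms timeout, a sequence such as down, up, up, down produces a single TOUCH_DOWN and no TOUCH_UP; only three consecutive up samples end the touch.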

I'm very happy that our touchscreen now has a touch feel closer to the top of the industry, such as the iPad. :)

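git: how to use git-am to apply a series of patches generated by git format-patch

Posted on 2010-04-21 | Category: Uncategorized

This post introduces git-am and format-patch. When working with git,
other people (vendors or other developers) will often send you a series of patches, typically with names like these: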

0001--JFFS2-community-fix-with-not-use-OOB.patch
0002--Community-patch-for-Fix-mount-error-in.patch
0003--partial-low-interrupt-latency-mode-for-ARM113.patch
0004--for-the-global-I-cache-invalidation-ARM11.patch
0005--1-arm-Add-more-cache-memory-types-macr.patch
0006--2-Port-imx-3.3.0-release-to-2.6.28.patch
0007--3-Add-MX25-support.patch
0008--Move-asm-arch-headers-to-linux-inc-dir.patch
0009--1-regulator-allow-search-by-regulator.patch

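They carry the commit log, author, date and so on. What you want is to bring these patches into your
repository, preferably together with their logs, which helps later maintenance. The traditional way to apply a patch is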

patch -p1 < 0001--JFFS2-community-fix-with-not-use-OOB.patch

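but that throws this useful information away.

Since these patches were evidently generated with git format-patch, git's own tools should handle them well.

git-am does exactly this.

Before using git-am, first run git am --abort once to discard any state left over from an earlier am, so that you start a fresh one.
Otherwise you will hit this error:
.git/rebase-apply still exists but mbox given.

git-am can apply a single file, all the patches in a directory, or the patches in your mailbox directory.

Here are two examples:

Say you have a code base, small-src, and your patch file is ~/patch/0001-trival-patch.patch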

cd small-src
git am ~/patch/0001-trival-patch.patch

If the patch applies cleanly, you can go and have a cup of tea.

If it fails, git reports an error, for example:

error: patch failed: android/mediascanner.cpp:452
error: android/mediascanner.cpp: patch does not apply

Then you need to read the patch and fix the offending file so that the patch applies.

Now say you have a pile of patches, named like the list above, in ~/patch-set/ (the path is up to you)

cd opencore
git am ~/patch-set/*.patch

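(git applies the patches one after another in file-name order.)
If all goes well, all your patches are in and you got lucky again.

More often than not it does not go smoothly. If git am hits a patch that fails to apply, it stops at
that patch and tells you which one failed.

For example, I have a file named file and two patches.
The content of file is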

the text

more text

The two patches are:

0001-add-line.patch:

From 48869ccbced494e05738090afa5a54f2a261df0f Mon Sep 17 00:00:00 2001
From: zhangjiejing <zhangjiejing@zhangjiejing-desktop.(none)>
Date: Thu, 22 Apr 2010 13:04:34 +0800
Subject: [PATCH 1/2] add line

---
 file |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/file b/file
index 067780e..685f0fa 100644
--- a/file
+++ b/file
@@ -3,3 +3,5 @@ file:
 some text

 more text
+
+add line
--
1.6.3.3

0002-change-line.patch:

From f756e1b3a87c216b7e0afea9d15badd033171578 Mon Sep 17 00:00:00 2001
From: zhangjiejing <zhangjiejing@zhangjiejing-desktop.(none)>
Date: Thu, 22 Apr 2010 13:05:19 +0800
Subject: [PATCH 2/2] change line

---
 file |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/file b/file
index 685f0fa..7af7852 100644
--- a/file
+++ b/file
@@ -1,6 +1,6 @@
 file:

-some text
+Change line text

 more text

--
1.6.3.3

Run
git am *.patch

to merge the patches. It fails with "Patch failed at 0001 add line". Looking at patch 0001,
it expects "some text" but file contains "the text", so we change that line
to "some text" in an editor,

vi file
git apply 0001-add-line.patch
git add file
git am --resolved

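After resolving a conflict you must use git add to tell git that it is resolved.

If you find the conflict cannot be resolved and want to undo the whole am, run git am --abort.
If you only want to skip this one patch, run git am --skip.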

Linux Kernel and Android suspend and resume (Chinese version)

Posted on 2010-04-18 | Category: Uncategorized

Table of Contents

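Introduction
Internationalization
Version information
A brief introduction to suspend
The Linux suspend flow
Android suspend

Introduction

Suspend/resume is a very important part of embedded Linux: an embedded device goes to sleep whenever it can to extend battery life. This article describes in detail how suspend/resume works in Linux and how Android hooks into the Linux mechanism.

Internationalization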

English Version: link
Chinese version: link

Author: zhangjiejing <kzjeef#gmail.com> Date: 2010-04-07, http://www.thinksrc.com

Version information

Linux Kernel: v2.6.28
Android: v2.0

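A brief introduction to suspend

In Linux, suspend consists of three main steps:

Freeze user-space processes and kernel tasks
Call the suspend callback of every registered device
- in the order the devices were registered
Suspend the core devices and put the CPU into its sleep state

Freezing means that the kernel sets every process in the process list to a stopped state and saves each process's context. When the processes are thawed they do not know they were ever frozen; they simply carry on executing.

How do you make Linux suspend? User space controls it by reading and writing the sysfs file /sys/power/state. For example,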
```
# echo standby > /sys/power/state
```
tells the system to suspend. You can also use
```
# cat /sys/power/state
```
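to find out which suspend modes the kernel supports.

The Linux suspend flow

Preparation and freezing processes

Once inside suspend_prepare(), the kernel allocates a virtual terminal for suspend to print messages on, broadcasts a notification that the system is about to suspend, disables the user-mode helper, and then calls suspend_freeze_processes() to freeze all processes, saving their current state. Some processes may refuse to freeze; if so, freezing fails, the function gives up and thaws everything it has just frozen.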

/**
 *      suspend_prepare - Do prep work before entering low-power state.
 *
 *      This is common code that is called for each state that we're entering.
 *      Run suspend notifiers, allocate a console and stop all processes.
 */
static int suspend_prepare(void)
{
        int error;
        unsigned int free_pages;

        if (!suspend_ops || !suspend_ops->enter)
                return -EPERM;

        pm_prepare_console();

        error = pm_notifier_call_chain(PM_SUSPEND_PREPARE);
        if (error)
                goto Finish;

        error = usermodehelper_disable();
        if (error)
                goto Finish;

        if (suspend_freeze_processes()) {
                error = -EAGAIN;
                goto Thaw;
        }

        free_pages = global_page_state(NR_FREE_PAGES);
        if (free_pages < FREE_PAGE_NUMBER) {
                pr_debug("PM: free some memory\n");
                shrink_all_memory(FREE_PAGE_NUMBER - free_pages);
                if (nr_free_pages() < FREE_PAGE_NUMBER) {
                        error = -ENOMEM;
                        printk(KERN_ERR "PM: No enough memory\n");
                }
        }
        if (!error)
                return 0;

 Thaw:
        suspend_thaw_processes();
        usermodehelper_enable();
 Finish:
        pm_notifier_call_chain(PM_POST_SUSPEND);
        pm_restore_console();
        return error;
}

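Suspending the peripherals

At this point all processes (including workqueues and kthreads) have stopped. A kernel task may have been holding a semaphore when it was frozen, so waiting for that semaphore in a device's suspend() can deadlock. Be very careful with lock/unlock in a driver's suspend() function; my advice is to design it so that suspend() never waits on a lock. Also, some log output cannot be printed during suspend, so problems here are very hard to debug.

The kernel then tries to free some memory.

Finally suspend_devices_and_enter() is called to suspend all the peripherals. If the platform has registered suspend_ops (usually defined and registered in the board-level code), suspend_ops->begin() is called here. Then device_suspend()->dpm_suspend() in drivers/base/power/main.c is called, which invokes each driver's suspend() callback in turn to suspend all devices.

Once all devices are suspended, suspend_ops->prepare() is called; it usually does some preparation for putting the board to sleep. Next, on multi-core systems, the non-boot CPUs are shut down; according to the code comments this avoids race conditions caused by the other CPUs. From then on only one CPU is running.

suspend_ops holds the board-level power management operations and is usually registered in arch/xxx/mach-xxx/pm.c.

Next suspend_enter() is called. It disables arch IRQs and calls device_power_down(), which invokes the suspend_late() callbacks. These are the last callbacks run before the system really goes to sleep, and are usually used for final checks. If the checks pass, all system devices and buses are suspended and suspend_ops->enter() is called to put the CPU into its low-power state. The system is now asleep, and code execution stops here.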

/**
 *      suspend_devices_and_enter - suspend devices and enter the desired system
 *                                  sleep state.
 *      @state:           state to enter
 */
int suspend_devices_and_enter(suspend_state_t state)
{
        int error, ftrace_save;

        if (!suspend_ops)
                return -ENOSYS;

        if (suspend_ops->begin) {
                error = suspend_ops->begin(state);
                if (error)
                        goto Close;
        }
        suspend_console();
        ftrace_save = __ftrace_enabled_save();
        suspend_test_start();
        error = device_suspend(PMSG_SUSPEND);
        if (error) {
                printk(KERN_ERR "PM: Some devices failed to suspend\n");
                goto Recover_platform;
        }
        suspend_test_finish("suspend devices");
        if (suspend_test(TEST_DEVICES))
                goto Recover_platform;

        if (suspend_ops->prepare) {
                error = suspend_ops->prepare();
                if (error)
                        goto Resume_devices;
        }

        if (suspend_test(TEST_PLATFORM))
                goto Finish;

        error = disable_nonboot_cpus();
        if (!error && !suspend_test(TEST_CPUS))
                suspend_enter(state);

        enable_nonboot_cpus();
 Finish:
        if (suspend_ops->finish)
                suspend_ops->finish();
 Resume_devices:
        suspend_test_start();
        device_resume(PMSG_RESUME);
        suspend_test_finish("resume devices");
        __ftrace_enabled_restore(ftrace_save);
        resume_console();
 Close:
        if (suspend_ops->end)
                suspend_ops->end();
        return error;

 Recover_platform:
        if (suspend_ops->recover)
                suspend_ops->recover();
        goto Resume_devices;
}

Resume

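If an interrupt or some other event wakes the system from sleep, execution continues from that point. Resume runs in the reverse order of suspend: the system devices and buses are woken first, then system interrupts are enabled, the non-boot CPUs that were shut down are brought back, and suspend_ops->finish() is called. suspend_devices_and_enter() then goes on to resume each device, re-enables the console, and finally calls suspend_ops->end().

Back in enter_state(), when suspend_devices_and_enter() returns the peripherals are awake but processes and tasks are still frozen. suspend_finish() is called to thaw them, send the notification that the system has left suspend, and restore the console.

That completes the whole suspend and resume cycle, and the system carries on running.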
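The ordering rule (suspend in registration order, resume in reverse) is easy to model in plain C. This is a toy sketch, not the driver core: the demo_* names, the fixed-size table and the log buffer are all assumptions made for illustration, and the rollback on failure is just one simple policy for the model.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 8

struct demo_dev {
    const char *name;
    int (*suspend)(struct demo_dev *);  /* returns 0 on success */
    void (*resume)(struct demo_dev *);
};

static struct demo_dev *devs[MAX_DEVS];
static size_t ndevs;
static char demo_log[64];               /* records the call order */

static void log_char(char c)
{
    size_t n = strlen(demo_log);

    if (n + 1 < sizeof(demo_log)) {
        demo_log[n] = c;
        demo_log[n + 1] = '\0';
    }
}

/* Sample callbacks: log the first letter of the device name,
 * lower case on suspend and upper case on resume. */
static int log_suspend(struct demo_dev *d)
{
    log_char(d->name[0]);
    return 0;
}

static void log_resume(struct demo_dev *d)
{
    log_char((char)toupper((unsigned char)d->name[0]));
}

static int demo_register(struct demo_dev *d)
{
    if (ndevs == MAX_DEVS)
        return -1;
    devs[ndevs++] = d;
    return 0;
}

/* Suspend in registration order. If one device fails, resume the ones
 * already suspended, in reverse, and report the error. */
static int demo_suspend_all(void)
{
    for (size_t i = 0; i < ndevs; i++) {
        int err = devs[i]->suspend(devs[i]);

        if (err) {
            while (i-- > 0)
                devs[i]->resume(devs[i]);
            return err;
        }
    }
    return 0;
}

/* Resume in the reverse of registration order. */
static void demo_resume_all(void)
{
    size_t i = ndevs;

    while (i-- > 0)
        devs[i]->resume(devs[i]);
}
```

Registering three devices named lcd, ts and mmc and running one suspend/resume cycle leaves "ltmMTL" in demo_log: the last device to go down is the first to come back.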

Android suspend

In a kernel carrying the Android patches, state_store() takes a different path and goes into request_suspend_state(), which lives in earlysuspend.c. All of this was added by Android; early suspend and late resume are described below.

Files involved:

linux_source/kernel/power/main.c
linux_source/kernel/power/earlysuspend.c
linux_source/kernel/power/wakelock.c

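Features

Early Suspend

Early suspend is a mechanism introduced by Android. It has been controversial upstream; I will not comment on that here. It kicks in when the display is turned off: devices tied to the display, such as the LCD backlight, the accelerometer and the touchscreen, are switched off, while the system may still be running (a wake lock is still held at this point) to process tasks such as scanning the files on the SD card. On an embedded device the backlight is a major power drain, which is why Android added this mechanism.

Late Resume

Late resume is the counterpart of early suspend. It runs once the kernel has finished waking up, and mainly wakes the devices that were put to sleep by early suspend.

Wake Lock

Wake locks play a central role in Android's power management. A wake lock is a kind of lock: as long as someone holds one, the system cannot suspend. Both user-space programs and the kernel can take one. A lock may or may not have a timeout; a lock with a timeout is released automatically when the time is up. When no lock is held any more, or the last one times out, the kernel starts the suspend machinery.

Android Suspend

When the user writes mem or standby to /sys/power/state, state_store() is called. Android calls request_suspend_state() at this point, whereas standard Linux goes into enter_state(). If the request is to suspend, the early_suspend work is queued and the system enters the early suspend state.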

void request_suspend_state(suspend_state_t new_state)
{
        unsigned long irqflags;
        int old_sleep;

        spin_lock_irqsave(&state_lock, irqflags);
        old_sleep = state & SUSPEND_REQUESTED;
        if (debug_mask & DEBUG_USER_STATE) {
                struct timespec ts;
                struct rtc_time tm;
                getnstimeofday(&ts);
                rtc_time_to_tm(ts.tv_sec, &tm);
                pr_info("request_suspend_state: %s (%d->%d) at %lld "
                        "(%d-%02d-%02d %02d:%02d:%02d.%09lu UTC)\n",
                        new_state != PM_SUSPEND_ON ? "sleep" : "wakeup",
                        requested_suspend_state, new_state,
                        ktime_to_ns(ktime_get()),
                        tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
                        tm.tm_hour, tm.tm_min, tm.tm_sec, ts.tv_nsec);
        }
        if (!old_sleep && new_state != PM_SUSPEND_ON) {
                state |= SUSPEND_REQUESTED;
                queue_work(suspend_work_queue, &early_suspend_work);
        } else if (old_sleep && new_state == PM_SUSPEND_ON) {
                state &= ~SUSPEND_REQUESTED;
                wake_lock(&main_wake_lock);
                queue_work(suspend_work_queue, &late_resume_work);
        }
        requested_suspend_state = new_state;
        spin_unlock_irqrestore(&state_lock, irqflags);
}

Early Suspend

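early_suspend() first checks that the requested state is still suspend, in case the request was cancelled in the meantime (user processes are still running at this point). If it was cancelled, the function simply returns. Otherwise it calls every callback registered for early suspend, syncs the file systems, and then releases main_wake_lock. This wake lock has no timeout; as long as it is held, the system cannot suspend.

static void early_suspend(struct work_struct *work)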
{
        struct early_suspend *pos;
        unsigned long irqflags;
        int abort = 0;

        mutex_lock(&early_suspend_lock);
        spin_lock_irqsave(&state_lock, irqflags);
        if (state == SUSPEND_REQUESTED)
                state |= SUSPENDED;
        else
                abort = 1;
        spin_unlock_irqrestore(&state_lock, irqflags);

        if (abort) {
                if (debug_mask & DEBUG_SUSPEND)
                        pr_info("early_suspend: abort, state %d\n", state);
                mutex_unlock(&early_suspend_lock);
                goto abort;
        }

        if (debug_mask & DEBUG_SUSPEND)
                pr_info("early_suspend: call handlers\n");
        list_for_each_entry(pos, &early_suspend_handlers, link) {
                if (pos->suspend != NULL)
                        pos->suspend(pos);
        }
        mutex_unlock(&early_suspend_lock);

        if (debug_mask & DEBUG_SUSPEND)
                pr_info("early_suspend: sync\n");

        sys_sync();
abort:
        spin_lock_irqsave(&state_lock, irqflags);
        if (state == SUSPEND_REQUESTED_AND_SUSPENDED)
                wake_unlock(&main_wake_lock);
        spin_unlock_irqrestore(&state_lock, irqflags);
}

Late Resume

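By the time all the kernel-side resume work is done, user processes are running again. The wakeup usually has one of the following causes:

Incoming call

On an incoming call the modem sends a command to rild, which notifies WindowManager that a call needs handling. This results in a remote call to PowerManagerService, which writes "on" to /sys/power/state to run late resume on the devices, for example to light up the screen.

User key press

Key events are delivered to WindowManager, which handles them. There are a few cases: if the key is not a wake key (a key that is allowed to wake the system), WindowManager releases its wake lock so that the system goes back to sleep; if it is a wake key, WindowManager calls into PowerManagerService to run late resume.
Late resume wakes, in turn, the devices that were put to sleep by early suspend.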

static void late_resume(struct work_struct *work)
{
        struct early_suspend *pos;
        unsigned long irqflags;
        int abort = 0;

        mutex_lock(&early_suspend_lock);
        spin_lock_irqsave(&state_lock, irqflags);
        if (state == SUSPENDED)
                state &= ~SUSPENDED;
        else
                abort = 1;
        spin_unlock_irqrestore(&state_lock, irqflags);

        if (abort) {
                if (debug_mask & DEBUG_SUSPEND)
                        pr_info("late_resume: abort, state %d\n", state);
                goto abort;
        }
        if (debug_mask & DEBUG_SUSPEND)
                pr_info("late_resume: call handlers\n");
        list_for_each_entry_reverse(pos, &early_suspend_handlers, link)
                if (pos->resume != NULL)
                        pos->resume(pos);
        if (debug_mask & DEBUG_SUSPEND)
                pr_info("late_resume: done\n");
abort:
        mutex_unlock(&early_suspend_lock);
}
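The state checks in request_suspend_state(), early_suspend() and late_resume() are easier to follow as a tiny state machine without the locking and logging. The sketch below is a userspace model only: the demo_* functions are invented, and the bit values given to SUSPEND_REQUESTED and SUSPENDED are an assumption (the listings above use the names but do not show their definitions).

```c
/* Assumed bit values; only the names appear in the listings above. */
enum {
    SUSPEND_REQUESTED = 0x1,
    SUSPENDED = 0x2,
    SUSPEND_REQUESTED_AND_SUSPENDED = SUSPEND_REQUESTED | SUSPENDED
};

enum demo_work { WORK_NONE, WORK_EARLY_SUSPEND, WORK_LATE_RESUME };

/* Models request_suspend_state(): decides which work item to queue. */
static enum demo_work demo_request(int *state, int want_sleep)
{
    int old_sleep = *state & SUSPEND_REQUESTED;

    if (!old_sleep && want_sleep) {
        *state |= SUSPEND_REQUESTED;
        return WORK_EARLY_SUSPEND;
    }
    if (old_sleep && !want_sleep) {
        *state &= ~SUSPEND_REQUESTED;
        return WORK_LATE_RESUME;
    }
    return WORK_NONE;
}

/* Models the check at the top of early_suspend(): returns 1 if the
 * handlers would run, 0 if the request was cancelled in the meantime. */
static int demo_early_suspend(int *state)
{
    if (*state != SUSPEND_REQUESTED)
        return 0;
    *state |= SUSPENDED;
    return 1;
}

/* Models the check at the top of late_resume(): it only proceeds when
 * the devices are suspended and the suspend request has been withdrawn. */
static int demo_late_resume(int *state)
{
    if (*state != SUSPENDED)
        return 0;
    *state &= ~SUSPENDED;
    return 1;
}
```

The interesting case is a request that is cancelled before the work runs: demo_request(&st, 1) followed immediately by demo_request(&st, 0) leaves the state at 0, so the queued early suspend aborts instead of switching the devices off.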

Wake Lock

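Now let's look at how the wake lock mechanism works; wakelock.c is the file to focus on.

A wake lock is either locked or unlocked. There are two ways to lock it. One is a permanent lock, which is never released unless you release it explicitly, so use it with great care. The other is a timed lock, which keeps the system awake for a period and is released automatically when the period is over.

There are two types of lock:

WAKE_LOCK_SUSPEND: prevents the system from suspending
WAKE_LOCK_IDLE: does not affect suspend; I am not sure what it is for.

The wake lock code can start suspend() directly from three places:

In wake_unlock(): if no other wake lock is held after the unlock, suspend starts.
In the timer callback: when a timeout expires, the callback checks for other wake locks and, if there are none, puts the system to sleep.
In wake_lock(): after taking a wake lock it checks again whether any lock is held. I think this check is unnecessary; it would be better to make the locking operation atomic than to add redundant checks, which can still miss cases.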
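The rule "suspend only when no lock is held and every timed lock has expired" can be captured in a few lines. This is a deliberately simplified model, not wakelock.c: time is a plain tick counter passed in by the caller, there is no locking, and the wl_* names are made up for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define WL_NO_TIMEOUT (-1L)

struct wl {
    bool held;
    long expires;   /* tick at which the lock auto-releases, or WL_NO_TIMEOUT */
};

/* Permanent lock: stays held until wl_unlock() is called. */
static void wl_lock(struct wl *l)
{
    l->held = true;
    l->expires = WL_NO_TIMEOUT;
}

/* Timed lock: released automatically 'timeout' ticks after 'now'. */
static void wl_lock_timeout(struct wl *l, long now, long timeout)
{
    l->held = true;
    l->expires = now + timeout;
}

static void wl_unlock(struct wl *l)
{
    l->held = false;
}

/* Returns true if any lock is still active at tick 'now'. Expired timed
 * locks are released as a side effect. When this returns false the
 * system would be allowed to suspend. */
static bool wl_any_active(struct wl *locks, size_t n, long now)
{
    for (size_t i = 0; i < n; i++) {
        if (!locks[i].held)
            continue;
        if (locks[i].expires != WL_NO_TIMEOUT && now >= locks[i].expires) {
            locks[i].held = false;      /* timed out */
            continue;
        }
        return true;
    }
    return false;
}
```

In this model the three trigger points listed above all reduce to "call wl_any_active() after the state changed": after an unlock, when a timeout fires, and after a lock.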

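A very useful trick when debugging wake locks is

echo 15 > /sys/module/wakelock/parameters/debug_mask

This makes the wakelock driver print every wake lock operation to the console, which helps a lot with "why won't it suspend" problems.

Suspend

When the wake lock code triggers suspend, the suspend() function in wakelock.c is called. It first syncs the file systems and then calls pm_suspend(requested_suspend_state); pm_suspend() in turn calls enter_state() to enter the Linux suspend flow.

static void suspend(struct work_struct *work)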
{
        int ret;
        int entry_event_num;

        if (has_wake_lock(WAKE_LOCK_SUSPEND)) {
                if (debug_mask & DEBUG_SUSPEND)
                        pr_info("suspend: abort suspend\n");
                return;
        }

        entry_event_num = current_event_num;
        sys_sync();
        if (debug_mask & DEBUG_SUSPEND)
                pr_info("suspend: enter suspend\n");
        ret = pm_suspend(requested_suspend_state);
        if (current_event_num == entry_event_num) {
                wake_lock_timeout(&unknown_wakeup, HZ / 2);
        }
}

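Differences between Android and standard Linux suspend

pm_suspend() does call enter_state() to enter the standard Linux suspend flow, but there are still some differences:

When freezing processes, Android first checks for wake locks and only freezes the processes if there are none. Someone may have taken a wake lock between the start of suspend and the freeze; in that case the freeze is aborted.
In suspend_late() there is one final check for wake locks, which could be held by a process that takes and releases a wake lock quickly. If one is found, an error is returned and the whole suspend is abandoned.

If pm_suspend() succeeds, the system is suspended. To see the log output of the suspend and resume path, add "no_console_suspend" to the kernel command line.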

Config Linux DMA zone memory

发表于 2010-04-11 | 分类于未分类

This post is about the size of the Linux DMA zone. It is set by CONFIG_DMA_ZONE_SIZE in the kernel .config; open your .config to see the current value. How does the kernel use it?

The arch memory.h header uses a function you define, arch_adjust_zones(), to adjust the global zone_size[] array to your custom memory zone information.

files:

arch/arm/include/asm/memory.h : arch_adjust_zones

arch/arm/mach-xxx/include/mach/memory.h arch_adjust_zones
