[PATCH 0/9] ia64: VIRT_CPU_ACCOUNTING (accurate cpu time accounting)

From: Hidetoshi Seto <seto.hidetoshi_at_jp.fujitsu.com>
Date: 2007-10-16 23:31:50
Hi, ia64 folks.

Here is a set of patches to implement VIRT_CPU_ACCOUNTING for ia64,
which enables more accurate cpu time accounting.

[1/9] ia64_add_config_virt_cpu_accounting.patch
[2/9] ia64_expand_ia64_cputime_h.patch
[3/9] ia64_cputime_to_nsec.patch
[4/9] ia64_self_update_process_times.patch
[5/9] ia64_acct_vars.patch
[6/9] ia64_acct_gate_on_switch.patch
[7/9] ia64_acct_gate_on_entry.patch
[8/9] ia64_acct_gate_on_leave.patch
[9/9] ia64_acct_get_vtime.patch

Cpu time accounting is the mechanism that determines how long
the cpus are used for a particular purpose, and also how long a
thread uses a cpu (the values reported as stime and utime).

Currently, cpu time accounting on ia64 (and many other archs)
is based on sampling at each tick (timer interrupt).
If a thread is running in kernel mode at a timer interrupt, the
accounting increments the stime of the thread, assuming that
the thread has consumed cpu time in kernel mode from the last tick
to the present. If the thread is the swapper, the accounting
considers the cpu to have been idle since the last tick.

This assumption that the running thread did not change since the
last tick is not always true, and is mostly false on modern
high-speed machines.
If the stime of a thread has the value 100, people tend to imagine
that "this thread ran for 100/HZ sec in kernel mode". However, the
true meaning of the value is only "this thread was interrupted by
a tick 100 times while it was running in kernel mode".

+----+----+----+----+----+----+-> time (+ = tick)
........SSUUUSSSUUSSUSS........ thread1 ([stime,utime] = [1,2])
.SSUUUUSSS.SSUUSUSS...SSUUUSSS. thread2 ([stime,utime] = [1,2])
........SUS...SUS..SUS......... thread3 ([stime,utime] = [1,2])
(Note that all 3 of these threads actually spend more cpu time in
 system mode than in user mode.)
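
The diagram above can be replayed with a small user-space model.
This is only a sketch, not kernel code: it assumes one character
per time unit and a tick every 5 units (matching the '+' marks),
and the names TICK_INTERVAL, sampled() and exact() are made up
for illustration.

```python
# Hypothetical model of the diagram: one character per time unit,
# 'S' = kernel (system) mode, 'U' = user mode, '.' = thread not running.
# A tick is assumed to fire every 5 units, matching the '+' marks.
TICK_INTERVAL = 5

TIMELINES = {
    "thread1": "........SSUUUSSSUUSSUSS........",
    "thread2": ".SSUUUUSSS.SSUUSUSS...SSUUUSSS.",
    "thread3": "........SUS...SUS..SUS.........",
}


def sampled(timeline):
    """Tick sampling: charge one whole tick to the mode seen at each tick."""
    at_ticks = timeline[::TICK_INTERVAL]
    return at_ticks.count("S"), at_ticks.count("U")


def exact(timeline):
    """Count every time unit actually spent in each mode."""
    return timeline.count("S"), timeline.count("U")


if __name__ == "__main__":
    for name, timeline in TIMELINES.items():
        print(name,
              "sampled [stime,utime] =", sampled(timeline),
              "actual units [system,user] =", exact(timeline))
```

Under these assumptions every timeline samples as [1,2], while
the per-unit counts are [9,6], [15,10] and [6,3] respectively,
i.e. more system time than user time in all three cases.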

Therefore, more accurate cpu time accounting is required to know
how long a thread actually uses the cpus, what a particular cpu
spends its time on, and so on.

VIRT_CPU_ACCOUNTING is a kernel config option that the s390 and
powerpc archs already have.  Turning this option on changes the
cpu time accounting on these archs from a tick-sampling based
mechanism to a state-transition based one.

State-transition based accounting is done by reading the time
(the cycle counter in the processor) at every state-transition
point, such as kernel entry/exit, interrupt, softirq, etc.
The difference between two consecutive points is the actual time
consumed in that state. This value is clearly more accurate than
that of tick-based accounting.
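
As a rough illustration of the idea (a user-space sketch, not the
code in these patches), the bookkeeping can be modeled as below.
The class VirtAccount and its fields are hypothetical, and the
"now" argument is a stand-in for a cycle counter reading taken at
each transition point.

```python
class VirtAccount:
    """Hypothetical per-thread accounting state (not the patches' layout)."""

    def __init__(self, now, state="user"):
        self.totals = {}      # accumulated counter units per state
        self.state = state    # state the thread is currently in
        self.stamp = now      # counter value at the last transition

    def transition(self, now, new_state):
        """Charge the time since the last transition to the state being left."""
        delta = now - self.stamp
        self.totals[self.state] = self.totals.get(self.state, 0) + delta
        self.state = new_state
        self.stamp = now


if __name__ == "__main__":
    acct = VirtAccount(now=0, state="user")
    acct.transition(30, "system")   # kernel entry: 30 units go to user
    acct.transition(100, "user")    # kernel exit: 70 units go to system
    acct.transition(120, "system")  # kernel entry: 20 more units go to user
    print(acct.totals)
```

Each interval is charged in full to the state that was actually
running, so nothing depends on where a tick happens to land.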

My patches here port VIRT_CPU_ACCOUNTING from these IBM archs.
Some performance impact is expected, but as far as my brief tests
show, it looks like nothing much to worry about.

I'd appreciate it if you could try this option and send me your
feedback. Ideas for optimization would be especially welcome.


Received on Tue Oct 16 23:28:16 2007
