I searched online for answers, but none of them were good enough for me. Most of them say load average equals the run queue length, but observations on an I/O-busy NFS server disagree with that, so I checked the source code.
TL;DR
Load average can be considered the sum of running processes (the run queue size) and uninterruptible processes, averaged with exponential decay over a specific time period. This applies to at least kernel versions 2.6.x and 3.x (I only checked the source code of those two versions).
BTW, the uptime command only reads and prints the load average data from /proc/loadavg.
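As a rough illustration of how little work is involved in reading these numbers, here is a minimal C sketch that parses the first three fields of /proc/loadavg. The function names parse_loadavg and read_loadavg are my own, not taken from uptime's source, and the sketch assumes a Linux system where /proc is mounted.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the first three fields (1, 5 and 15 minute averages) of a
 * /proc/loadavg-style line. Returns 0 on success, -1 if the line does
 * not start with three numbers. Hypothetical helper for illustration. */
int parse_loadavg(const char *line, double avg[3])
{
    assert(line != NULL && avg != NULL);
    if (sscanf(line, "%lf %lf %lf", &avg[0], &avg[1], &avg[2]) != 3)
        return -1;
    return 0;
}

/* Read /proc/loadavg itself (Linux only). Returns 0 on success,
 * -1 if the file cannot be opened, read, or parsed. */
int read_loadavg(double avg[3])
{
    char buf[128];
    FILE *f = fopen("/proc/loadavg", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_loadavg(buf, avg);
}
```

Calling read_loadavg() with a three-element array and printing the result with printf("%.2f %.2f %.2f\n", ...) should give the same three numbers that uptime shows.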
A (slightly) long but official explanation, copied from the Linux kernel source code (v3.14-rc6):
/*
 * Global load-average calculations
 *
 * We take a distributed and async approach to calculating the global load-avg
 * in order to minimize overhead.
 *
 * The global load average is an exponentially decaying average of nr_running +
 * nr_uninterruptible.
 *
 * Once every LOAD_FREQ:
 *
 *   nr_active = 0;
 *   for_each_possible_cpu(cpu)
 *       nr_active += cpu_of(cpu)->nr_running + cpu_of(cpu)->nr_uninterruptible;
 *
 *   avenrun[n] = avenrun[0] * exp_n + nr_active * (1 - exp_n)
 *
 * Due to a number of reasons the above turns in the mess below:
 *
 *  - for_each_possible_cpu() is prohibitively expensive on machines with
 *    serious number of cpus, therefore we need to take a distributed approach
 *    to calculating nr_active.
 *
 *        \Sum_i x_i(t) = \Sum_i x_i(t) - x_i(t_0) | x_i(t_0) := 0
 *                      = \Sum_i { \Sum_j=1 x_i(t_j) - x_i(t_j-1) }
 *
 *    So assuming nr_active := 0 when we start out -- true per definition, we
 *    can simply take per-cpu deltas and fold those into a global accumulate
 *    to obtain the same result. See calc_load_fold_active().
 *
 *    Furthermore, in order to avoid synchronizing all per-cpu delta folding
 *    across the machine, we assume 10 ticks is sufficient time for every
 *    cpu to have completed this task.
 *
 *    This places an upper-bound on the IRQ-off latency of the machine. Then
 *    again, being late doesn't loose the delta, just wrecks the sample.
 *
 *  - cpu_rq()->nr_uninterruptible isn't accurately tracked per-cpu because
 *    this would add another cross-cpu cacheline miss and atomic operation
 *    to the wakeup path. Instead we increment on whatever cpu the task ran
 *    when it went into uninterruptible state and decrement on whatever cpu
 *    did the wakeup. This means that only the sum of nr_uninterruptible over
 *    all cpus yields the correct result.
 *
 *  This covers the NO_HZ=n code, for extra head-aches, see the comment below.
 */
The calculation uses the following formula:
active = sum of running and uninterruptible processes
then:
active = active > 0 ? active * FIXED_1 : 0;
avenrun[0] = calc_load(avenrun[0], EXP_1, active); /* 1 min loadavg */
avenrun[1] = calc_load(avenrun[1], EXP_5, active); /* 5 min loadavg */
avenrun[2] = calc_load(avenrun[2], EXP_15, active); /* 15 min loadavg */
On 3.x:
static unsigned long
calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
        load *= exp;
        load += active * (FIXED_1 - exp);
        load += 1UL << (FSHIFT - 1);
        return load >> FSHIFT;
}
On 2.6.x:
static unsigned long
calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
        load *= exp;
        load += active * (FIXED_1 - exp);
        return load >> FSHIFT;
}
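The only difference between the two versions is the extra line that adds half a unit before the shift, so 3.x rounds to nearest while 2.6.x truncates. A small sketch makes that visible. The names calc_load_trunc and calc_load_round are mine, and the constants are repeated here only so the block compiles on its own.

```c
#include <assert.h>

#define FSHIFT  11             /* nr of bits of precision */
#define FIXED_1 (1 << FSHIFT)  /* 1.0 as fixed-point */
#define EXP_1   1884           /* 1-minute decay factor as fixed-point */

/* 2.6.x behaviour: the shift simply drops the fractional bits. */
unsigned long calc_load_trunc(unsigned long load, unsigned long exp,
                              unsigned long active)
{
    assert(exp <= (unsigned long)FIXED_1);
    load *= exp;
    load += active * (FIXED_1 - exp);
    return load >> FSHIFT;
}

/* 3.x behaviour: add 0.5 (in fixed-point) first, so the shift rounds. */
unsigned long calc_load_round(unsigned long load, unsigned long exp,
                              unsigned long active)
{
    assert(exp <= (unsigned long)FIXED_1);
    load *= exp;
    load += active * (FIXED_1 - exp);
    load += 1UL << (FSHIFT - 1);
    return load >> FSHIFT;
}
```

For example, decaying a fixed-point load of 100 with no active tasks gives 100 * 1884 / 2048, which is about 91.99: the truncating version returns 91 and the rounding version returns 92.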
The constants are defined the same way in both versions:
#define FSHIFT    11           /* nr of bits of precision */
#define FIXED_1   (1<<FSHIFT)  /* 1.0 as fixed-point */
#define LOAD_FREQ (5*HZ+1)     /* 5 sec intervals */
#define EXP_1     1884         /* 1/exp(5sec/1min) as fixed-point */
#define EXP_5     2014         /* 1/exp(5sec/5min) */
#define EXP_15    2037         /* 1/exp(5sec/15min) */
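To get a feel for what these constants do, the following sketch feeds a constant number of active tasks into the 3.x calc_load() once per 5-second sample and formats the result the way a fixed-point value is usually printed. The helpers simulate and format_load are hypothetical names of mine. LOAD_INT and LOAD_FRAC follow the same idea as the kernel macros of that name, but the sketch truncates when printing and may differ from the kernel's own output in the last digit.

```c
#include <assert.h>
#include <stdio.h>

#define FSHIFT  11
#define FIXED_1 (1 << FSHIFT)
#define EXP_1   1884
#define EXP_5   2014
#define EXP_15  2037

/* Integer part and two-digit fraction of a fixed-point value. */
#define LOAD_INT(x)  ((x) >> FSHIFT)
#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1 - 1)) * 100)

/* Same arithmetic as the 3.x calc_load() quoted above. */
unsigned long calc_load(unsigned long load, unsigned long exp,
                        unsigned long active)
{
    load *= exp;
    load += active * (FIXED_1 - exp);
    load += 1UL << (FSHIFT - 1);
    return load >> FSHIFT;
}

/* Apply `steps` consecutive 5-second samples with a constant number of
 * active (running + uninterruptible) tasks. */
unsigned long simulate(unsigned long load, unsigned long exp,
                       unsigned long nr_active, int steps)
{
    unsigned long active = nr_active * FIXED_1;
    int i;

    assert(steps >= 0);
    for (i = 0; i < steps; i++)
        load = calc_load(load, exp, active);
    return load;
}

/* Write a fixed-point load as "I.FF" into buf (truncating). */
void format_load(unsigned long load, char *buf, size_t len)
{
    snprintf(buf, len, "%lu.%02lu", LOAD_INT(load), LOAD_FRAC(load));
}
```

Starting from an idle system with one task permanently active, simulate(0, EXP_1, 1, 12) covers one minute of samples and lands at roughly 0.63, that is 1 - 1/e of the final value. The 5- and 15-minute averages move much more slowly over the same minute.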