The V(R) I/O Scheduler

Introduction

In 1987, Geist and Daniel published a paper describing a new I/O scheduling algorithm they called V(R). The parameter R, a real in the range [0,1], denotes the penalty for a reverse seek (that is, changing the current head direction), such that the effective seek distance for reversing the head direction is the actual seek distance plus R times the size of the disk. The request with the nearest effective distance to the last request is chosen next, thus V(0) is SSTF (shortest seek time first), and V(1) is SCAN.

By tuning this parameter, one can obtain different scheduling behaviours on a continuum between SSTF, which typically provides good throughput but is particularly prone to starvation, and SCAN, which is inherently free from starvation. Geist shows that V(0.2) provides excellent performance; better than SCAN, SSTF and V(0.1).

Implementation

The behaviour of our V(R) implementation differs somewhat from the published algorithm. The R parameter (which we have called rev_penalty) has been redefined to be an integer in the range [1,∞) which is multiplied with the sector distance of a head-reversing seek to get the effective distance. Thus, rev_penalty=1 means that forward and reverse seeks are equally expensive and the nearest of the two will be chosen (i.e. SSTF). As it increases, reversing the head direction becomes more expensive, until pure SCAN behaviour is reached. In practice, the range of rev_penalty is limited by the size of an int variable, so rev_penalty=0 can be used as an alias for SCAN.

V(R) is similar to, and based on, Linux's deadline I/O scheduler. It implements request deadlines which try to limit starvation, but provide no hard guarantee of request latency. It also merges and batches requests in a similar manner, but unlike deadline, read and write requests are issued together. This means there is no preference to reads over writes. Thus, higher throughput can be achieved at the cost of interactive performance. Deadlines can be disabled by setting the appropriate tunables to 0 (see below).

Applications

We envisage two main applications of V(R). SCAN and SSTF are commonly used algorithms for benchmarking and comparison purposes. V(R) can provide pure implementations of these algorithms in a single scheduler by setting rev_penalty appropriately and disabling deadlines. Thus, it is particularly well suited to Linux-based research work.

Secondly, V(R) could be used for situations where maximal throughput it the primary concern, such as batch processing systems, where per-request latency is not important. By scheduling read and write requests together and setting sufficiently high deadlines, V(R) can theoretically produce a better schedule than deadline, AS or CFQ.

Getting V(R)

Patch for Linux 2.6.24+: http://www.cse.unsw.edu.au/~aaronc/iosched/patches/vr-iosched.patch

To build a V(R)-enabled kernel, set the option CONFIG_IOSCHED_VR=y. To enable it on a particular disk at run-time, execute:

echo vr > /sys/block/<device>/queue/scheduler

where <device> is the disk name (e.g. sda).

Note: this I/O scheduler is currently experimental and should not be used on production systems.

Tunables

V(R) exports 4 tunables, which can be read and written through the files in /sys/block/<device>/queue/iosched/.

Tunable name

Default value

Description

rev_penalty

10

Penalty for reversing head direction.

fifo_batch

16

Number of requests to issue before checking for expired requests.

sync_expire

500 ms

Deadline for synchronous requests.

async_expire

5 s

Deadline for asynchronous requests.

A rev_penalty of 1 is SSTF. As it increases, the scheduling behaviour approaches SCAN. rev_penalty=0 can be used as shorthand for SCAN.

fifo_batch is used to control the time between issuing expired requests. Lowering this value will result in tighter deadlines at the cost of throughput, while increasing it will improve throughput while increasing latency variation.

sync_expire and async_expire control the expiry times for synchronous and asynchronous requests, respectively. Reads or synchronous writes (i.e. writes using O_DIRECT or O_SYNC) are treated as synchronous requests; all others are considered asynchronous. Deadlines can be disabled entirely by setting sync_expire=async_expire=0.

Benchmarks

Coming soon.

Contact

Please send comments, questions and especially bug reports to Aaron Carroll <aaronc@gelato.unsw.edu.au>.

IA64wiki: IOScheduling/VRscheduler (last edited 2009-12-10 03:14:06 by localhost)

Gelato@UNSW is sponsored by
the University of New South Wales National ICT Australia The Gelato Federation Hewlett-Packard Company Australian Research Council
Please contact us with any questions or comments.