Alan Cox on Code Auditing

Alan Cox recently gave a talk at linux.conf.au 2003 on code auditing, something he has been doing a lot of recently. His work was mainly concerned with Linux device drivers, but the lessons can be universally applied. Below is an edited form of my notes, I am not sure if the original slides are available.

Why audit code?

We all make mistakes, and are bad at seeing our own errors

Good practice changes over time and (especially with Linux) API changes over time

Security Auditing

Fixing up "unforeseen" problems (but they shouldn't have been). For example, "hotplug - who cares?" then cardbus comes, "64 bit - who cares?" then AMD Hammer, IA64.

Alan's first rule of Open Source

"Criticism is Infinitely Scalable"

C traps

Maths overflows

Signed overflow undefined. This is one that has just hit the Linux Kernel when gcc changed its behaviour. Added by Ian -- this is a post from Linus about the problem

 From torvalds@penguin.transmeta.com Tue Feb  6 09:36:41 2001
 From: torvalds@penguin.transmeta.com (Linus Torvalds)
 Newsgroups: comp.os.linux.development.apps
 Subject: Re: Overflow detection in gcc-produced code?

 In article <slrn97tsgd.pu3.BobT@linus.cs.queensu.ca>,
 Bob Tennent <rdtennent@hotmail.com> wrote:
 >Is there any way to detect that an overflow has occurred in x86 integer        
 >arithmetic code produced by gcc?                                               

        unsigned long x, y, sum;
        int overflow = 0;

        sum = x+y;
        if (sum < x)
                overflow = 1;

 Works portably in C (but only with "unsigned" types that are guaranteed
 to have twos-complement behaviour and well-defined overflow semantics). 
 And good compilers will notice that "overflow" is really just the carry
 flag, and optimize it. I don't remember if gcc counts as "good" in this
 case, but I doubt it ;)

 For signed integer arithmetic, you're basically screwed.  A compiler is
 in theory allowed to do anything at all for signed overflows, so you
 cannot portably test the result at all: you have to detect the overflow
 before the operation happens.  In practice, of course, all x86 compilers
 will just result in the "proper" overflow value, but you still have to
 test whether an overflow occurred. 

 At some point it probably becomes easier to do it with inline asm. 
 Especially if it needs to be efficient.  One (inefficient) way might be
 to do the calculation in "long long" and do a

        long long x, y, sum;
        in toverflow = 0;

        sum = x + y;
        if (sum != (long)sum)
                overflow = 1;

 but if you look at the actual code it generates, you'll be really sad.

                Linus

Drivers accepting values into kernel space are a major source of overflows (deliberate or not), e.g., x * y (x or y untrusted) overflows and invariably buffer overflow ensues. Another common problem is very carefully adding 1 to null terminate a string, user passes in a length of 0xfffffff which, +1 = zero and a buffer overrun.

Be very careful with structures as they are padded by the compiler. __attribute__((packed)) should be used with caution because on some architectures (e.g., IA64) these end up with unaligned access faults, which are slow and annoying.

Endianness, of course. Most basic porting problem but still is constantly missed.

Object Lifespan

For any given object you should know

For example, sit down on some kernels and plug and unplug a USB device constantly. Count how many times you can do it before you crash. This is simply because objects are not kept track of, or because of subtle race conditions with knowing when you can finish with your objects. Alan suggests of the many ways to tackle this sort of problem, reference counting (as in network code) is the best way.

Make common creation functions -- otherwise people forget to fill in bits of the object, it gets passed to another function that assumes something has been initalized properly which hasn't. Slab poisoning on Linux helps you with this sort of problem, and when it was introduced found a huge number of problems.

Think about the fields you are using -- the 2.0 alpha floppy driver only worked every other month, as the timer value was 32 bit! You might end up with software that literally only works once in a blue moon.

Make sure you know all resources freed on all paths -- small leak * year == big leak

portability

Use u8, u16, u32 etc. Short/Long suck!

Use unsigned long for Memory and I/O resources. This makes sure you are portable to other arch's like Sparc etc.

Check that ioremap memory is used with readb/writeb, not dereferenced directly and not used with memcpy/memset

dma

Beware code using virt_to_bus with DMA -- not on non-pc hardware

DMA not allowed for vmalloced memory, high pages, the stack - archs like ppc64 have a virtual stack. Also, you end up with bugs where DMA wasn't stopped and would overwrite your stack constantly (and while you are in the debugger). Once this happened with a TV driver, and someone could not figure out what was happening until they noticed a large number of repeating bytes on their stack and decided to decode it, and got a CNN logo ...

There is a DMA API now, use it.

PCI

PCI is hotplug now, 2.5 important

PCI is message based, interrupts and PCI data transfer are async and you never quite know what is coming next. You need to turn off interrupts, free resources and free irq before freeing resources else you fail the test above about always knowing when you are complete.

PCI devices may have 64 bit addressing (see unsigned long note above)

Resources are Finite

System calls - don't allocate user controlled amounts of memory! Don't queue undefined number of buffers of user data. These are fatal istakes.

Interrupt handing - make sure time in handler is bound. Beware of infinite memory allocations.

The stack - it's small. Typically 7k, 4k in interrupts. Use it wisely. This means no large recursive calls.

== Deadlocks ==

Spinlocks are not recursive!

Don't ever sleep with locks : 2.5 checks for this. copy to/from user space sleeps, but in 2.4 it almost always works anyway. Not so in 2.5

Don't add mdelay() -- one driver did this in an interrupt handler and could have potentially shut the machine down for 20 minutes.

IRQ/syscall deadlocks : can't disable irq holding spinlocks taken by irq handler. dead lock!

Termination Order

Ensure DMA is finished before freeing -- otherwise you end up overwriting data.

Use complete_and_exit with threads

Auditing Code Requires Understanding It

Kernel-doc (like java doc) puts comments right into code. It is good, use it. But must be kept up to date.

Submitting changes

Check your formatting. Seems silly but will stop patches getting taken.

IA64wiki: alan cox on code auditing (last edited 2004-04-26 02:04:57 by PeterChubb)

Gelato@UNSW is sponsored by
the University of New South Wales National ICT Australia The Gelato Federation Hewlett-Packard Company Australian Research Council
Please contact us with any questions or comments.