Alan Cox on Code Auditing
Alan Cox recently gave a talk at linux.conf.au 2003 on code auditing, something he has been doing a lot of recently. His work was mainly concerned with Linux device drivers, but the lessons can be universally applied. Below is an edited form of my notes, I am not sure if the original slides are available.
Why audit code?
We all make mistakes, and are bad at seeing our own errors
Good practice changes over time and (especially with Linux) API changes over time
Security Auditing
Fixing up "unforeseen" problems (but they shouldn't have been). For example, "hotplug - who cares?" then cardbus comes, "64 bit - who cares?" then AMD Hammer, IA64.
Alan's first rule of Open Source
"Criticism is Infinitely Scalable"
C traps
Maths overflows
Signed overflow undefined. This is one that has just hit the Linux Kernel when gcc changed its behaviour. Added by Ian -- this is a post from Linus about the problem
From torvalds@penguin.transmeta.com Tue Feb 6 09:36:41 2001
From: torvalds@penguin.transmeta.com (Linus Torvalds)
Newsgroups: comp.os.linux.development.apps
Subject: Re: Overflow detection in gcc-produced code?
In article <slrn97tsgd.pu3.BobT@linus.cs.queensu.ca>,
Bob Tennent <rdtennent@hotmail.com> wrote:
>Is there any way to detect that an overflow has occurred in x86 integer
>arithmetic code produced by gcc?
unsigned long x, y, sum;
int overflow = 0;
sum = x+y;
if (sum < x)
overflow = 1;
Works portably in C (but only with "unsigned" types that are guaranteed
to have twos-complement behaviour and well-defined overflow semantics).
And good compilers will notice that "overflow" is really just the carry
flag, and optimize it. I don't remember if gcc counts as "good" in this
case, but I doubt it ;)
For signed integer arithmetic, you're basically screwed. A compiler is
in theory allowed to do anything at all for signed overflows, so you
cannot portably test the result at all: you have to detect the overflow
before the operation happens. In practice, of course, all x86 compilers
will just result in the "proper" overflow value, but you still have to
test whether an overflow occurred.
At some point it probably becomes easier to do it with inline asm.
Especially if it needs to be efficient. One (inefficient) way might be
to do the calculation in "long long" and do a
long long x, y, sum;
in toverflow = 0;
sum = x + y;
if (sum != (long)sum)
overflow = 1;
but if you look at the actual code it generates, you'll be really sad.
Linus
Drivers accepting values into kernel space are a major source of overflows (deliberate or not), e.g., x * y (x or y untrusted) overflows and invariably buffer overflow ensues. Another common problem is very carefully adding 1 to null terminate a string, user passes in a length of 0xfffffff which, +1 = zero and a buffer overrun.
Be very careful with structures as they are padded by the compiler. __attribute__((packed)) should be used with caution because on some architectures (e.g., IA64) these end up with unaligned access faults, which are slow and annoying.
Endianness, of course. Most basic porting problem but still is constantly missed.
Object Lifespan
For any given object you should know
- Where is it created?
- Where is it destroyed?
- The point where you know it is finished with.
For example, sit down on some kernels and plug and unplug a USB device constantly. Count how many times you can do it before you crash. This is simply because objects are not kept track of, or because of subtle race conditions with knowing when you can finish with your objects. Alan suggests of the many ways to tackle this sort of problem, reference counting (as in network code) is the best way.
Make common creation functions -- otherwise people forget to fill in bits of the object, it gets passed to another function that assumes something has been initalized properly which hasn't. Slab poisoning on Linux helps you with this sort of problem, and when it was introduced found a huge number of problems.
Think about the fields you are using -- the 2.0 alpha floppy driver only worked every other month, as the timer value was 32 bit! You might end up with software that literally only works once in a blue moon.
Make sure you know all resources freed on all paths -- small leak * year == big leak
portability
Use u8, u16, u32 etc. Short/Long suck!
Use unsigned long for Memory and I/O resources. This makes sure you are portable to other arch's like Sparc etc.
Check that ioremap memory is used with readb/writeb, not dereferenced directly and not used with memcpy/memset
dma
Beware code using virt_to_bus with DMA -- not on non-pc hardware
DMA not allowed for vmalloced memory, high pages, the stack - archs like ppc64 have a virtual stack. Also, you end up with bugs where DMA wasn't stopped and would overwrite your stack constantly (and while you are in the debugger). Once this happened with a TV driver, and someone could not figure out what was happening until they noticed a large number of repeating bytes on their stack and decided to decode it, and got a CNN logo ...
There is a DMA API now, use it.
PCI
PCI is hotplug now, 2.5 important
PCI is message based, interrupts and PCI data transfer are async and you never quite know what is coming next. You need to turn off interrupts, free resources and free irq before freeing resources else you fail the test above about always knowing when you are complete.
PCI devices may have 64 bit addressing (see unsigned long note above)
Resources are Finite
System calls - don't allocate user controlled amounts of memory! Don't queue undefined number of buffers of user data. These are fatal istakes.
Interrupt handing - make sure time in handler is bound. Beware of infinite memory allocations.
The stack - it's small. Typically 7k, 4k in interrupts. Use it wisely. This means no large recursive calls.
== Deadlocks ==
Spinlocks are not recursive!
Don't ever sleep with locks : 2.5 checks for this. copy to/from user space sleeps, but in 2.4 it almost always works anyway. Not so in 2.5
Don't add mdelay() -- one driver did this in an interrupt handler and could have potentially shut the machine down for 20 minutes.
IRQ/syscall deadlocks : can't disable irq holding spinlocks taken by irq handler. dead lock!
Termination Order
Ensure DMA is finished before freeing -- otherwise you end up overwriting data.
Use complete_and_exit with threads
Auditing Code Requires Understanding It
Kernel-doc (like java doc) puts comments right into code. It is good, use it. But must be kept up to date.
Submitting changes
Check your formatting. Seems silly but will stop patches getting taken.
