- Make sure that the system is really quiescent when starting an experiment: leave enough time to ensure all previous operations are really complete.
- Always verify the data you are transferring. When writing something to disk or network, read it back and compare to what you've written. When reading, check that what you're reading is correct. There are cases where this would unreasonably lengthen the time a benchmark takes. If that's the case, then make sure that you at least check the data for one complete run, before continuing. Also, prior to collecting final numbers, check again!
- Never use the same data over and over. Make sure that each run uses different data. For example, have a timestamp or other unique identifier (like the coordinate and label in the graph) in the data. This is to ensure that you're actually reading the correct data, not some stale cache contents, wrong block, etc.
- Always do several runs, and check the standard deviation. Watch out for abnormal variance.
- Use a combination of successive and separate runs for the same data point. E.g. measure the same point at least twice in a row (helps to identify caching effects that shouldn't be there) and twice more after some other points were taken (to identify cases of caching where there shouldn't be any). Have a good look at the standard deviations.
- Invert the order of measurements; this helps to identify interference between measurements. This and the previous point can together be achieved by traversing the set of data points in both directions.
- Don't only use regular strides or powers of two. You may be hitting pathological cases without noticing it. Throwing in some random points might be a good idea. However, don't use _only_ random points, you might be missing pathological cases. Good candidates for pathological cases are 2^n, 2^n-1, 2^n+1.
- When comparing measurements of different configurations, make sure you use exactly the same points; don't just compare result graphs over the same interval.
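
Several of the points above (repeated runs, traversal in both directions, mixing pathological and random points, and reusing exactly the same points across configurations) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `sample_points`, `schedule`, `measure` and `summarize` are hypothetical, and `workload` stands for whatever operation is being benchmarked. It does not cover data verification or quiescing the system.

```python
import random
import statistics
import time


def sample_points(max_exp, n_random, seed=0):
    """Return sorted sizes 2^n-1, 2^n, 2^n+1 for n = 1..max_exp, plus random sizes."""
    points = set()
    for n in range(1, max_exp + 1):
        points.update((2**n - 1, 2**n, 2**n + 1))
    # Fixed seed: every configuration under test gets exactly the same points.
    rng = random.Random(seed)
    # Up to n_random extra points (duplicates collapse in the set).
    points.update(rng.randint(1, 2**max_exp + 1) for _ in range(n_random))
    return sorted(points)


def schedule(points, repeats=2):
    """Visit each point `repeats` times in a row, forwards and then backwards."""
    forward = [p for p in points for _ in range(repeats)]
    # Note: the last point of the forward pass is also the first point of the
    # backward pass, so its runs are all successive.
    return forward + forward[::-1]


def measure(workload, points, repeats=2):
    """Time workload(p) for every entry in the schedule; return samples per point."""
    samples = {p: [] for p in points}
    for p in schedule(points, repeats):
        start = time.perf_counter()
        workload(p)
        samples[p].append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Map each point to (mean, standard deviation) of its timings in seconds."""
    return {p: (statistics.mean(t), statistics.stdev(t)) for p, t in samples.items()}


if __name__ == "__main__":
    pts = sample_points(max_exp=10, n_random=5)
    # Placeholder workload (an assumption): replace with the real operation.
    results = summarize(measure(lambda n: sum(range(n)), pts))
    for p, (mean, dev) in results.items():
        print(f"{p:6d}  mean={mean:.2e}s  stdev={dev:.2e}s")
```

A standard deviation that is large relative to the mean for a single point, or timings that differ markedly between the forward and backward pass, would be a reason to look for caching or interference before trusting the numbers.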
