NFS Benchmarking

News

Intro

Welcome to the NFS Benchmarking project at Gelato@UNSW. We're involved in regular Linux NFS performance measurements, developing new NFS benchmarking tools and improving Linux NFS scalability and performance.

Here you'll find information about tools, code, results, etc, generated during the course of the project. For a personal perspective and brief write-ups, tutorials and updates on tools being developed here, see http://nfsfoo.livejournal.com/

A very brief history of the project is on the NFSBenchHist page.

Please contact me at <shehjart AT gelato DOT NO SPAM unsw DOT edu GREEBLIES DOT au> for more info. My personal page is here: http://www.gelato.unsw.edu.au/~shehjart

Tools and Code

These are some of the tools I've been using and developing.

tshark

tcpdump and tshark are two popular network traffic analyzers. For traffic capture, I recommend tcpdump, since it is lightweight and available on most systems. For NFS trace analysis and dissection, I recommend TShark, since analysing RPC-over-TCP traffic requires stateful packet dissection, which TShark provides. Note, however, that even TShark won't be of much help if full packet capture was not enabled in tcpdump when the trace was captured. For more information on RPC-over-TCP segments and the problems with using tcpdump to analyse the trace, see the RPCOverTCPCapture page.
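To illustrate why stateful dissection is needed: ONC RPC over TCP uses record marking, where each record fragment is prefixed by a 4-byte big-endian header whose top bit flags the last fragment and whose lower 31 bits give the fragment length. An RPC message can therefore start and end anywhere inside a TCP segment. The following is a minimal Python sketch of a reassembler under that framing. It is not TShark's code, and the names RecordReassembler and mark are made up for illustration.

```python
import struct


class RecordReassembler:
    """Rebuilds RPC records from a TCP byte stream that uses record marking."""

    def __init__(self):
        self._buf = bytearray()   # bytes received but not yet consumed
        self._fragments = []      # fragments of the record being assembled

    def feed(self, segment: bytes) -> list:
        """Add one TCP segment's payload; return the records it completes."""
        self._buf.extend(segment)
        records = []
        while True:
            if len(self._buf) < 4:
                break             # the fragment header itself is incomplete
            (marker,) = struct.unpack(">I", self._buf[:4])
            last = bool(marker & 0x80000000)
            length = marker & 0x7FFFFFFF
            if len(self._buf) < 4 + length:
                break             # the fragment body continues in a later segment
            self._fragments.append(bytes(self._buf[4:4 + length]))
            del self._buf[:4 + length]
            if last:
                records.append(b"".join(self._fragments))
                self._fragments = []
        return records


def mark(payload: bytes, last: bool = True) -> bytes:
    """Prefix a payload with a record-marking header (demo helper)."""
    flag = 0x80000000 if last else 0
    return struct.pack(">I", flag | len(payload)) + payload


if __name__ == "__main__":
    stream = mark(b"first call") + mark(b"second call")
    reassembler = RecordReassembler()
    # Split the stream at an arbitrary point, as TCP segmentation might.
    print(reassembler.feed(stream[:7]))
    print(reassembler.feed(stream[7:]))
```

The reassembler has to carry state from one segment to the next, which a packet-at-a-time decoder does not do. If the capture truncated packets, the bytes needed to locate the next header are missing, which is why the full packet has to be captured (tcpdump's -s option sets the capture length).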

Traffic anonymizer

NFSv3 message payloads contain various fields that leak private and confidential information onto the network, for example UIDs, GIDs, and even the contents of files. For the purpose of traffic analysis and replay, these fields need to be anonymized so that the traces can be released into the public domain. I've developed an NFS traffic anonymizer for binary NFS traces (in pcap format) as an extension of the tshark network analyzer. I used tshark because its packet buffer management infrastructure allows stateful packet dissection.
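As a rough illustration of what anonymizing such fields can involve, here is a minimal Python sketch. It assumes a keyed hash is used so that the same UID or GID always maps to the same pseudonym within a trace. This is one common approach and not necessarily what the anonymizer described here does; the function names are hypothetical.

```python
import hashlib
import hmac
import struct


def anonymize_id(real_id: int, key: bytes) -> int:
    """Map a 32-bit UID or GID to a stable 32-bit pseudonym.

    The HMAC output is truncated to 32 bits, so two distinct IDs can
    occasionally collide; a real tool may need to handle that case.
    """
    digest = hmac.new(key, struct.pack(">I", real_id), hashlib.sha256).digest()
    (pseudonym,) = struct.unpack(">I", digest[:4])
    return pseudonym


def anonymize_payload(data: bytes) -> bytes:
    """Replace file contents with zero bytes, keeping the original length."""
    return bytes(len(data))
```

Keeping the mapping consistent preserves per-user access patterns, and keeping payload lengths preserves request sizes, both of which matter if the trace is to be replayed.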

The NFS traffic anonymizer has a separate page. http://www.gelato.unsw.edu.au/IA64wiki/NFSTrafficAnonymizer

nfsreplay

Performance measurement and benchmarking is a hard task, especially when it can be performed at different levels (micro- and macro-) and under different workloads. My aim has been to benchmark NFS servers under realistic conditions. To that end, I've developed nfsreplay, a tool that replays NFS traces against target servers. The idea is to benchmark the target servers while preserving as much of the workload realism available in the NFS trace as possible. nfsreplay can be called a macro-benchmarking tool, though it is not application-specific because the application load can be varied and scaled by changing the trace or certain replay parameters.
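One way a replay parameter could scale the load is by compressing or stretching the gaps between traced requests. The Python sketch below assumes a hypothetical speedup factor. It does not describe nfsreplay's actual parameters, which are not listed on this page.

```python
def replay_schedule(timestamps, speedup=1.0):
    """Return send offsets (seconds from replay start) for traced requests.

    timestamps: capture times in seconds, in trace order.
    speedup: hypothetical scaling knob; 2.0 halves every inter-request gap.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    if not timestamps:
        return []
    start = timestamps[0]
    return [(t - start) / speedup for t in timestamps]
```

With speedup set to 1.0 the original timing is kept, and larger values offer the same request sequence to the server at a proportionally higher rate.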

nfsreplay along with other associated tools resides at:

Asynchronous RPC

The Sun RPC library that is part of the glibc package only allows blocking-wait, or synchronous, RPC semantics, which is the correct way to carry out remote procedure calls. The replayer needs to be highly scalable, so this library is not ideal. An ideal library, for the purpose of replay and general scalability, would provide a non-blocking transmission interface and asynchronous notification of responses from the server. I've developed such a library as an extension of the Sun RPC library. The Sun RPC library design allows pluggable handlers for different types of transport protocols. The built-in handlers are clnt_udp for UDP and clnt_tcp for TCP. See the rpc(3) man page for more on the Sun RPC interface in glibc. I've written a new non-blocking and asynchronous extension called clnt_tcp_nb.
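The difference from the blocking model can be sketched in a few lines of Python. The sketch relies on one fact about ONC RPC, that every call and reply message starts with a 32-bit transaction ID (XID), and uses it to match replies to per-request callbacks. The class and method names are hypothetical and do not reflect the clnt_tcp_nb interface, and the message bodies are placeholders, not real RPC calls.

```python
import struct


class AsyncRpcClient:
    """Toy client: sends calls without waiting and dispatches replies
    to per-request callbacks, matched by XID."""

    def __init__(self, send):
        self._send = send         # callable that queues bytes for transmission
        self._next_xid = 1
        self._pending = {}        # xid -> callback awaiting a reply

    def call(self, body: bytes, callback) -> int:
        """Transmit a call and return immediately; callback runs on reply."""
        xid = self._next_xid
        self._next_xid = (self._next_xid + 1) & 0xFFFFFFFF
        self._pending[xid] = callback
        self._send(struct.pack(">I", xid) + body)
        return xid

    def handle_reply(self, message: bytes) -> bool:
        """Dispatch one complete reply; return False if its XID is unknown."""
        (xid,) = struct.unpack(">I", message[:4])
        callback = self._pending.pop(xid, None)
        if callback is None:
            return False
        callback(message[4:])
        return True

    def outstanding(self) -> int:
        """Number of calls still waiting for a reply."""
        return len(self._pending)


if __name__ == "__main__":
    sent = []
    client = AsyncRpcClient(sent.append)
    first = client.call(b"request A", lambda reply: print("A ->", reply))
    second = client.call(b"request B", lambda reply: print("B ->", reply))
    # Replies may arrive in any order; the XID identifies the request.
    client.handle_reply(struct.pack(">I", second) + b"reply B")
    client.handle_reply(struct.pack(">I", first) + b"reply A")
```

Because call returns without waiting, many requests can be in flight on one connection, which is what a scalable replayer needs.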

The project page is here. http://www.gelato.unsw.edu.au/IA64wiki/AsyncRPC

libnfsclient, NFS Client Ops Library

libnfsclient is a userspace NFS client operations library. It provides a clean interface for sending individual NFS requests to a remote NFS server. Underneath the NFS-specific interface, it uses the AsyncRPC library, which allows users of libnfsclient to specify a callback for each request. These callbacks are executed when the response to a particular NFS request is received. Since the AsyncRPC library does not use signal-driven IO, AsyncRPC, libnfsclient and any user-specified callbacks are all re-entrant.

The project page is here. http://www.gelato.unsw.edu.au/IA64wiki/libnfsclient

Both of the above libraries are part of the nfsreplay repository and can be checked out from the URLs on the nfsreplay web page, though the code can be built and used without nfsreplay.

nfsdump and TBBT

nfsdump is an NFS traffic capture program. It captures NFS traffic correctly only if the underlying transport is UDP, for the reasons discussed on the RPCOverTCPCapture page. I tried to use TBBT as an NFS traffic replayer during my December 2005 - February 2006 internship. TBBT and nfsdump are closely tied together, so they share a separate page at TBBT.

NFS Performance Measurements

NFS server measurements will be posted on the NFSServerPerformance page. For now, the client measurements are available on the NFSClientPerformance page.

NFS Traffic Analysis

For use with nfsreplay, I have three large NFS traces: one donated by UNSW's School of Computer Science and Engineering and the other two from Ohio Supercomputing Center. To understand the workload in the traces, I have analysed their access and timing patterns. Basic analysis of the traces is presented in the nfsreplay technical report. The NFSTrafficAnalysis page provides more details, including the server behaviour while these traces were being captured.
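As a small illustration of what access and timing analysis can mean in practice, the Python sketch below computes the operation mix and the mean inter-arrival gap from a list of (timestamp, operation) pairs. The input format and function name are assumptions made for the example and do not describe the analysis tools actually used.

```python
from collections import Counter


def summarize(trace):
    """Summarize a trace given as (timestamp_seconds, operation_name) pairs
    in capture order. Returns (operation counts, mean inter-arrival gap)."""
    op_mix = Counter(op for _, op in trace)
    times = [t for t, _ in trace]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return op_mix, mean_gap
```

The operation counts describe the access pattern (for example, how read-heavy the workload is), while the gaps describe its timing.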

The original traces are very large, each covering around 12-24 hours of capture. Replaying such large traces in full would be too much overhead, so I've added a feature to tracedigester that processes the trace and creates the file system hierarchy by sampling. Given a sample rate, tracedigester scales down the trace by looking at only every <rate>'th request. The result is a trace subset that still maintains the access pattern observed in the original trace, but with fewer NFS requests. For analysis of the sampled trace subsets, see SampledTraceSubsets.
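The sampling step itself is simple to sketch. The Python function below keeps every rate'th request from a trace, assuming sampling starts at the first request (the starting offset used by tracedigester is not stated here). The function name is hypothetical.

```python
from itertools import islice


def sample_trace(requests, rate):
    """Return an iterator over every rate'th request, starting with the first."""
    if rate < 1:
        raise ValueError("rate must be >= 1")
    return islice(requests, 0, None, rate)


if __name__ == "__main__":
    # A rate of 3 keeps requests 0, 3, 6 and 9 of a ten-request trace.
    print(list(sample_trace(range(10), 3)))
```

Because the function works on any iterable, a multi-hour trace can be sampled while streaming from disk without loading it into memory.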

Papers/Reports

  1. Sources of NFS Performance, Shehjar Tikoo
    Discussion and investigation of measurements of factors influencing NFS performance.
    http://gelato.unsw.edu.au/~shehjart/download/nfs_sources_tr.pdf

  2. nfsreplay: Trace Replay Based NFS benchmarking, Shehjar Tikoo, Dr. Peter Chubb.
    Technical report describing the nfsreplay suite. See nfsreplayTR.

  3. Benchmarking NFS Performance and Scalability on Linux, Shehjar Tikoo, Dr. Peter Chubb.
    Summer internship project report, Dec 2005 - Feb 2006
    Describes traffic replay experience with TBBT and nfsdump. Discusses NFS trace analysis techniques.
    http://www.gelato.unsw.edu.au/~shehjart/download/nfs_report.pdf

Slides/Talks

  1. NFS Benchmarking using nfsreplay, Connectathon, May 2008
    http://www.gelato.unsw.edu.au/~shehjart/download/nfsreplay_cthon_may08.pdf

  2. NFS Benchmarking using nfsreplay, Gelato Itanium Conference and Expo, October 2007
    http://gelato.unsw.edu.au/~shehjart/download/nfsreplay_ice_oct07.pdf

  3. Linux NFS client benchmarks, Gelato Itanium Conference and Expo, April 2007
    http://gelato.unsw.edu.au/~shehjart/download/nfsbench_gice_apr07.ps

  4. Gelato Itanium Conference and Expo, April 2006
    http://www.ertos.nicta.com.au/publications/papers/Tikoo_Chubb_06.abstract.pml

Bibliography

A list of external references, slides, papers and technical reports is on a separate page at NFSBiblio.

IA64wiki: NFSBenchmarking (last edited 2009-12-10 03:13:50 by localhost)
