Asynchronous RPC
AsyncRPC is a non-blocking RPC library; it is not wholly asynchronous. It provides abstractions over the sockets layer for non-blocking transmission of RPC calls and reception of RPC replies. Replies are delivered through callbacks. A callback is registered through an interface function when the RPC call is transmitted, along with any private data the callback may need.
It is used as the RPC library for libnfsclient, a userland NFS client operations library, which in turn is used by a tool called nfsreplay.
The NFS benchmarking project page is here: NFSBenchmarking
I can be reached at <shehjart AT gelato DOT NO SPAM unsw DOT edu GREEBLIES DOT au>
News
April 5, 2007, nfsreplay svn is up
March 31, 2007 Async RPC is still pre-alpha. Use with caution.
Main features
Non-blocking: socket reads and writes are non-blocking and managed by the library. If a write would block, the data is copied into internal buffers and retried later.
Asynchronous: responses are delivered via callbacks registered when the RPC call is made. These callbacks are not a true asynchronous mechanism, as they do not rely on signals or other asynchronous notification mechanisms. In the worst case, a completion function must be called explicitly to process pending replies and run the associated callbacks.
Interface
The interface is very similar to the RPC library in glibc, with the addition of callbacks and non-blocking socket I/O.
Creating Client Handle
#include <clnt_tcp_nb.h>

CLIENT *clnttcp_nb_create(struct sockaddr_in *raddr, u_long prog, u_long vers,
        int *sockp, u_int sbufsz, u_int rbufsz);
CLIENT *clnttcp_b_create(struct sockaddr_in *raddr, u_long prog, u_long vers,
        int *sockp, u_int sbufsz, u_int rbufsz);
Use clnttcp_nb_create to initiate a connection to a remote server using a non-blocking socket. clnttcp_b_create does the same using a blocking socket. The parameters are:
raddr - Socket address that provides the server's IP address and, optionally, a port to connect to. If the port is 0, it is acquired from the portmapper service using the prog and vers parameters.
prog - The number identifying the RPC program.
vers - The version of the RPC program.
sockp - If the caller already has a usable socket descriptor, pass it as this argument. A new socket descriptor is created if the value of *sockp is RPC_ANYSOCK.
sbufsz - The size of the buffer which is sent to the write syscall. Uses default value of ASYNC_READ_BUF if 0.
rbufsz - The size of the buffer which is given to the read syscall. Uses default value of ASYNC_READ_BUF if 0.
The function returns a handle which is used to identify this particular connection.
User callbacks
User callbacks are of the type
#include <clnt_tcp_nb.h>

typedef void (*user_cb)(void *msg_buf, int bufsz, void *priv);
msg_buf - Pointer to the msg buffer.
bufsz - Size in bytes of the message in msg_buf.
priv - Pointer to the private data registered with clnttcp_nb_call.
Calling Remote Procedures
#include <clnt_tcp_nb.h>
extern enum clnt_stat clnttcp_nb_call(CLIENT *handle, u_long proc,
xdrproc_t inproc, caddr_t inargs, user_cb callback, void * usercb_priv);
clnttcp_nb_call is the function used to call remote procedures asynchronously.
handle - The pointer to the handle returned by clnttcp_nb_create.
proc - The RPC procedure number.
inproc - Function that is used to translate user message into XDR format.
inargs - Pointer to user message.
callback - The callback function. It is called when the reply to this RPC message is received.
usercb_priv - The pointer to the private data that will be passed as the third argument to the function pointed to by callback.
If the call is transmitted successfully, the return value is RPC_SUCCESS. This status covers only the send. RPC_SUCCESS is also returned when the message is copied into internal buffers for later transmission, which happens when the write syscall returns EAGAIN to indicate that it would block.
clnttcp_nb_call transparently handles blocking and non-blocking sockets so there is no need to maintain additional state after the client handle has been created.
Executing callbacks
#include <clnt_tcp_nb.h>

int clnttcp_nb_receive(CLIENT * handle, int flag);
handle - The pointer to the handle returned by clnttcp_nb_create.
flag - This argument takes the following values:
RPC_NONBLOCK_WAIT - The read from the socket for this invocation of clnttcp_nb_receive is non-blocking.
RPC_BLOCKING_WAIT - The read from the socket for this invocation of clnttcp_nb_receive blocks until at least one RPC response has been received by the library, i.e. at least one callback has been executed by the library internally.
The flag argument determines socket read behaviour in tandem with the original socket creation type. The following table shows the resulting combinations:
Socket type     Flag: RPC_BLOCKING_WAIT    Flag: RPC_NONBLOCK_WAIT
Non-blocking    Blocking read()            Non-blocking read()
Blocking        Blocking read()            Blocking read()
The point is that with RPC_BLOCKING_WAIT, even a non-blocking socket can block-wait for a response if necessary. The function returns the number of callbacks that were executed for the socket buffers processed in this call. This value is also the number of replies received and processed in this call.
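One common way to get a blocking wait out of a non-blocking descriptor is to sleep in poll() until data arrives and then retry the read. The sketch below illustrates that general technique only; whether the library uses poll(), select() or something else internally is an assumption, and read_maybe_wait, set_nonblocking and the block_wait parameter are hypothetical names, not part of the AsyncRPC interface.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);
    return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/*
 * Read from fd. If block_wait is non-zero and no data is available, wait in
 * poll() until the descriptor becomes readable and retry, so that even an
 * O_NONBLOCK descriptor behaves like a blocking read for this one call.
 * If block_wait is zero, fail immediately with errno set to EAGAIN.
 * Returns the number of bytes read (0 on end of file) or -1 on error.
 */
static ssize_t read_maybe_wait(int fd, void *buf, size_t len, int block_wait)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;                       /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                      /* real error */
        if (!block_wait)
            return -1;                      /* caller asked not to wait */

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
}
```

The per-call flag maps onto the table above: a blocking descriptor never reaches the EAGAIN branch, so it blocks regardless of the flag, while a non-blocking descriptor blocks only when the caller asks for it.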
Closing a connection
#include <clnt_tcp_nb.h>

void clnttcp_nb_destroy (CLIENT *h);
Call clnttcp_nb_destroy to release the state related to this connection.
h - The pointer to the handle returned by clnttcp_nb_create.
Retrieving amount of data transferred
#include <clnt_tcp_nb.h>

unsigned long clnttcp_datatx (CLIENT *h);
Returns the count of bytes transferred over this CLIENT handle.
Internals
Some aspects that need focus are presented here.
Pluggability into glibc
Since glibc's RPC implementation has some degree of extensibility I've been able to use quite a bit of underlying infrastructure. It allows for new pluggable transport protocol handlers, pluggable XDR translation libraries and pluggable functions that actually do the reading and writing from sockets.
Some aspects of glibc RPC code structure are shown in the two pages here, which are basically pictures of diagrams I drew on a whiteboard to understand it myself. See glibcRPCDesign.
Record Stream Management
The RPC Record Marking Standard is used for serializing RPC messages over byte-stream transports like TCP. Since there are two different paths, one for sending (using clnttcp_nb_call) and one for receiving RPC messages (through callbacks), the XDR translation takes place differently in each case.
Transmission case: When sending, the Async RPC code uses glibc's XDRREC translation routines, which handle XDR translation for record streams like TCP. XDRREC in turn provides pluggable functions that are used for writing the translated messages to socket descriptors. The Async RPC library defines a custom function that is plugged into the XDRREC routines. This function, writetcp_nb, handles non-blocking writes to socket descriptors. It is the last function called by the XDRREC routines, which means the buffers passed to it contain RPC messages already in XDR format. If the write to the socket would block, it copies the message into internal buffers for later transmission. The buffers are stored in the client handle. The code is well commented.
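The write-or-queue behaviour described above can be sketched as follows. This is a minimal illustration of the general pattern, not the code of writetcp_nb: struct pending_buf, pending_append and write_or_queue are hypothetical names, and the real library keeps its buffers in the client handle in a form not described here.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-connection queue of bytes that could not be written yet. */
struct pending_buf {
    char  *data;
    size_t len;
};

/* Append n bytes to the pending queue. Returns 0 on success, -1 if out of memory. */
static int pending_append(struct pending_buf *p, const char *src, size_t n)
{
    char *grown = realloc(p->data, p->len + n);
    if (grown == NULL)
        return -1;
    memcpy(grown + p->len, src, n);
    p->data = grown;
    p->len += n;
    return 0;
}

/*
 * Write as much as the descriptor accepts and queue the remainder.
 * Returns len when every byte was either written or queued, -1 on error.
 */
static ssize_t write_or_queue(int fd, struct pending_buf *p,
                              const char *buf, size_t len)
{
    size_t done = 0;

    /* Preserve ordering: if older data is still queued, queue behind it. */
    if (p->len > 0)
        return pending_append(p, buf, len) == 0 ? (ssize_t)len : -1;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n >= 0) {
            done += (size_t)n;
        } else if (errno == EINTR) {
            continue;                       /* interrupted, retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* The socket would block: keep the unsent tail for later. */
            if (pending_append(p, buf + done, len - done) != 0)
                return -1;
            break;
        } else {
            return -1;
        }
    }
    return (ssize_t)len;
}
```

Reporting success for queued data mirrors the way clnttcp_nb_call returns RPC_SUCCESS even when the message has only been buffered. A complete implementation also needs a flush step that drains the queue once the descriptor becomes writable again.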
Reception case: Message reception takes place either during the call itself, if the socket is blocking, or through the clnttcp_nb_receive function. The main task is defragmenting (in RPC terminology) and desegmenting (in TCP terminology) the bytes read from the socket and collating them into single RPC records. Each RPC record contains one RPC message. The user callbacks are called only when enough bytes have been read to complete a full RPC record. See the Record Marking Standard section of RFC 1831 for more information. The bytes read from the sockets are in XDR format. The RPC message headers are decoded using the XDRMEM routines, which translate to and from buffers in memory. This differs from the XDRREC module, which translates in-memory data to XDR and writes it to socket descriptors, and translates XDR data read from socket descriptors into decoded memory buffers.
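To make the collation step concrete, here is a small incremental parser for the record marking format from RFC 1831: each fragment starts with a 4-byte big-endian header whose top bit marks the last fragment of a record and whose lower 31 bits give the fragment length. It is a sketch of the technique, not the library's code; rm_parser, rm_feed, record_cb and capture_record are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*record_cb)(const unsigned char *rec, size_t len, void *priv);

struct rm_parser {
    unsigned char  hdr[4];     /* fragment header bytes collected so far */
    size_t         hdr_have;
    uint32_t       frag_left;  /* payload bytes still missing from this fragment */
    int            last_frag;  /* non-zero if this fragment ends the record */
    unsigned char *rec;        /* record assembled so far */
    size_t         rec_len;
};

/*
 * Feed n bytes exactly as they arrive from the socket; chunk boundaries may
 * fall anywhere. Calls cb once per completed record.
 * Returns the number of records delivered, or -1 if out of memory.
 */
static int rm_feed(struct rm_parser *p, const unsigned char *buf, size_t n,
                   record_cb cb, void *priv)
{
    int delivered = 0;
    size_t i = 0;

    while (i < n) {
        if (p->hdr_have < 4) {
            p->hdr[p->hdr_have++] = buf[i++];
            if (p->hdr_have < 4)
                continue;
            uint32_t h = ((uint32_t)p->hdr[0] << 24) | ((uint32_t)p->hdr[1] << 16) |
                         ((uint32_t)p->hdr[2] << 8)  |  (uint32_t)p->hdr[3];
            p->last_frag = (h & 0x80000000u) != 0;
            p->frag_left = h & 0x7fffffffu;
        } else {
            size_t take = n - i < p->frag_left ? n - i : p->frag_left;
            unsigned char *grown = realloc(p->rec, p->rec_len + take);
            if (grown == NULL)
                return -1;
            memcpy(grown + p->rec_len, buf + i, take);
            p->rec = grown;
            p->rec_len += take;
            p->frag_left -= (uint32_t)take;
            i += take;
        }
        if (p->hdr_have == 4 && p->frag_left == 0) {
            if (p->last_frag) {             /* record complete: hand it over */
                cb(p->rec, p->rec_len, priv);
                delivered++;
                p->rec_len = 0;
            }
            p->hdr_have = 0;                /* expect the next fragment header */
        }
    }
    return delivered;
}

/* Example sink that keeps a copy of the most recent record. */
struct capture {
    unsigned char data[64];
    size_t len;
    int count;
};

static void capture_record(const unsigned char *rec, size_t len, void *priv)
{
    struct capture *c = priv;
    if (len <= sizeof c->data) {
        memcpy(c->data, rec, len);
        c->len = len;
    }
    c->count++;
}
```

Feeding the bytes one at a time or all at once yields the same records, which is the property the callbacks rely on: they only ever see complete RPC messages.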
Callbacks
Callbacks are called only with complete RPC messages. The buffers passed to the callbacks are in XDR format and need to be translated before being useful. Use the glibc XDRMEM routines to do this. For examples of use with NFS messages, see the XDR translation routines in libnfsclient callbacks. The source for libnfsclient is packaged as part of nfsreplay.
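For readers unfamiliar with XDR, the following sketch shows what decoding a buffer amounts to for two basic XDR types: an unsigned int (4 bytes, big-endian) and variable-length opaque data (a length word, the bytes, then zero padding to a 4-byte boundary). It is a hand-rolled stand-in used here only to stay self-contained; it is not the glibc XDRMEM API, and xdr_cursor, xdr_get_u32 and xdr_get_opaque are hypothetical names. The layout of a real reply depends on the RPC program.

```c
#include <stdint.h>
#include <string.h>

/* Cursor over an XDR-encoded buffer held in memory. */
struct xdr_cursor {
    const unsigned char *p;
    size_t left;
};

/* Decode one XDR unsigned int. Returns 0 on success, -1 if the buffer is short. */
static int xdr_get_u32(struct xdr_cursor *c, uint32_t *out)
{
    if (c->left < 4)
        return -1;
    *out = ((uint32_t)c->p[0] << 24) | ((uint32_t)c->p[1] << 16) |
           ((uint32_t)c->p[2] << 8)  |  (uint32_t)c->p[3];
    c->p += 4;
    c->left -= 4;
    return 0;
}

/*
 * Decode variable-length opaque data into dst (capacity cap bytes).
 * Stores the decoded length in *len. Returns 0 on success, -1 on a short
 * buffer or if dst is too small. On failure the cursor may already have
 * consumed the length word.
 */
static int xdr_get_opaque(struct xdr_cursor *c, unsigned char *dst,
                          size_t cap, uint32_t *len)
{
    uint32_t n;
    size_t padded;

    if (xdr_get_u32(c, &n) != 0)
        return -1;
    padded = ((size_t)n + 3) & ~(size_t)3;   /* round up to a multiple of 4 */
    if (n > cap || padded > c->left)
        return -1;
    memcpy(dst, c->p, n);
    c->p += padded;
    c->left -= padded;
    *len = n;
    return 0;
}
```

In a callback, the cursor would be initialised from msg_buf and bufsz. The glibc XDRMEM routines do the same job for arbitrary XDR types and are the better choice in real code.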
Callbacks are saved internally in a hashtable that uses the RPC XID as the key. As each call produces a unique XID, a callback is registered per message, at the time that message is sent. Registering a callback is optional; the library discards any reply that does not have a registered callback.
Callbacks might need some state information while processing each reply. This state can be provided as the usercb_priv argument to clnttcp_nb_call. This reference is eventually passed to the callback as the priv argument of the callback function, which is of the type user_cb. This approach provides per-XID callbacks and private data, i.e. the callback and the private data passed to it can be different for each request.
The message buffers passed to the callbacks are freed after the callbacks return. Copy the buffer if the data must persist beyond the callback.
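The following sketch shows one way an XID-keyed callback table with per-request private data can work, together with a callback that copies the reply because the buffer does not outlive the call. Only the user_cb typedef comes from the interface above; cb_table, cb_register, cb_dispatch, saved_reply and save_reply are hypothetical names, and removing the entry after dispatch is an assumption about reasonable behaviour, not a statement about the library's internals.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Callback type as declared in clnt_tcp_nb.h. */
typedef void (*user_cb)(void *msg_buf, int bufsz, void *priv);

#define CB_BUCKETS 64

struct cb_entry {
    uint32_t         xid;
    user_cb          cb;
    void            *priv;
    struct cb_entry *next;
};

/* Chained hash table keyed by XID. Zero-initialise before use. */
struct cb_table {
    struct cb_entry *bucket[CB_BUCKETS];
};

/* Remember which callback and private data belong to this XID. */
static int cb_register(struct cb_table *t, uint32_t xid, user_cb cb, void *priv)
{
    struct cb_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->xid = xid;
    e->cb = cb;
    e->priv = priv;
    e->next = t->bucket[xid % CB_BUCKETS];
    t->bucket[xid % CB_BUCKETS] = e;
    return 0;
}

/*
 * Run and remove the callback registered for xid.
 * Returns 1 if a callback ran, 0 if the reply was discarded.
 */
static int cb_dispatch(struct cb_table *t, uint32_t xid, void *msg, int len)
{
    struct cb_entry **pp = &t->bucket[xid % CB_BUCKETS];

    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->xid == xid) {
            struct cb_entry *e = *pp;
            *pp = e->next;                  /* unlink before calling */
            e->cb(msg, len, e->priv);
            free(e);
            return 1;
        }
    }
    return 0;                               /* no callback: drop the reply */
}

/* Private data for the example callback below. */
struct saved_reply {
    void *data;
    int   len;
};

/* Example callback: keep a private copy, since msg_buf is freed on return. */
static void save_reply(void *msg_buf, int bufsz, void *priv)
{
    struct saved_reply *s = priv;
    s->data = malloc((size_t)bufsz);
    if (s->data != NULL) {
        memcpy(s->data, msg_buf, (size_t)bufsz);
        s->len = bufsz;
    }
}
```

Because the private pointer is stored per entry, two outstanding requests can use the same callback function with different state, or entirely different callbacks.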
Response Notification
Response notification happens through callbacks. Within the library, response processing is attempted right after the RPC call is sent by clnttcp_nb_call. The attempt to read a response blocks if the socket was created using clnttcp_b_create. If the read yields a full RPC record, the callback is executed and clnttcp_nb_call returns. If the socket is non-blocking and the read returns EAGAIN, clnttcp_nb_call returns without calling any callbacks. In such cases, the user application might need to initiate the callbacks explicitly using a completion function. The clnttcp_nb_receive function is used for this purpose. clnttcp_nb_receive also lets the user application specify whether this invocation should block. If there are buffers that can be read without blocking, they are read in. The callbacks are called only if these buffers collate to form a complete RPC record. See the description of clnttcp_nb_receive to understand under what conditions it blocks until at least one callback is executed.
Code
libnfsclient and AsyncRPC are part of the nfsreplay source package. See nfsreplay page for instructions on checking out these two components.
Usage
To use the library, include the header file and build your C files together with the clnt_tcp_nb.c source file.
The header clnt_tcp_nb.h is needed by all the files that use the interface functions above.
Support
Use nfsreplay lists for support and discussion.
