Generic Data Transfer Interface

To unify some of the common code, we use a generic data transfer interface for all types of devices. However, this interface is specialised for different classes of devices.

Write

writev(struct device *device, struct iovec vector[], void *callback_info);

struct device *device is a pointer to the specific device instance on which to call writev. This allows one driver to control multiple pieces of hardware.

FIXME: Is this actually a good idea? This means that all drivers are going to have to do indirect references for everything. An alternative would be to simply say each driver supports one device only, and load another copy of the driver for each device. (This is not unlike using a parameterised template in C++). Basically the choice comes down to indirect reference vs. copying of code, and also how likely we are to have the same device multiple times, and want to run it in the same link unit. (Of course with a decent underlying OS, the text segment should actually be shared in memory anyway.)

PeterChubb: In-Kernel I think it's essential, as all drivers are in the same address space. Dunno how you cope in a Sasos?

void *callback_info is an opaque reference to data that will be passed back to the caller when the completion callback is performed.
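
The exact shape of the completion callback is not pinned down here; as a sketch, it might look something like this (the name xfer_callback_t and the status argument are assumptions, not part of the interface as defined):

/* Hypothetical callback signature: callback_info is returned to the
 * caller exactly as it was passed in to writev(). */
typedef void (*xfer_callback_t)(struct device *device,
                                void *callback_info,
                                int status);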

struct iovec vector[] is the most important argument. This is a list of descriptors describing data to write to the device.

The vector is a list of arrays. Any item in an array can be either an actual descriptor or a pointer to another array.

+-------------------+
| 0 | 1 | 2 | 3 | * |
+-----------------|-+
                  |
                  +---------------------------+
                  | 4 | 5 | 6 | * |   |   |   |
                  +-------------|-------------+
                                |
                                +---------------+
                                | 7 | 8 | 9 | # |
                                +---------------+

We have 10 struct iovecs spread over three individual arrays. Each struct iovec has the following layout:

struct iovec {
        void *data;                  /* The actual data to be transferred */
        void *reference;             /* Opaque to the driver. May reference some
                                        other structure, e.g. a pbuf or skbuff */
        union {
                size_t len;          /* Number of bytes to transfer */
                struct iovec *next;  /* Link, if data == NULL */
        };
        size_t data_len;             /* Used in read; specifies how much data
                                        is in the buffer */
        <device class specific data>
};

FIXME: Note the addition of reference. This allows us to tag extra data onto each request. An alternative, possibly more efficient, approach would be to place the extra data inline as part of the iovec structure. This would mean that the exact makeup of the iovec structure depends on the device class and also on the shim layer, which isn't necessarily a bad thing.
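
As a sketch of how the list in the diagram might be built (assuming, as the '#' above suggests, that the list is terminated by an entry whose data and next are both NULL; the buffer sizes are arbitrary):

struct iovec a[5], b[7], c[4];
char buf[10][512];                 /* ten data buffers, one per descriptor */
int i;

/* Descriptors 0-3, then a link ('*' in the diagram) to the next array. */
for (i = 0; i < 4; i++) {
        a[i].data = buf[i];
        a[i].len  = sizeof buf[i];
}
a[4].data = NULL;                  /* data == NULL marks a link entry */
a[4].next = b;

/* Descriptors 4-6, then a link; the remaining slots are unused. */
for (i = 0; i < 3; i++) {
        b[i].data = buf[4 + i];
        b[i].len  = sizeof buf[4 + i];
}
b[3].data = NULL;
b[3].next = c;

/* Descriptors 7-9, then the terminator ('#' in the diagram). */
for (i = 0; i < 3; i++) {
        c[i].data = buf[7 + i];
        c[i].len  = sizeof buf[7 + i];
}
c[3].data = NULL;
c[3].next = NULL;                  /* assumed: a NULL link ends the list */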

Block

struct block_iovec {
        struct iovec iovec;
        off_t offset;         /* Offset on this disk */
};
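
As a sketch, queuing a single-sector write might look like this (block_writev is a hypothetical block-class specialisation of writev; sector_buf, disk and my_request are stand-ins, and the 512-byte sector size is also an assumption):

struct block_iovec req[2];

req[0].iovec.data = sector_buf;    /* 512 bytes to write */
req[0].iovec.len  = 512;
req[0].offset     = 4096;          /* byte offset on this disk */

req[1].iovec.data = NULL;          /* terminator: NULL data and NULL next */
req[1].iovec.next = NULL;

block_writev(disk, req, my_request);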

Network

struct network_iovec {
        struct iovec iovec;
        bool end_of_packet;  /* Each iovec is expected to represent a packet fragment on network,
                               so this flag indicates the end of a packet. NB: This replaces one of
                               the evil 'more' bits */
        bool urgent;         /* Set if the packet is urgent. */
};
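
For example, a packet split across two fragments might be described as follows (netdev_writev is a hypothetical network-class specialisation of writev; header, payload, nic and my_packet are stand-ins):

struct network_iovec pkt[3];

pkt[0].iovec.data    = header;       /* first fragment: protocol headers */
pkt[0].iovec.len     = header_len;
pkt[0].end_of_packet = false;
pkt[0].urgent        = false;

pkt[1].iovec.data    = payload;      /* last fragment of this packet */
pkt[1].iovec.len     = payload_len;
pkt[1].end_of_packet = true;         /* packet boundary */
pkt[1].urgent        = false;

pkt[2].iovec.data    = NULL;         /* terminator */
pkt[2].iovec.next    = NULL;

netdev_writev(nic, pkt, my_packet);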

Stream

struct stream_iovec {
        struct iovec iovec;
};

The writev call should return as soon as the data has been queued for transfer. When the actual transfer of the data is complete, a registered callback function is invoked with the callback_info supplied above.

Read

The readv() call is very similar to the writev call described above. Note that the driver does not allocate any buffers; the calling code is expected to handle all such policy. The readv function would effectively replace the "feed packets" call we currently have in network drivers.

One important difference exists for read compared to write: for devices that receive incoming data, there may not be any data available to read at the time of the call. For example, when reading from a network device, packets are not always available.

Rather than a single callback occurring once all the data has been transferred, the shim layer may in fact receive multiple callbacks as the data descriptors in the iovec list are consumed. The callback will occur at least once per interrupt if data is available.
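
As a sketch, a network shim might post receive buffers like this (netdev_readv, alloc_buffer, NBUFS, MTU, nic and rx_state are all assumptions for illustration):

enum { NBUFS = 8, MTU = 1500 };
struct network_iovec rx[NBUFS + 1];
int i;

for (i = 0; i < NBUFS; i++) {
        rx[i].iovec.data      = alloc_buffer(MTU); /* caller allocates; the driver never does */
        rx[i].iovec.len       = MTU;
        rx[i].iovec.reference = NULL;              /* could tag a pbuf/skbuff here */
}
rx[NBUFS].iovec.data = NULL;                       /* terminator, as above */
rx[NBUFS].iovec.next = NULL;

netdev_readv(nic, rx, rx_state);

/* As packets arrive, the driver fills buffers in order, sets data_len
 * and end_of_packet on each descriptor it completes, and invokes the
 * callback -- possibly several times before all descriptors are used. */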

Locking requirements

The shim layer should ensure that readv(), writev(), and interrupt() are never called concurrently. This restriction simplifies writing the driver, and on uniprocessor machines it improves efficiency because no locks need to be taken. Whether such a policy is detrimental on SMP machines is a subject for further research.
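
As a sketch of how a shim might provide this guarantee on SMP (the lock type and wrapper names are assumptions; on a uniprocessor the lock would simply be omitted):

#include <pthread.h>

struct shim_device {
        struct device   *dev;
        pthread_mutex_t  entry_lock;   /* serialises readv/writev/interrupt */
};

void shim_writev(struct shim_device *sd, struct iovec vector[], void *info)
{
        pthread_mutex_lock(&sd->entry_lock);
        writev(sd->dev, vector, info);          /* driver entry, now serialised */
        pthread_mutex_unlock(&sd->entry_lock);
}

/* shim_readv() and shim_interrupt() take the same lock, so the driver's
 * entry points never execute concurrently. */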
