Linux Device Drivers, 2nd Edition
One difference is already evident: register_chrdev took a pointer to a file_operations structure, but register_blkdev uses a structure of type block_device_operations instead, as it has since kernel version 2.3.38. The structure is still sometimes referred to by the name fops in block drivers; we’ll call it bdops to be more faithful to what the structure is and to follow the suggested naming. The definition of this structure is as follows:

    struct block_device_operations {
        int (*open) (struct inode *inode, struct file *filp);
        int (*release) (struct inode *inode, struct file *filp);
        int (*ioctl) (struct inode *inode, struct file *filp,
                      unsigned command, unsigned long argument);
        int (*check_media_change) (kdev_t dev);
        int (*revalidate) (kdev_t dev);
    };

22 June 2001 16:41 http://openlib.org.ua

The open, release, and ioctl methods listed here are exactly the same as their char device counterparts. The other two methods are specific to block devices and are discussed later in this chapter.
Note that there is no owner field in this structure; block drivers must still maintain their usage count manually, even in the 2.4 kernel.

The bdops structure used in sbull is as follows:

    struct block_device_operations sbull_bdops = {
        open:               sbull_open,
        release:            sbull_release,
        ioctl:              sbull_ioctl,
        check_media_change: sbull_check_change,
        revalidate:         sbull_revalidate,
    };

Note that there are no read or write operations provided in the block_device_operations structure.
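The sbull listing uses the GCC-specific "field: value" initializer syntax. The following is a minimal user-space sketch of the same idea written with standard C99 designated initializers. It is only an illustration: struct demo_ops, demo_open, demo_release, demo_call, and demo_usage are hypothetical names, and the structure is a simplified stand-in rather than the kernel's block_device_operations. It also shows a hand-maintained usage count and the fact that methods omitted from the initializer are null pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an operations table (NOT the kernel structure). */
struct demo_ops {
    int (*open)(int minor);
    int (*release)(int minor);
    int (*check_media_change)(int minor);
};

/* Usage count kept by hand, since the table has no owner field. */
static int demo_usage;

static int demo_open(int minor)
{
    (void)minor;
    demo_usage++;
    return 0;
}

static int demo_release(int minor)
{
    (void)minor;
    assert(demo_usage > 0);   /* a release must match an earlier open */
    demo_usage--;
    return 0;
}

/* C99 designated initializers; check_media_change is left NULL. */
static const struct demo_ops demo_table = {
    .open    = demo_open,
    .release = demo_release,
};

/* Call a method only if it was supplied; -1 means "not implemented". */
static int demo_call(int (*method)(int), int minor)
{
    return method ? method(minor) : -1;
}
```

Calling demo_call(demo_table.open, 0) increments the count, while demo_call(demo_table.check_media_change, 0) returns -1 because that slot was never filled in.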
All I/O to block devices is normally buffered by the system (the only exception is with "raw" devices, which we cover in the next chapter); user processes do not perform direct I/O to these devices. User-mode access to block devices usually is implicit in filesystem operations they perform, and those operations clearly benefit from I/O buffering. However, even "direct" I/O to a block device, such as when a filesystem is created, goes through the Linux buffer cache.* As a result, the kernel provides a single set of read and write functions for block devices, and drivers do not need to worry about them.

Clearly, a block driver must eventually provide some mechanism for actually doing block I/O to a device. In Linux, the method used for these I/O operations is called request; it is the equivalent of the "strategy" function found on many Unix systems.
The request method handles both read and write operations and can be somewhat complex. We will get into the details of request shortly.

For the purposes of block device registration, however, we must tell the kernel where our request method is. This method is not kept in the block_device_operations structure, for both historical and performance reasons; instead, it is associated with the queue of pending I/O operations for the device.
By default, there is one such queue for each major number. A block driver must initialize that queue with blk_init_queue. Queue initialization and cleanup are defined as follows:

    #include <linux/blkdev.h>
    blk_init_queue(request_queue_t *queue, request_fn_proc *request);
    blk_cleanup_queue(request_queue_t *queue);

* Actually, the 2.3 development series added the raw I/O capability, allowing user processes to write to block devices without involving the buffer cache. Block drivers, however, are entirely unaware of raw I/O, so we defer the discussion of that facility to the next chapter.

The init function sets up the queue and associates the driver’s request function (passed as the second parameter) with the queue.
It is necessary to call blk_cleanup_queue at module cleanup time. The sbull driver initializes its queue with this line of code:

    blk_init_queue(BLK_DEFAULT_QUEUE(major), sbull_request);

Each device has a request queue that it uses by default; the macro BLK_DEFAULT_QUEUE(major) is used to indicate that queue when needed. This macro looks into a global array of blk_dev_struct structures called blk_dev, which is maintained by the kernel and indexed by major number.
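The lookup just described (a per-major table whose entries embed a default queue, with the request function attached at init time) can be modeled in user space. This is a sketch under stated assumptions, not kernel code: every model_ and MODEL_ name is hypothetical, the table size is arbitrary, and a real request queue holds far more than one function pointer.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_MAJORS 256   /* assumption: arbitrary table size */

struct model_queue;
typedef void (model_request_fn)(struct model_queue *q);

/* A "queue" reduced to the one thing registration cares about here. */
struct model_queue {
    model_request_fn *request_fn;   /* the driver's request function */
};

/* One entry per major number, each embedding its default queue. */
struct model_blk_dev_entry {
    struct model_queue request_queue;
    void *data;                     /* private driver data, unused here */
};

static struct model_blk_dev_entry model_blk_dev[MODEL_MAX_MAJORS];

/* Model of the macro: address of the queue embedded in the per-major entry. */
#define MODEL_DEFAULT_QUEUE(major) (&model_blk_dev[(major)].request_queue)

/* Associate the request function with the queue. */
static void model_init_queue(struct model_queue *q, model_request_fn *fn)
{
    assert(q != NULL && fn != NULL);
    q->request_fn = fn;
}

/* Undo the association at "module cleanup" time. */
static void model_cleanup_queue(struct model_queue *q)
{
    q->request_fn = NULL;
}

/* Hypothetical request function that only counts its invocations. */
static int model_calls;

static void model_request(struct model_queue *q)
{
    (void)q;
    model_calls++;
}
```

In this model, model_init_queue(MODEL_DEFAULT_QUEUE(42), model_request) plays the role of the sbull line above, and model_cleanup_queue(MODEL_DEFAULT_QUEUE(42)) is its counterpart at unload time.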
The structure looks like this:

    struct blk_dev_struct {
        request_queue_t request_queue;
        queue_proc      *queue;
        void            *data;
    };

The request_queue member contains the I/O request queue that we have just initialized. We will look at the queue member shortly. The data field may be used by the driver for its own data, but few drivers do so.

Figure 12-1 visualizes the main steps a driver module performs to register with the kernel proper and deregister.
If you compare this figure with Figure 2-1, similarities and differences should be clear.

In addition to blk_dev, several other global arrays hold information about block drivers. These arrays are indexed by the major number, and sometimes also the minor number. They are declared and described in drivers/block/ll_rw_blk.c.

int blk_size[][];
This array is indexed by the major and minor numbers. It describes the size of each device, in kilobytes. If blk_size[major] is NULL, no checking is performed on the size of the device (i.e., the kernel might request data transfers past end-of-device).

int blksize_size[][];
The size of the block used by each device, in bytes.
Like the previous one, this bidimensional array is indexed by both major and minor numbers. If blksize_size[major] is a null pointer, a block size of BLOCK_SIZE (currently 1 KB) is assumed. The block size for the device must be a power of two, because the kernel uses bit-shift operators to convert offsets to block numbers.

int hardsect_size[][];
Like the others, this data structure is indexed by the major and minor numbers.
The default value for the hardware sector size is 512 bytes. With the 2.2 and 2.4 kernels, different sector sizes are supported, but they must always be a power of two greater than or equal to 512 bytes.

[Figure 12-1. Registering a Block Device Driver. The diagram shows init_module() calling register_blkdev() and blk_init_queue() at insmod time, and cleanup_module() calling unregister_blkdev() and blk_cleanup_queue() at rmmod time.]

int read_ahead[];
int max_readahead[][];
These arrays define the number of sectors to be read in advance by the kernel when a file is being read sequentially. read_ahead applies to all devices of a given type and is indexed by major number; max_readahead applies to individual devices and is indexed by both the major and minor numbers.

Reading data before a process asks for it helps system performance and overall throughput. A slower device should specify a bigger read-ahead value, while fast devices will be happy even with a smaller value.
The bigger the read-ahead value, the more memory the buffer cache uses.

The primary difference between the two arrays is this: read_ahead is applied at the block I/O level and controls how many blocks may be read sequentially from the disk ahead of the current request. max_readahead works at the filesystem level and refers to blocks in the file, which may not be sequential on disk. Kernel development is moving toward doing read ahead at the filesystem level, rather than at the block I/O level. In the 2.4 kernel, however, read ahead is still done at both levels, so both of these arrays are used.

There is one read_ahead[] value for each major number, and it applies to all its minor numbers.
max_readahead, instead, has a value for every device. The values can be changed via the driver’s ioctl method; hard-disk drivers usually set read_ahead to 8 sectors, which corresponds to 4 KB. The max_readahead value, on the other hand, is rarely set by the drivers; it defaults to MAX_READAHEAD, currently 31 pages.

int max_sectors[][];
This array limits the maximum size of a single request.
It should normally be set to the largest transfer that your hardware can handle.

int max_segments[];
This array controlled the number of individual segments that could appear in a clustered request; it was removed just before the release of the 2.4 kernel, however. (See "Clustered Requests" later in this chapter for information on clustered requests.)

The sbull device allows you to set these values at load time, and they apply to all the minor numbers of the sample driver.
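The numeric constraints stated for these arrays (block sizes that are powers of two so that offsets convert to block numbers by shifting, sector sizes that are powers of two of at least 512 bytes, and a read-ahead of 8 sectors equaling 4 KB) can be checked with ordinary user-space arithmetic. The helpers below are a sketch with hypothetical names; they are not kernel functions and only restate the rules given in the text.

```c
#include <assert.h>

/* Nonzero when v has exactly one bit set, i.e. is a power of two. */
static int is_power_of_two(unsigned int v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Sector size rule from the text: a power of two, at least 512 bytes. */
static int hardsect_ok(unsigned int bytes)
{
    return bytes >= 512 && is_power_of_two(bytes);
}

/* Shift count (log2) for a power-of-two block size. */
static unsigned int block_shift(unsigned int block_size)
{
    unsigned int shift = 0;

    assert(is_power_of_two(block_size));
    while ((1u << shift) < block_size)
        shift++;
    return shift;
}

/* Convert a byte offset to a block number with a shift, not a division. */
static unsigned long offset_to_block(unsigned long offset,
                                     unsigned int block_size)
{
    return offset >> block_shift(block_size);
}

/* Read-ahead expressed in kilobytes, given sectors and sector size. */
static unsigned int readahead_kb(unsigned int sectors, unsigned int hardsect)
{
    return sectors * hardsect / 1024;
}
```

With a 1024-byte block the shift is 10, so byte offset 4096 falls in block 4, and readahead_kb(8, 512) gives the 4 KB quoted for hard-disk drivers.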
The variable names and their default values in sbull are as follows:

size=2048 (kilobytes)
Each RAM disk created by sbull takes two megabytes of RAM.

blksize=1024 (bytes)
The software "block" used by the module is one kilobyte, like the system default.

hardsect=512 (bytes)
The sbull sector size is the usual half-kilobyte value.

rahead=2 (sectors)
Because the RAM disk is a fast device, the default read-ahead value is small.

The sbull device also allows you to choose the number of devices to install.
devs, the number of devices, defaults to 2, resulting in a default memory usage of four megabytes (two disks at two megabytes each).

The initialization of these arrays in sbull is done as follows:

    read_ahead[major] = sbull_rahead;
    result = -ENOMEM; /* for the possible errors */

    sbull_sizes = kmalloc(sbull_devs * sizeof(int), GFP_KERNEL);
    if (!sbull_sizes)
        goto fail_malloc;
    for (i=0; i < sbull_devs; i++) /* all the same size */
        sbull_sizes[i] = sbull_size;
    blk_size[major]=sbull_sizes;

    sbull_blksizes = kmalloc(sbull_devs * sizeof(int), GFP_KERNEL);
    if (!sbull_blksizes)
        goto fail_malloc;
    for (i=0; i < sbull_devs; i++) /* all the same blocksize */
        sbull_blksizes[i] = sbull_blksize;
    blksize_size[major]=sbull_blksizes;

    sbull_hardsects = kmalloc(sbull_devs * sizeof(int), GFP_KERNEL);
    if (!sbull_hardsects)
        goto fail_malloc;
    for (i=0; i < sbull_devs; i++) /* all the same hardsect */
        sbull_hardsects[i] = sbull_hardsect;
    hardsect_size[major]=sbull_hardsects;

For brevity, the error handling code (the target of the fail_malloc goto) has been omitted; it simply frees anything that was successfully allocated, unregisters the device, and returns a failure status.

One last thing that must be done is to register every "disk" device provided by the driver.
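Since the fail_malloc target is omitted from the listing above, here is a user-space sketch of the allocate-or-unwind pattern it describes. It is an assumption-laden illustration, not the sbull source: malloc and free stand in for kmalloc and kfree, all demo_ names are hypothetical, and the step that unregisters the device is left out because nothing is registered in this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Per-device tables, one int per minor number (hypothetical names). */
static int *demo_sizes, *demo_blksizes, *demo_hardsects;

/* Allocate one int per device and give every device the same value. */
static int *demo_fill(int devs, int value)
{
    int i;
    int *table;

    assert(devs > 0);
    table = malloc((size_t)devs * sizeof(int));
    if (!table)
        return NULL;
    for (i = 0; i < devs; i++)
        table[i] = value;
    return table;
}

/* Free whatever was allocated; free(NULL) is harmless in user space. */
static void demo_release_tables(void)
{
    free(demo_sizes);
    free(demo_blksizes);
    free(demo_hardsects);
    demo_sizes = demo_blksizes = demo_hardsects = NULL;
}

/* Returns 0 on success or a negative errno-style value on failure. */
static int demo_setup(int devs, int size, int blksize, int hardsect)
{
    int result = -ENOMEM;   /* for the possible allocation errors */

    if (devs <= 0)
        return -EINVAL;

    demo_sizes = demo_fill(devs, size);
    if (!demo_sizes)
        goto fail_malloc;
    demo_blksizes = demo_fill(devs, blksize);
    if (!demo_blksizes)
        goto fail_malloc;
    demo_hardsects = demo_fill(devs, hardsect);
    if (!demo_hardsects)
        goto fail_malloc;
    return 0;

fail_malloc:
    demo_release_tables();  /* undo any partial allocation */
    return result;
}
```

Calling demo_setup(2, 2048, 1024, 512) mirrors the sbull defaults: two devices, each table entry holding the same size, block size, and sector size. A single cleanup label keeps the unwinding in one place no matter which allocation failed.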