Linux Device Drivers 2nd Edition (779877), page 22
In both cases, the return value is the amount of memory still to be copied. The scull code looks for this error return, and returns -EFAULT to the user if it’s not 0.

The topic of user-space access and invalid user-space pointers is somewhat advanced, and is discussed in “Using the ioctl Argument” in Chapter 5. However, it’s worth suggesting that if you don’t need to check the user-space pointer you can invoke __copy_to_user and __copy_from_user instead. This is useful, for example, if you know you already checked the argument.

As far as the actual device methods are concerned, the task of the read method is to copy data from the device to user space (using copy_to_user), while the write method must copy data from user space to the device (using copy_from_user). Each read or write system call requests transfer of a specific number of bytes, but the driver is free to transfer less data; the exact rules are slightly different for reading and writing and are described later in this chapter.

Whatever the amount of data the methods transfer, they should in general update the file position at *offp to represent the current file position after successful completion of the system call.
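The pattern described above (check the value returned by copy_to_user, return -EFAULT if it is nonzero, and advance *offp only after a successful copy) can be tried out in user space with a small mock-up. Everything in this sketch is an assumption made for illustration: mock_copy_to_user, struct demo_dev, and demo_read are hypothetical names, and the mock simply treats a NULL buffer as the invalid user pointer. It is not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical stand-in for copy_to_user(): like the real function, it
 * returns the number of bytes that could NOT be copied (0 on success).
 * A NULL destination plays the role of an invalid user-space pointer. */
static unsigned long mock_copy_to_user(void *to, const void *from,
                                       unsigned long n)
{
    if (to == NULL)
        return n;            /* nothing copied */
    memcpy(to, from, n);
    return 0;                /* everything copied */
}

/* Hypothetical in-memory "device" used only by this sketch. */
struct demo_dev {
    const char *data;
    size_t size;
};

/* Sketch of a read method: may transfer less than count, and updates
 * *offp only when the copy succeeded. */
static ssize_t demo_read(struct demo_dev *dev, char *buf, size_t count,
                         long long *offp)
{
    size_t pos;

    assert(dev != NULL && offp != NULL && *offp >= 0);
    pos = (size_t)*offp;

    if (pos >= dev->size)
        return 0;                        /* end-of-file */
    if (count > dev->size - pos)
        count = dev->size - pos;         /* partial transfer */

    if (mock_copy_to_user(buf, dev->data + pos, count))
        return -EFAULT;                  /* position left untouched */

    *offp += (long long)count;           /* position after the transfer */
    return (ssize_t)count;
}
```

Calling demo_read repeatedly with the same position variable walks through the data and finally returns 0; a failed copy leaves the position where it was.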
Most of the time the offp argument is just a pointer to filp->f_pos, but a different pointer is used in order to support the pread and pwrite system calls, which perform the equivalent of lseek and read or write in a single, atomic operation.

Figure 3-2 represents how a typical read implementation uses its arguments.

22 June 2001 16:35 http://openlib.org.ua Chapter 3: Char Drivers

[Figure 3-2. The arguments to read. The figure shows the prototype ssize_t dev_read(struct file *file, char *buf, size_t count, loff_t *ppos); together with struct file (fields f_count, f_flags, f_mode, f_pos) and the driver's buffer in kernel space (nonswappable), and the buffer in the application or libc in user space (swappable); copy_to_user() moves data from the driver's buffer to the user-space buffer.]

Both the read and write methods return a negative value if an error occurs.
A return value greater than or equal to 0 tells the calling program how many bytes have been successfully transferred. If some data is transferred correctly and then an error happens, the return value must be the count of bytes successfully transferred, and the error does not get reported until the next time the function is called.

Although kernel functions return a negative number to signal an error, and the value of the number indicates the kind of error that occurred (as introduced in Chapter 2 in “Error Handling in init_module”), programs that run in user space always see -1 as the error return value. They need to access the errno variable to find out what happened. The difference in behavior is dictated by the POSIX calling standard for system calls and the advantage of not dealing with errno in the kernel.

The read Method

The return value for read is interpreted by the calling application program as follows:

• If the value equals the count argument passed to the read system call, the requested number of bytes has been transferred. This is the optimal case.

• If the value is positive, but smaller than count, only part of the data has been transferred. This may happen for a number of reasons, depending on the device. Most often, the application program will retry the read. For instance, if you read using the fread function, the library function reissues the system call till completion of the requested data transfer.

• If the value is 0, end-of-file was reached.

• A negative value means there was an error. The value specifies what the error was, according to <linux/errno.h>. These errors look like -EINTR (interrupted system call) or -EFAULT (bad address).

What is missing from the preceding list is the case of “there is no data, but it may arrive later.” In this case, the read system call should block. We won’t deal with blocking input until “Blocking I/O” in Chapter 5.

The scull code takes advantage of these rules.
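Seen from the calling program, the return-value rules above can be sketched as a loop that reissues read until the request is satisfied, much as the text says fread does. This is only a sketch of the idea, not the C library's implementation; read_fully is a hypothetical helper name. Note that the program sees -1 plus errno, never the negative kernel value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until count bytes have
 * arrived, end-of-file is reached, or a real error occurs.
 * Returns the number of bytes read, or -1 with errno set. */
static ssize_t read_fully(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);

        if (n > 0) {
            done += (size_t)n;   /* full or partial transfer */
        } else if (n == 0) {
            break;               /* end-of-file */
        } else if (errno != EINTR) {
            return -1;           /* real error; errno says which one */
        }
        /* n < 0 with errno == EINTR: interrupted, reissue the call */
    }
    return (ssize_t)done;
}
```

A caller that uses such a loop does not notice whether the driver delivered the data in one piece or in several smaller transfers.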
In particular, scull takes advantage of the partial-read rule. Each invocation of scull_read deals only with a single data quantum, without implementing a loop to gather all the data; this makes the code shorter and easier to read. If the reading program really wants more data, it reiterates the call. If the standard I/O library (i.e., fread and friends) is used to read the device, the application won’t even notice the quantization of the data transfer.

If the current read position is greater than or equal to the device size, the read method of scull returns 0 to signal that there’s no data available (in other words, we’re at end-of-file). This situation can happen if process A is reading the device while process B opens it for writing, thus truncating the device to a length of zero.
Process A suddenly finds itself past end-of-file, and the next read call returns 0.

Here is the code for read:

    ssize_t scull_read(struct file *filp, char *buf, size_t count,
                    loff_t *f_pos)
    {
        Scull_Dev *dev = filp->private_data; /* the first list item */
        Scull_Dev *dptr;
        int quantum = dev->quantum;
        int qset = dev->qset;
        int itemsize = quantum * qset; /* how many bytes in the list item */
        int item, s_pos, q_pos, rest;
        ssize_t ret = 0;

        if (down_interruptible(&dev->sem))
            return -ERESTARTSYS;
        if (*f_pos >= dev->size)
            goto out;
        if (*f_pos + count > dev->size)
            count = dev->size - *f_pos;
        /* find list item, qset index, and offset in the quantum */
        item = (long)*f_pos / itemsize;
        rest = (long)*f_pos % itemsize;
        s_pos = rest / quantum; q_pos = rest % quantum;

        /* follow the list up to the right position (defined elsewhere) */
        dptr = scull_follow(dev, item);

        if (!dptr->data)
            goto out; /* don't fill holes */
        if (!dptr->data[s_pos])
            goto out;
        /* read only up to the end of this quantum */
        if (count > quantum - q_pos)
            count = quantum - q_pos;

        if (copy_to_user(buf, dptr->data[s_pos]+q_pos, count)) {
            ret = -EFAULT;
            goto out;
        }
        *f_pos += count;
        ret = count;

     out:
        up(&dev->sem);
        return ret;
    }

The write Method

write, like read, can transfer less data than was requested, according to the following rules for the return value:

• If the value equals count, the requested number of bytes has been transferred.

• If the value is positive, but smaller than count, only part of the data has been transferred. The program will most likely retry writing the rest of the data.

• If the value is 0, nothing was written. This result is not an error, and there is no reason to return an error code. Once again, the standard library retries the call to write. We’ll examine the exact meaning of this case in “Blocking I/O” in Chapter 5, where blocking write is introduced.

• A negative value means an error occurred; like for read, valid error values are those defined in <linux/errno.h>.

Unfortunately, there may be misbehaving programs that issue an error message and abort when a partial transfer is performed.
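A program that wants to cope with partial writes can loop until everything has been written. The following is a minimal sketch under stated assumptions: write_all is a hypothetical helper name, and the cap on consecutive zero-byte returns (MAX_ZERO_RETRIES) is an arbitrary choice made here to keep the loop from spinning forever, not something the text prescribes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: give up after this many consecutive 0 returns. */
#define MAX_ZERO_RETRIES 8

/* Hypothetical helper: keep calling write() until count bytes have been
 * accepted. Returns the number of bytes written, or -1 with errno set. */
static ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t done = 0;
    int zero_returns = 0;

    assert(buf != NULL || count == 0);
    while (done < count) {
        ssize_t n = write(fd, p + done, count - done);

        if (n > 0) {
            done += (size_t)n;           /* full or partial transfer */
            zero_returns = 0;
        } else if (n == 0) {
            if (++zero_returns > MAX_ZERO_RETRIES)
                break;                   /* nothing accepted; stop retrying */
        } else if (errno != EINTR) {
            return -1;                   /* real error; errno says which one */
        }
        /* n < 0 with errno == EINTR: interrupted, reissue the call */
    }
    return (ssize_t)done;
}
```

With a loop like this, a driver that accepts one quantum per call, as scull does, still receives all of the data.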
Programs misbehave this way because some programmers are accustomed to seeing write calls that either fail or succeed completely, which is actually what happens most of the time and should be supported by devices as well. This limitation in the scull implementation could be fixed, but we didn’t want to complicate the code more than necessary.

The scull code for write deals with a single quantum at a time, like the read method does:

    ssize_t scull_write(struct file *filp, const char *buf, size_t count,
                    loff_t *f_pos)
    {
        Scull_Dev *dev = filp->private_data;
        Scull_Dev *dptr;
        int quantum = dev->quantum;
        int qset = dev->qset;
        int itemsize = quantum * qset;
        int item, s_pos, q_pos, rest;
        ssize_t ret = -ENOMEM; /* value used in "goto out" statements */

        if (down_interruptible(&dev->sem))
            return -ERESTARTSYS;

        /* find list item, qset index and offset in the quantum */
        item = (long)*f_pos / itemsize;
        rest = (long)*f_pos % itemsize;
        s_pos = rest / quantum; q_pos = rest % quantum;

        /* follow the list up to the right position */
        dptr = scull_follow(dev, item);
        if (!dptr->data) {
            dptr->data = kmalloc(qset * sizeof(char *), GFP_KERNEL);
            if (!dptr->data)
                goto out;
            memset(dptr->data, 0, qset * sizeof(char *));
        }
        if (!dptr->data[s_pos]) {
            dptr->data[s_pos] = kmalloc(quantum, GFP_KERNEL);
            if (!dptr->data[s_pos])
                goto out;
        }
        /* write only up to the end of this quantum */
        if (count > quantum - q_pos)
            count = quantum - q_pos;

        if (copy_from_user(dptr->data[s_pos]+q_pos, buf, count)) {
            ret = -EFAULT;
            goto out;
        }
        *f_pos += count;
        ret = count;

        /* update the size */
        if (dev->size < *f_pos)
            dev->size = *f_pos;

     out:
        up(&dev->sem);
        return ret;
    }

readv and writev

Unix systems have long supported two alternative system calls named readv and writev.