Linux Device Drivers, 2nd Edition (779877), page 90
The kernel can manage virtual addresses only at the level of page tables; therefore, the mapped area must be a multiple of PAGE_SIZE and must live in physical memory starting at an address that is a multiple of PAGE_SIZE. The kernel accommodates size granularity by making a region slightly bigger if its size isn’t a multiple of the page size.

These limits are not a big constraint for drivers, because the program accessing the device is device dependent anyway. It needs to know how to make sense of the memory region being mapped, so the PAGE_SIZE alignment is not a problem. A bigger constraint exists when ISA devices are used on some non-x86 platforms, because their hardware view of ISA may not be contiguous.
For example, some Alpha computers see ISA memory as a scattered set of 8-bit, 16-bit, or 32-bit items, with no direct mapping. In such cases, you can’t use mmap at all. The inability to perform direct mapping of ISA addresses to Alpha addresses is due to the incompatible data transfer specifications of the two systems. Whereas early Alpha processors could issue only 32-bit and 64-bit memory accesses, ISA can do only 8-bit and 16-bit transfers, and there’s no way to transparently map one protocol onto the other.

Chapter 13: mmap and DMA (page 383, 22 June 2001 16:42, http://openlib.org.ua)

There are sound advantages to using mmap when it’s feasible to do so. For instance, we have already looked at the X server, which transfers a lot of data to and from video memory; mapping the graphic display to user space dramatically improves the throughput, as opposed to an lseek/write implementation.
Another typical example is a program controlling a PCI device. Most PCI peripherals map their control registers to a memory address, and a demanding application might prefer to have direct access to the registers instead of repeatedly having to call ioctl to get its work done.

The mmap method is part of the file_operations structure and is invoked when the mmap system call is issued.
With mmap, the kernel performs a good deal of work before the actual method is invoked, and therefore the prototype of the method is quite different from that of the system call. This is unlike calls such as ioctl and poll, where the kernel does not do much before calling the method.

The system call is declared as follows (as described in the mmap(2) manual page):

    mmap (caddr_t addr, size_t len, int prot, int flags, int fd,
          off_t offset)

On the other hand, the file operation is declared as

    int (*mmap) (struct file *filp, struct vm_area_struct *vma);

The filp argument in the method is the same as that introduced in Chapter 3, while vma contains the information about the virtual address range that is used to access the device.
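To make the system-call side concrete, the following user-space sketch maps a small temporary file and reads it back through the mapping. It is only an illustration under stated assumptions: it uses the modern mmap(2) prototype (void * rather than caddr_t), a regular file instead of a device node, and a hypothetical helper name, read_through_mapping, that does not come from the book.

```c
#define _DEFAULT_SOURCE          /* for mkstemp() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the book): write `text` to a temporary
 * file, map the file with mmap(2), and copy the mapped bytes into `out`.
 * Returns 0 on success, -1 on any failure.
 */
int read_through_mapping(const char *text, char *out, size_t outlen)
{
    char path[] = "/tmp/mmap_sketch_XXXXXX";
    size_t len;
    void *map;
    int ret = -1;
    int fd;

    assert(text != NULL && out != NULL);
    len = strlen(text);
    if (len == 0 || len + 1 > outlen)
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                /* the file goes away once fd is closed */

    if (write(fd, text, len) != (ssize_t)len)
        goto out_close;

    /* addr == NULL lets the kernel pick the address; offset 0 is page aligned. */
    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    memcpy(out, map, len);       /* ordinary memory access, no read() call */
    out[len] = '\0';
    munmap(map, len);
    ret = 0;

out_close:
    close(fd);
    return ret;
}
```

The length passed to mmap here is not a multiple of the page size; the kernel rounds the mapping up, whereas the offset (0 in this sketch) has to be page aligned. Everything the kernel does between this call and the driver's mmap method is hidden from the caller.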
Much of the work has thus been done by the kernel; to implement mmap, the driver only has to build suitable page tables for the address range and, if necessary, replace vma->vm_ops with a new set of operations.

There are two ways of building the page tables: doing it all at once with a function called remap_page_range, or doing it a page at a time via the nopage VMA method. Both methods have their advantages. We’ll start with the “all at once” approach, which is simpler.
From there we will start adding the complications needed for a real-world implementation.

Using remap_page_range

The job of building new page tables to map a range of physical addresses is handled by remap_page_range, which has the following prototype:

    int remap_page_range(unsigned long virt_add, unsigned long phys_add,
                         unsigned long size, pgprot_t prot);

The value returned by the function is the usual 0 or a negative error code.
Let’s look at the exact meaning of the function’s arguments:

virt_add
    The user virtual address where remapping should begin. The function builds page tables for the virtual address range between virt_add and virt_add+size.

phys_add
    The physical address to which the virtual address should be mapped. The function affects physical addresses from phys_add to phys_add+size.

size
    The dimension, in bytes, of the area being remapped.

prot
    The “protection” requested for the new VMA. The driver can (and should) use the value found in vma->vm_page_prot.

The arguments to remap_page_range are fairly straightforward, and most of them are already provided to you in the VMA when your mmap method is called. The one complication has to do with caching: usually, references to device memory should not be cached by the processor.
Often the system BIOS will set things up properly, but it is also possible to disable caching of specific VMAs via the protection field. Unfortunately, disabling caching at this level is highly processor dependent. The curious reader may wish to look at the function pgprot_noncached from drivers/char/mem.c to see what’s involved. We won’t discuss the topic further here.

A Simple Implementation

If your driver needs to do a simple, linear mapping of device memory into a user address space, remap_page_range is almost all you really need to do the job. The following code comes from drivers/char/mem.c and shows how this task is performed in a typical module called simple (Simple Implementation Mapping Pages with Little Enthusiasm):

    #include <linux/mm.h>

    int simple_mmap(struct file *filp, struct vm_area_struct *vma)
    {
        /* vm_pgoff counts pages; convert it to a byte offset */
        unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;

        /* Offsets beyond RAM, or O_SYNC opens, are marked as I/O memory */
        if (offset >= __pa(high_memory) || (filp->f_flags & O_SYNC))
            vma->vm_flags |= VM_IO;
        /* Keep the area from being swapped out */
        vma->vm_flags |= VM_RESERVED;

        /* Build page tables for the whole area at once */
        if (remap_page_range(vma->vm_start, offset,
                             vma->vm_end-vma->vm_start, vma->vm_page_prot))
            return -EAGAIN;
        return 0;
    }

The /dev/mem code checks to see if the requested offset (stored in vma->vm_pgoff) is beyond physical memory; if so, the VM_IO VMA flag is set to mark the area as being I/O memory.
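The offset arithmetic behind that check can be tried out in user space. The sketch below is a model, not kernel code: struct sketch_vma, the sketch_ helpers, and the 4 KB page size are assumptions made for illustration, and the O_SYNC half of the test is left out.

```c
#include <assert.h>

/* Assumed for illustration: 4 KB pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the few vm_area_struct fields used by mmap. */
struct sketch_vma {
    unsigned long vm_start;   /* first user virtual address of the area */
    unsigned long vm_end;     /* first address past the end of the area */
    unsigned long vm_pgoff;   /* requested offset, counted in pages */
};

/* Byte offset requested by the caller: vm_pgoff << PAGE_SHIFT. */
unsigned long sketch_offset(const struct sketch_vma *vma)
{
    return vma->vm_pgoff << SKETCH_PAGE_SHIFT;
}

/* Size in bytes of the area, as it would be passed to remap_page_range. */
unsigned long sketch_size(const struct sketch_vma *vma)
{
    return vma->vm_end - vma->vm_start;
}

/* Nonzero if the offset lies at or beyond mem_top, the end of RAM. */
int sketch_is_io(const struct sketch_vma *vma, unsigned long mem_top)
{
    return sketch_offset(vma) >= mem_top;
}
```

With an assumed 128 MB of RAM (mem_top of 0x08000000), a page offset of 0xE0000 becomes byte offset 0xE0000000 and is classified as I/O memory, while a page offset of 0x1000 (16 MB) is not. Because vm_start and vm_end are always page aligned, the computed size is a whole number of pages.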
The VM_RESERVED flag is always set to keep the system from trying to swap this area out. Then it is just a matter of calling remap_page_range to create the necessary page tables.

Adding VMA Operations

As we have seen, the vm_area_struct structure contains a set of operations that may be applied to the VMA. Now we’ll look at providing those operations in a simple way; a more detailed example will follow later on.

Here, we will provide open and close operations for our VMA. These operations will be called anytime a process opens or closes the VMA; in particular, the open method will be invoked anytime a process forks and creates a new reference to the VMA. The open and close VMA methods are called in addition to the processing performed by the kernel, so they need not reimplement any of the work done there. They exist as a way for drivers to do any additional processing that they may require.

We’ll use these methods to increment the module usage count whenever the VMA is opened, and to decrement it when it’s closed.
In modern kernels, this work is not strictly necessary; the kernel will not call the driver’s release method as long as a VMA remains open, so the usage count will not drop to zero until all references to the VMA are closed. The 2.0 kernel, however, did not perform this tracking, so portable code will still want to be able to maintain the usage count.

So, we will override the default vma->vm_ops with operations that keep track of the usage count. The code is quite simple; a complete mmap implementation for a modularized /dev/mem looks like the following:

    void simple_vma_open(struct vm_area_struct *vma)
    { MOD_INC_USE_COUNT; }

    void simple_vma_close(struct vm_area_struct *vma)
    { MOD_DEC_USE_COUNT; }

    static struct vm_operations_struct simple_remap_vm_ops = {
        open:  simple_vma_open,
        close: simple_vma_close,
    };

    int simple_remap_mmap(struct file *filp, struct vm_area_struct *vma)
    {
        unsigned long offset = VMA_OFFSET(vma);

        if (offset >= __pa(high_memory) || (filp->f_flags & O_SYNC))
            vma->vm_flags |= VM_IO;
        vma->vm_flags |= VM_RESERVED;

        if (remap_page_range(vma->vm_start, offset, vma->vm_end-vma->vm_start,
                             vma->vm_page_prot))
            return -EAGAIN;

        /* Install our operations and count this first mapping */
        vma->vm_ops = &simple_remap_vm_ops;
        simple_vma_open(vma);
        return 0;
    }

This code relies on the fact that the kernel initializes to NULL the vm_ops field in the newly created area before calling f_op->mmap.
As a safety measure, should something change in future kernels, a driver can check the current value of the pointer before overwriting it; the listing above assigns vm_ops without such a check.

The strange VMA_OFFSET macro that appears in this code is used to hide a difference in the vma structure across kernel versions. Since the offset is a number of pages in 2.4 and a number of bytes in 2.2 and earlier kernels, <sysdep.h> declares the macro to make the difference transparent (and the result is expressed in bytes).

Mapping Memory with nopage

Although remap_page_range works well for many, if not most, driver mmap implementations, sometimes it is necessary to be a little more flexible.
In such situations, an implementation using the nopage VMA method may be called for.

The nopage method, remember, has the following prototype:

    struct page *(*nopage)(struct vm_area_struct *vma,
                           unsigned long address, int write_access);

When a user process attempts to access a page in a VMA that is not present in memory, the associated nopage function is called. The address parameter will contain the virtual address that caused the fault, rounded down to the beginning of the page.
The nopage function must locate and return the struct page pointer that refers to the page the user wanted. This function must also take care to increment the usage count for the page it returns by calling the get_page macro:

    get_page(struct page *pageptr);

This step is necessary to keep the reference counts correct on the mapped pages. The kernel maintains this count for every page; when the count goes to zero, the kernel knows that the page may be placed on the free list.
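The counting rule can be pictured with a toy user-space model. This is not how the kernel implements struct page or get_page; struct sketch_page and both sketch_ functions are hypothetical names introduced only to show why every reference handed out must be matched by an increment.

```c
#include <assert.h>

/* Hypothetical stand-in for struct page: only the usage count is modeled. */
struct sketch_page {
    int count;          /* number of references currently held */
    int on_free_list;   /* set once the last reference is dropped */
};

/* Model of get_page: take one more reference to the page. */
void sketch_get_page(struct sketch_page *pg)
{
    pg->count++;
}

/* Model of dropping a reference; the page is freed when none remain. */
void sketch_put_page(struct sketch_page *pg)
{
    assert(pg->count > 0);
    if (--pg->count == 0)
        pg->on_free_list = 1;
}
```

In this model, a page that starts with one reference held by its owner and gains a second one when it is handed to a mapping survives the first release and is freed only on the second. Skipping the increment would free the page while its owner still believes it holds a valid reference.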
When a VMA is unmapped, the kernel will decrement the usage count for every page in the area. If your driver does not increment the count when adding a page to the area, the usage count will become zero prematurely and the integrity of the system will be compromised.

One situation in which the nopage approach is useful can be brought about by the mremap system call, which is used by applications to change the bounding addresses of a mapped region. If the driver wants to be able to deal with mremap, the previous implementation won’t work correctly, because there’s no way for the driver to know that the mapped region has changed.

The Linux implementation of mremap doesn’t notify the driver of changes in the mapped area.
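For readers who have not used mremap, the following user-space sketch shows what an application does when it resizes a mapping. It makes several assumptions: it runs on Linux with glibc, it uses anonymous memory rather than a device mapping (so no driver is involved), and the helper name grow_mapping_keeps_data is hypothetical.

```c
#define _GNU_SOURCE          /* exposes mremap() and MREMAP_MAYMOVE in glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the book): map `old_pages` anonymous pages,
 * tag the first byte, resize the mapping to `new_pages` with mremap(2), and
 * report whether the tag survived.
 * Returns 1 if it did, 0 if it did not, -1 on failure.
 */
int grow_mapping_keeps_data(size_t old_pages, size_t new_pages)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t old_len, new_len;
    unsigned char *p, *q;
    int ok;

    assert(ps > 0);
    if (old_pages == 0 || new_pages == 0)
        return -1;
    old_len = old_pages * (size_t)ps;
    new_len = new_pages * (size_t)ps;

    p = mmap(NULL, old_len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    p[0] = 0xA5;

    /* MREMAP_MAYMOVE lets the kernel relocate the region if it must. */
    q = mremap(p, old_len, new_len, MREMAP_MAYMOVE);
    if (q == MAP_FAILED) {
        munmap(p, old_len);
        return -1;
    }

    q[new_len - 1] = 0x5A;   /* the last byte of the resized region is usable */
    ok = (q[0] == 0xA5 && q[new_len - 1] == 0x5A);
    munmap(q, new_len);
    return ok;
}
```

The caller sees only a new (possibly moved) address and a new length. When the region belongs to a device mapping, the pages of any enlarged area have to come from somewhere, which is the gap a fault-driven nopage method is meant to fill.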