Thus it is just a matter of stepping through them all. Note, however, that kmap is used to get a kernel virtual address for each page; in this way, the function will work even if the user buffer is in high memory.

Some quick tests copying data show that a copy to or from an sbullr device takes roughly two-thirds the system time as the same copy to the block sbull device.
The savings is gained by avoiding the extra copy through the buffer cache. Note that if the same data is read several times over, that savings will evaporate—especially for a real hardware device. Raw device access is often not the best approach, but for some applications it can be a major improvement.

Although kiobufs remain controversial in the kernel development community, there is interest in using them in a wider range of contexts. There is, for example, a patch that implements Unix pipes with kiobufs—data is copied directly from one process’s address space to the other with no buffering in the kernel at all.
A patch also exists that makes it easy to use a kiobuf to map kernel virtual memory into a process’s address space, thus eliminating the need for a nopage implementation as shown earlier.

Direct Memory Access and Bus Mastering

Direct memory access, or DMA, is the advanced topic that completes our overview of memory issues. DMA is the hardware mechanism that allows peripheral components to transfer their I/O data directly to and from main memory without the need for the system processor to be involved in the transfer. Use of this mechanism can greatly increase throughput to and from a device, because a great deal of computational overhead is eliminated.

To exploit the DMA capabilities of its hardware, the device driver needs to be able to correctly set up the DMA transfer and synchronize with the hardware.
Unfortunately, because of its hardware nature, DMA is very system dependent. Each architecture has its own techniques to manage DMA transfers, and the programming interface is different for each. The kernel can’t offer a unified interface, either, because a driver can’t abstract too much from the underlying hardware mechanisms. Some steps have been made in that direction, however, in recent kernels.

This chapter concentrates mainly on the PCI bus, since it is currently the most popular peripheral bus available. Many of the concepts are more widely applicable, though. We also touch on how some other buses, such as ISA and SBus, handle DMA.

Overview of a DMA Data Transfer

Before introducing the programming details, let’s review how a DMA transfer takes place, considering only input transfers to simplify the discussion.

Data transfer can be triggered in two ways: either the software asks for data (via a function such as read) or the hardware asynchronously pushes data to the system. In the first case, the steps involved can be summarized as follows:
1. When a process calls read, the driver method allocates a DMA buffer and instructs the hardware to transfer its data. The process is put to sleep.
2. The hardware writes data to the DMA buffer and raises an interrupt when it’s done.
3. The interrupt handler gets the input data, acknowledges the interrupt, and awakens the process, which is now able to read data.
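As a rough illustration of these steps, a read method might look like the following minimal sketch. The dad ("DMA acquisition device") structure and the dad_start_input helper are hypothetical placeholders for device-specific code; only the wait-queue logic reflects the steps above.

    #include <linux/fs.h>        /* struct file */
    #include <linux/sched.h>     /* wait queues */
    #include <asm/uaccess.h>     /* copy_to_user */

    ssize_t dad_read(struct file *filp, char *buf, size_t count, loff_t *ppos)
    {
        struct dad_device *dev = filp->private_data;  /* hypothetical */

        dev->dma_done = 0;
        dad_start_input(dev, count);  /* step 1: program and start the transfer */

        /* still step 1: the process is put to sleep */
        if (wait_event_interruptible(dev->wait, dev->dma_done))
            return -ERESTARTSYS;

        /* step 3 has run: the DMA buffer now holds the device's data */
        if (copy_to_user(buf, dev->dma_buffer, count))
            return -EFAULT;
        return count;
    }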
The second case comes about when DMA is used asynchronously. This happens, for example, with data acquisition devices that go on pushing data even if nobody is reading them. In this case, the driver should maintain a buffer so that a subsequent read call will return all the accumulated data to user space. The steps involved in this kind of transfer are slightly different:

1. The hardware raises an interrupt to announce that new data has arrived.
2. The interrupt handler allocates a buffer and tells the hardware where to transfer its data.
3. The peripheral device writes the data to the buffer and raises another interrupt when it’s done.
4. The handler dispatches the new data, wakes any relevant process, and takes care of housekeeping.
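The handler side of either flow can be sketched as follows, using the 2.4 interrupt handler prototype. As before, the device structure and dad_ack_interrupt are hypothetical stand-ins for hardware-specific code.

    void dad_interrupt(int irq, void *dev_id, struct pt_regs *regs)
    {
        struct dad_device *dev = dev_id;

        dad_ack_interrupt(dev);  /* hypothetical: acknowledge the device */

        /* the transfer announced earlier is complete; record that and
           wake any process sleeping in read */
        dev->dma_done = 1;
        wake_up_interruptible(&dev->wait);
    }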
A variant of the asynchronous approach is often seen with network cards. These cards often expect to see a circular buffer (often called a DMA ring buffer) established in memory shared with the processor; each incoming packet is placed in the next available buffer in the ring, and an interrupt is signaled. The driver then passes the network packets to the rest of the kernel, and places a new DMA buffer in the ring.
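Such a ring is usually an array of buffer descriptors shared with the card. The layout below is purely illustrative (every controller defines its own descriptor format), but it shows the idea:

    #include <linux/types.h>   /* u32, u16 */

    #define DAD_RING_SIZE 16

    struct dad_rx_desc {
        u32 bus_addr;   /* bus address of the packet buffer, set by the driver */
        u16 length;     /* filled in by the card when a packet arrives */
        u16 status;     /* ownership/status bits, toggled by card and driver */
    };

    struct dad_rx_ring {
        struct dad_rx_desc desc[DAD_RING_SIZE];
        int next;       /* next descriptor the driver expects to reap */
    };

On each interrupt, the driver reaps completed descriptors starting at next, hands the packets to the network layer, installs fresh buffers, and returns the descriptors to the card.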
The processing steps in all of these cases emphasize that efficient DMA handling relies on interrupt reporting. While it is possible to implement DMA with a polling driver, it wouldn’t make sense, because a polling driver would waste the performance benefits that DMA offers over the easier processor-driven I/O.

Another relevant item introduced here is the DMA buffer. To exploit direct memory access, the device driver must be able to allocate one or more special buffers, suited to DMA. Note that many drivers allocate their buffers at initialization time and use them until shutdown—the word allocate in the previous lists therefore means "get hold of a previously allocated buffer."

Allocating the DMA Buffer

This section covers the allocation of DMA buffers at a low level; we will introduce a higher-level interface shortly, but it is still a good idea to understand the material presented here.

The main problem with the DMA buffer is that when it is bigger than one page, it must occupy contiguous pages in physical memory because the device transfers data using the ISA or PCI system bus, both of which carry physical addresses.
It’s interesting to note that this constraint doesn’t apply to the SBus (see "SBus" in Chapter 15), which uses virtual addresses on the peripheral bus. Some architectures can also use virtual addresses on the PCI bus, but a portable driver cannot count on that capability.

Although DMA buffers can be allocated either at system boot or at runtime, modules can only allocate their buffers at runtime. Chapter 7 introduced these techniques: "Boot-Time Allocation" talked about allocation at system boot, while "The Real Story of kmalloc" and "get_free_page and Friends" described allocation at runtime.
Driver writers must take care to allocate the right kind of memory when it will be used for DMA operations—not all memory zones are suitable. In particular, high memory will not work for DMA on most systems—the peripherals simply cannot work with addresses that high.

Most devices on modern buses can handle 32-bit addresses, meaning that normal memory allocations will work just fine for them. Some PCI devices, however, fail to implement the full PCI standard and cannot work with 32-bit addresses. And ISA devices, of course, are limited to 24-bit addresses only.

For devices with this kind of limitation, memory should be allocated from the DMA zone by adding the GFP_DMA flag to the kmalloc or get_free_pages call. When this flag is present, only memory that can be addressed with 24 bits will be allocated.
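For example, a minimal sketch (the 4 KB size and the dad naming are arbitrary):

    #include <linux/slab.h>   /* kmalloc */

    /* Get a buffer that even a 24-bit-limited (ISA) device can address;
       GFP_DMA confines the allocation to the DMA zone. */
    static void *dad_alloc_dma_buffer(void)
    {
        return kmalloc(4096, GFP_KERNEL | GFP_DMA);  /* kfree when done */
    }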
Do-it-yourself allocation

We have seen how get_free_pages (and therefore kmalloc) can’t return more than 128 KB (or, more generally, 32 pages) of consecutive memory space. But the request is prone to fail even when the allocated buffer is less than 128 KB, because system memory becomes fragmented over time.*

When the kernel cannot return the requested amount of memory, or when you need more than 128 KB (a common requirement for PCI frame grabbers, for example), an alternative to returning -ENOMEM is to allocate memory at boot time or reserve the top of physical RAM for your buffer. We described allocation at boot time in "Boot-Time Allocation" in Chapter 7, but it is not available to modules. Reserving the top of RAM is accomplished by passing a mem= argument to the kernel at boot time.
For example, if you have 32 MB, the argument mem=31M keeps the kernel from using the top megabyte. Your module could later use the following code to gain access to such memory:

    dmabuf = ioremap(0x1F00000 /* 31M */, 0x100000 /* 1M */);
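One caveat worth noting: the pointer returned by ioremap refers to I/O memory as far as the kernel is concerned, so fully portable code should go through the I/O memory functions rather than dereference it directly. A minimal sketch (the function name is hypothetical):

    #include <asm/io.h>   /* ioremap, iounmap, memcpy_fromio */

    /* Copy count bytes from the reserved region into an ordinary kernel
       buffer; dmabuf is the cookie returned by ioremap above. */
    static void dad_fetch(void *to, void *dmabuf, unsigned long count)
    {
        memcpy_fromio(to, dmabuf, count);
    }

The mapping itself should be released with iounmap(dmabuf) when the module is done with the region.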
Actually, there is another way to allocate DMA space: perform aggressive allocation until you are able to get enough consecutive pages to make a buffer. We strongly discourage this allocation technique if there’s any other way to achieve your goal. Aggressive allocation results in high machine load, and possibly in a system lockup if your aggressiveness isn’t correctly tuned. On the other hand, sometimes there is no other way available.

In practice, the code invokes kmalloc(GFP_ATOMIC) until the call fails; it then waits until the kernel frees some pages, and then allocates everything once again.
If you keep an eye on the pool of allocated pages, sooner or later you’ll find that your DMA buffer of consecutive pages has appeared; at this point you can release every page but the selected buffer. This kind of behavior is rather risky, though, because it may lead to a deadlock. We suggest using a kernel timer to release every page in case allocation doesn’t succeed before a timeout expires.

We’re not going to show the code here, but you’ll find it in misc-modules/allocator.c; the code is thoroughly commented and designed to be called by other modules.
Unlike every other source accompanying this book, the allocator is covered by the GPL. The reason we decided to put the source under the GPL is that it is neither particularly beautiful nor particularly clever, and if someone is going to use it, we want to be sure that the source is released with the module.

* The word fragmentation is usually applied to disks, to express the idea that files are not stored consecutively on the magnetic medium. The same concept applies to memory, where each virtual address space gets scattered throughout physical RAM, and it becomes difficult to retrieve consecutive free pages when a DMA buffer is requested.

Bus Addresses

A device driver using DMA has to talk to hardware connected to the interface bus, which uses physical addresses, whereas program code uses virtual addresses.

As a matter of fact, the situation is slightly more complicated than that.
DMA-based hardware uses bus, rather than physical, addresses. Although ISA and PCI addresses are simply physical addresses on the PC, this is not true for every platform. Sometimes the interface bus is connected through bridge circuitry that maps I/O addresses to different physical addresses.
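As a concrete illustration, the 2.4 kernel provides virt_to_bus (declared in <asm/io.h>) to turn a kernel virtual address obtained from kmalloc or get_free_pages into a bus address; the sketch below programs a hypothetical device with it. On the PC the two address spaces happen to coincide, but portable code should still perform the conversion.

    #include <asm/io.h>   /* virt_to_bus */

    /* Hand the device the address it must use on the bus; the
       dad_set_dma_addr helper is a hypothetical hardware accessor. */
    static void dad_program_dma(struct dad_device *dev)
    {
        unsigned long bus_addr = virt_to_bus(dev->dma_buffer);
        dad_set_dma_addr(dev, bus_addr);
    }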