Real-Time Systems. Design Principles for Distributed Embedded Applications. Hermann Kopetz. Second Edition (811374), page 59
(Silicon has a density of 2.33 g/cm³ and a specific heat of 0.7 J/g °C.) The execution of the program of the above hypothetical example, generating a heat of 1 J, would thus lead to a temperature rise of about 1,200 °C and result in a hot spot on the die. In reality, such a temperature rise will not occur, since any temperature difference between two bodies forces a heat flow from the warmer body to the colder body that reduces the temperature difference.
In a VLSI device, the heat flow from a hot spot to the chip’s environment takes place in two steps.
In a first step, the heat of the hot spot flows to the die and increases the temperature of the entire die. In a second step, the heat flows from the die through the package to the environment. Taking the above example further, let us assume that the complete die is thermally isolated. Then the execution of the above program, which dissipates 1 J, will lead to an increase of the die temperature by about 12 °C.
The heat flow from the hot spot to the entire die and the exact temperature profile of the die can be calculated by solving the partial differential equation for heat transfer.
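Both temperature figures quoted above are consistent with the lumped relation Q = m c ΔT for a thermally isolated body. As a rough cross-check, the following Python sketch inverts that relation to see which volume of silicon each quoted temperature rise implies. The function name and the backwards calculation are illustrative assumptions; the actual hot-spot and die dimensions are not stated in this passage.

```python
# Material constants for silicon as quoted in the text.
DENSITY_G_PER_CM3 = 2.33
SPECIFIC_HEAT_J_PER_G_C = 0.7


def implied_volume_mm3(heat_j: float, delta_t_c: float) -> float:
    """Volume of silicon (mm^3) that warms by delta_t_c when it absorbs heat_j.

    Assumes a thermally isolated, uniformly heated body: Q = m * c * dT.
    """
    mass_g = heat_j / (SPECIFIC_HEAT_J_PER_G_C * delta_t_c)
    volume_cm3 = mass_g / DENSITY_G_PER_CM3
    return volume_cm3 * 1000.0  # 1 cm^3 = 1000 mm^3


# 1 J of heat, with the two temperature rises quoted in the text.
hot_spot_mm3 = implied_volume_mm3(heat_j=1.0, delta_t_c=1200.0)
die_mm3 = implied_volume_mm3(heat_j=1.0, delta_t_c=12.0)
print(f"hot spot: {hot_spot_mm3:.2f} mm^3, die: {die_mm3:.1f} mm^3")
```

Under these assumptions the hot spot corresponds to roughly half a cubic millimetre of silicon, and the whole die to about one hundred times that volume.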
In order to get some gross insight into this phenomenon of heat transfer from the hot spot to the die, we look at a simple heat-flow model where a bar of cross section A and length l connects a heat source with temperature $T_{source}$ to a heat sink with temperature $T_{sink}$. The stationary heat flow $P_{heat}$ across the bar between these two bodies can then be expressed by
$$P_{heat} = H_Y \, A \, (T_{source} - T_{sink}) / l$$
where $H_Y$ is the heat conductivity of the bar.
Example: Let us assume a bar of silicon with a cross section of 1 mm² and a length of 10 mm is linking the heat source with the heat sink. The thermal conductivity $H_Y$ of silicon is 150 W/m °C.
If the temperature difference between the heat source (the hot spot) and the heat sink (the rest of the die) is 33 °C, then a steady heat flow of about 500 mW will develop (this is the dynamic power dissipated by the execution of the program in the above example). If the bar has a cross section of 0.25 mm², then the temperature difference for transporting 500 mW must be 132 °C. In reality, the temperature difference will be smaller, since the hot spot is much better embedded in the substrate of the die than expressed by this simple heat-flow model.
What can we learn from this simple example? Hot spots will only develop if a substantial amount of power is dissipated in a very small area.
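The numbers in the silicon-bar example can be reproduced with a few lines of Python. This is only a sketch of the one-dimensional steady-state bar model described above; the function and constant names are hypothetical, and the units are converted to SI before use.

```python
def heat_flow_w(conductivity_w_per_m_c, area_m2, length_m, delta_t_c):
    """Stationary heat flow P = H * A * dT / l through a uniform bar."""
    return conductivity_w_per_m_c * area_m2 * delta_t_c / length_m


def required_delta_t_c(conductivity_w_per_m_c, area_m2, length_m, power_w):
    """Temperature difference needed to carry power_w through the bar."""
    return power_w * length_m / (conductivity_w_per_m_c * area_m2)


H_SILICON = 150.0   # W/(m*C), value from the text
LENGTH_M = 10e-3    # bar length of 10 mm
MM2 = 1e-6          # one square millimetre expressed in m^2

# 1 mm^2 cross section with a 33 C difference between hot spot and die.
p = heat_flow_w(H_SILICON, 1.0 * MM2, LENGTH_M, 33.0)

# 0.25 mm^2 cross section that has to transport 500 mW.
dt = required_delta_t_c(H_SILICON, 0.25 * MM2, LENGTH_M, 0.5)

print(f"P = {p * 1000:.0f} mW, dT = {dt:.0f} C")
```

The first case gives 495 mW, i.e., about 500 mW. The second case gives about 133 °C for exactly 500 mW; the 132 °C in the text corresponds to four times the rounded value of 33 °C.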
For example, let us assume that a temporary low-impedance path between the power rails of a transistor, a latch-up (which can be corrected by a power cycle), develops in a small part of a circuit due to a fault caused by a neutron from ambient cosmic radiation. The current that dissipates in this path will result in a hot spot that can physically destroy the circuit. It is therefore expedient to monitor the current drawn by a device and switch off the power quickly (e.g., within less than 10 ms) if an unexpected current surge is observed.
The temperature difference that develops between the die and the environment is determined by the power dissipation in the die and the thermal conductivity $H_{package}$ of the package.
This thermal conductivity $H_{package}$ of a typical chip package is between 0.1 and 1 W/°C, depending on package geometry, size, and material. Plastic packages have a significantly lower thermal conductivity than ceramic packages. The temperature difference $\Delta T$ between the environment and the die can be calculated by
$$\Delta T = P_{die} / H_{package}$$
where $P_{die}$ is the total power dissipated in the die. If the heat flow through the package is more than 10 W, then a fan should cool the package.
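To get a feeling for the 10 W guideline, the package relation can be evaluated over the quoted conductivity range. The sketch below is a minimal illustration; the function name is hypothetical and the intermediate value of 0.5 W/°C is an assumption added only to show the trend.

```python
def die_temperature_rise_c(power_w: float, h_package_w_per_c: float) -> float:
    """Temperature difference between die and environment: dT = P_die / H_package."""
    return power_w / h_package_w_per_c


POWER_W = 10.0  # heat flow above which the text recommends a fan

# 0.1 and 1.0 W/C are the bounds quoted in the text; 0.5 is an assumed midpoint.
for h_package in (0.1, 0.5, 1.0):
    rise = die_temperature_rise_c(POWER_W, h_package)
    print(f"H_package = {h_package:.1f} W/C -> dT = {rise:.0f} C")
```

At 10 W, even the best package in the quoted range lets the die settle about 10 °C above the environment, and the worst about 100 °C above it.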
The introduction of fans has a number of disadvantages, such as the additional energy required to operate the fan, the noise of the fan, and the reliability of the mechanical fan. If a fan fails, overheating might destroy the circuit.
A high substrate temperature has a negative effect on the reliability of a device and can cause transient and permanent failures. High substrate temperatures change the timing parameters of the transistors and the circuits. If the specified timing patterns are violated, transient and data-dependent device failures will occur.
The Arrhenius equation gives a gross estimate for the acceleration of the failure rate caused by an increase of the temperature of the silicon substrate:
$$AF = \exp\left[\frac{E_a}{k}\left(\frac{1}{T_{normal}} - \frac{1}{T_{high}}\right)\right]$$
where AF is the acceleration factor of the failure rate, k is the Boltzmann constant ($8.617 \times 10^{-5}$ eV/K), $T_{normal}$ is the normal substrate temperature (expressed in Kelvin), $T_{high}$ is the high substrate temperature (expressed in Kelvin), and $E_a$ is a failure-mechanism-specific activation energy (see Table 8.1).
From this equation we can deduce that the failure rate of a device increases exponentially with the increase of the substrate temperature of the device.
Example: If the temperature of the substrate of a device increases from 50 °C (i.e., 323 K) to 100 °C (i.e., 373 K), and a failure mechanism with an activation energy of 0.5 eV is assumed, then the failure rate of the device will increase by a factor of about 11.
Table 8.1 Activation energy for different failure mechanisms (Adapted from [Vig10])
Failure mechanism                      Activation energy E_a (eV)
Oxide defects, bulk silicon defects    0.3–0.5
Corrosion                              0.45
Assembly defects                       0.5–0.7
Electromigration                       0.6–0.9
Mask defects/photoresist defects       0.7
Contamination                          1.0
Charge injection                       1.3
8.2 Hardware Power Reduction Techniques
8.2.1 Device Scaling
The most effective way to reduce the power consumption of CMOS devices is the scaling of the device parameters, i.e., making the transistors smaller [Fra01]. Table 8.2 depicts the effect of ideal scaling on the different parameters of a CMOS device.
The scaling factor a from one micro-electronic generation to the next is normally $1/\sqrt{2}$, i.e., about 0.7, such that the area of a scaled version of a design is reduced by a factor of 2, the power requirement is reduced by a factor of 2, the speed is increased by a factor of $\sqrt{2}$, and the energy needed for the execution of an instruction (the energy performance) is reduced by a factor of $2\sqrt{2}$. Note from Table 8.2 that ideal device scaling has no effect on the power density that is dissipated in a given area of the die.
It follows that ideal scaling will not result in a temperature increase of the die.
Example: Let us assume that an IP-core scales down ideally by a factor of $1/\sqrt{2}$ every 2 years. At the start, the IP-core has a size of 16 mm² and executes 125 MIPS, consuming a power of 16 W. Eight years later, after four generations of shrinking, this IP-core has a size of 1 mm², executes 500 MIPS, and consumes a power of 1 W. The energy needed for the execution of an instruction has been reduced by a factor of 64, while the time performance has increased by a factor of four.
Device scaling has made it possible to place up to one billion transistors on a single die.
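Returning to the IP-core example, the four-generation figures can be checked by applying the ideal scaling rules repeatedly. The Python sketch below assumes that area and power dissipation scale with a² and time performance with 1/a, as stated in the text; the function and variable names are hypothetical.

```python
import math

A = 1 / math.sqrt(2)  # ideal scaling factor per generation, about 0.7


def scale_ip_core(area_mm2, mips, power_w, generations):
    """Apply ideal device scaling for the given number of generations."""
    area = area_mm2 * A ** (2 * generations)   # area scales with a^2
    power = power_w * A ** (2 * generations)   # power dissipation scales with a^2
    speed = mips / A ** generations            # time performance scales with 1/a
    return area, speed, power


def energy_per_instruction_nj(power_w, mips):
    """Energy per instruction in nJ: power divided by instructions per second."""
    return power_w / (mips * 1e6) * 1e9


area, mips, power = scale_ip_core(16.0, 125.0, 16.0, generations=4)
e_start = energy_per_instruction_nj(16.0, 125.0)
e_end = energy_per_instruction_nj(power, mips)

print(f"area = {area:.2f} mm^2, speed = {mips:.0f} MIPS, power = {power:.2f} W")
print(f"energy per instruction: {e_start:.0f} nJ -> {e_end:.0f} nJ "
      f"(factor {e_start / e_end:.0f})")
```

The power density, power divided by area, stays at 1 W/mm² before and after scaling, which is why ideal scaling does not raise the die temperature.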
It is thus within the capabilities of the semiconductor industry to place a complete system, including processor, memory, and input/output circuitry, on a single die, resulting in a system-on-chip (SoC). Spatial and temporal closeness of subsystems that are involved in a computation leads to a significant improvement of the energy efficiency. Spatial locality reduces the effective capacitances of the switching actions, which implies lower energy needs and faster operations. Temporal locality reduces the number of cache misses.
If subsystems residing on different chips are integrated on a single die, the significant amount of energy needed to exchange data and control among chips can be saved.
Table 8.2 The effect of ideal device scaling on device parameters
Physical parameter                               Scaling factor
Channel length, oxide thickness, wiring width    a
Electric field in device                         1
Voltage                                          a
Capacitance                                      a
RC delay                                         a
Power dissipation                                a²
Power density                                    1
Time performance in MIPS                         1/a
Energy performance                               1/a³
Example: According to Intel [Int09], the 1996 design of the first teraflop supercomputer, consisting of 10,000 Pentium Pro processors, operated with an energy efficiency of 2 MegaFlops/J or 500 nJ per instruction. Ten years later, in 2006, a teraflop research chip of Intel containing 80 IP-cores on a single die connected by a Network-on-Chip achieved an energy efficiency of 16,000 MegaFlops/J or 62 pJ/instruction.
This is an increase in the energy performance by a factor of 8,000 within 10 years. If we assume that five generations of scaling take place in 10 years, the increase in the energy performance in one generation is not only a factor of $2\sqrt{2}$, the value stipulated by ideal scaling, but a factor of more than 4. This additional improvement is caused by the integration of all subsystems on a single die.
Over the last 25 years, device scaling has also had a very beneficial effect on device reliability.
The failure rates of transistors have been reduced even faster than the increase in the number of transistors on a die, resulting in an increase in chip reliability despite the fact that many more transistors are contained in a scaled chip. The failure rate with respect to permanent failures of industrial state-of-the-art chips is significantly lower than 100 FIT [Pau98].
Scaling cannot continue indefinitely because there are limits due to the discrete structure of matter and quantum mechanical effects, such as electron tunneling. The reduction in the number of dopants in a transistor increases the statistical variations. The thermal energy of noise limits the reduction of the supply voltage.
If the supply voltage is higher than stipulated by ideal scaling, then scaling will lead to an increased thermal stress. In submicron technologies, we have reached the point where these effects cannot be neglected any more and lead to an increase in the transient failure rates of chips. Although the permanent failure rate per transistor may still decrease, this decrease no longer compensates for the increase in the number of transistors on the die, causing an increase of the chip failure rate. The International Technology Roadmap for Semiconductors 2009 [ITR09, p. 15] summarizes these challenges in a single sentence: "The ITRS is entering a new era as the industry begins to address the theoretical limits of CMOS scaling."
8.2.2 Low-Power Hardware Design
Over the past few years, a number of hardware-design techniques have been developed that help to reduce the power needs of VLSI circuits [Kea07]. In this section we give a very short overview of some of these techniques.
Clock Gating.