Volume 1 Basic Architecture (794100), page 17
SSE2 integer instructions extend IA-32 SIMD operations by adding new 128-bit SIMD integer operations and by expanding existing 64-bit SIMD integer operations to 128-bit XMM capability. SSE2 instructions also provide new cache control and memory ordering operations.

SSE3 extensions were introduced with the Pentium 4 processor supporting Hyper-Threading Technology (built on 90 nm process technology). SSE3 offers 13 instructions that accelerate performance of Streaming SIMD Extensions technology, Streaming SIMD Extensions 2 technology, and x87-FP math capabilities.

SSSE3 extensions were introduced with the Intel Xeon processor 5100 series and Intel Core 2 processor family.
SSSE3 offers 32 instructions to accelerate processing of SIMD integer data.

Intel 64 architecture allows four generations of 128-bit SIMD extensions to access up to 16 XMM registers. IA-32 architecture provides 8 XMM registers.

See also:
• Section 5.4, “MMX™ Instructions,” and Chapter 9, “Programming with Intel® MMX™ Technology”
• Section 5.5, “SSE Instructions,” and Chapter 10, “Programming with Streaming SIMD Extensions (SSE)”
• Section 5.6, “SSE2 Instructions,” and Chapter 11, “Programming with Streaming SIMD Extensions 2 (SSE2)”
• Section 5.7, “SSE3 Instructions,” and Chapter 12, “Programming with SSE3 and Supplemental SSE3”

INTEL® 64 AND IA-32 ARCHITECTURES

SIMD Extension     Register Layout   Data Type
MMX Technology     MMX Registers     8 Packed Byte Integers
                                     4 Packed Word Integers
                                     2 Packed Doubleword Integers
                                     Quadword
SSE                MMX Registers     8 Packed Byte Integers
                                     4 Packed Word Integers
                                     2 Packed Doubleword Integers
                                     Quadword
                   XMM Registers     4 Packed Single-Precision Floating-Point Values
SSE2/SSE3/SSSE3    MMX Registers     2 Packed Doubleword Integers
                                     Quadword
                   XMM Registers     2 Packed Double-Precision Floating-Point Values
                                     16 Packed Byte Integers
                                     8 Packed Word Integers
                                     4 Packed Doubleword Integers
                                     2 Quadword Integers
                                     Double Quadword

Figure 2-4.
SIMD Extensions, Register Layouts, and Data Types

2.2.5 Hyper-Threading Technology

Hyper-Threading (HT) Technology was developed to improve the performance of IA-32 processors when executing multi-threaded operating system and application code or single-threaded applications under multi-tasking environments. The technology enables a single physical processor to execute two or more separate code streams (threads) concurrently using shared execution resources.

HT Technology is one form of hardware multi-threading capability in IA-32 processor families. It differs from multi-processor capability using separate physically distinct packages with each physical processor package mated with a physical socket. HT Technology provides hardware multi-threading capability with a single physical package by using shared execution resources in a processor core.

Architecturally, an IA-32 processor that supports HT Technology consists of two or more logical processors, each of which has its own IA-32 architectural state.
Each logical processor consists of a full set of IA-32 data registers, segment registers, control registers, debug registers, and most of the MSRs. Each also has its own advanced programmable interrupt controller (APIC).

Figure 2-5 shows a comparison of a processor that supports HT Technology (implemented with two logical processors) and a traditional dual processor system.

[Figure, left panel, “IA-32 Processor Supporting Hyper-Threading Technology”: one IA-32 processor with two architectural states (AS) on a single processor core. Two logical processors that share a single core.]
[Figure, right panel, “Traditional Multiple Processor (MP) System”: two IA-32 processors, each with one AS and its own processor core. Each processor is a separate physical package.]
AS = IA-32 Architectural State

Figure 2-5.
Comparison of an IA-32 Processor Supporting Hyper-Threading Technology and a Traditional Dual Processor System

Unlike a traditional MP system configuration that uses two or more separate physical IA-32 processors, the logical processors in an IA-32 processor supporting HT Technology share the core resources of the physical processor. This includes the execution engine and the system bus interface. After power up and initialization, each logical processor can be independently directed to execute a specified thread, interrupted, or halted.

HT Technology leverages the process and thread-level parallelism found in contemporary operating systems and high-performance applications by providing two or more logical processors on a single chip.
This configuration allows two or more threads¹ to be executed simultaneously on each physical processor. Each logical processor executes instructions from an application thread using the resources in the processor core. The core executes these threads concurrently, using out-of-order instruction scheduling to maximize the use of execution units during each clock cycle.

2.2.5.1 Some Implementation Notes

All HT Technology configurations require:
• A processor that supports HT Technology
• A chipset and BIOS that utilize the technology
• Operating system optimizations

See http://www.intel.com/products/ht/hyperthreading_more.htm for information.

At the firmware (BIOS) level, the basic procedures to initialize the logical processors in a processor supporting HT Technology are the same as those for a traditional DP or MP platform. The mechanisms that are described in the Multiprocessor Specification, Version 1.4 to power-up and initialize physical processors in an MP system also apply to logical processors in a processor that supports HT Technology.

An operating system designed to run on a traditional DP or MP platform may use the CPUID instruction to determine whether the hardware multi-threading feature is present and how many logical processors the physical processor provides.

Although existing operating system and application code should run correctly on a processor that supports HT Technology, some code modifications are recommended to get the optimum benefit.
These modifications are discussed in Chapter 7, “Multiple-Processor Management,” Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.

1. In the remainder of this document, the term “thread” will be used as a general term for the terms “process” and “thread.”

2.2.6 Multi-Core Technology

Multi-core technology is another form of hardware multi-threading capability in IA-32 processor families. Multi-core technology enhances hardware multi-threading capability by providing two or more execution cores in a physical package.

The Intel Pentium processor Extreme Edition is the first member in the IA-32 processor family to introduce multi-core technology. The processor provides hardware multi-threading support with both two processor cores and Hyper-Threading Technology.
This means that the Intel Pentium processor Extreme Edition provides four logical processors in a physical package (two logical processors for each processor core). The Dual-Core Intel Xeon processor features multi-core, Hyper-Threading Technology and supports multi-processor platforms.

The Intel Pentium D processor also features multi-core technology. This processor provides hardware multi-threading support with two processor cores but does not offer Hyper-Threading Technology. This means that the Intel Pentium D processor provides two logical processors in a physical package, with each logical processor owning the complete execution resources of a processor core.

The Intel Core 2 processor family, Intel Xeon processor 3000 and 5100 series, and Intel Core Duo processor offer power-efficient multi-core technology.
The processor contains two cores that share a smart second level cache. The Level 2 cache enables efficient data sharing between two cores to reduce memory traffic to the system bus.

[Figure, panel “Intel Core Duo Processor, Intel Core 2 Duo Processor”: two cores, each with its own Architectural State, Execution Engine and Local APIC, sharing one Second Level Cache and one Bus Interface to the System Bus.]
[Figure, panel “Pentium D Processor”: two cores, each with its own Architectural State, Execution Engine, Local APIC, Second Level Cache and Bus Interface to the System Bus.]
[Figure, panel “Pentium Processor Extreme Edition”: two cores, each with two Arch. States and two Local APICs on one Execution Engine, plus its own Second Level Cache and Bus Interface to the System Bus.]

Figure 2-6.
Intel 64 and IA-32 Processors that Support Dual-Core

The Intel Xeon processor 5300 and 3200 series, Intel Core 2 Extreme Quad-core processor, and Intel Core 2 Quad processors support Intel quad-core technology. The Quad-core Intel Xeon processors and the Quad-core Intel Core 2 processor family are shown in Figure 2-7.

[Figure, panel “Intel Core 2 Extreme Quad-core Processor, Intel Core 2 Quad Processor, Intel Xeon Processor 3200 Series, Intel Xeon Processor 5300 Series”: four cores, each with its own Architectural State, Execution Engine and Local APIC; two Second Level Caches and two Bus Interfaces connect to the System Bus.]

Figure 2-7.
Intel 64 Processors that Support Quad-Core

2.2.7 Intel® 64 Architecture

Intel 64 architecture increases the linear address space for software to 64 bits and supports physical address space up to 40 bits. The technology also introduces a new operating mode referred to as IA-32e mode.

IA-32e mode operates in one of two sub-modes: (1) compatibility mode enables a 64-bit operating system to run most legacy 32-bit software unmodified, (2) 64-bit mode enables a 64-bit operating system to run applications written to access 64-bit address space.

In the 64-bit mode, applications may access:
• 64-bit flat linear addressing
• 8 additional general-purpose registers (GPRs)
• 64-bit-wide GPRs and instruction pointers
• 8 additional registers for streaming SIMD extensions (SSE, SSE2, SSE3 and SSSE3)
• uniform byte-register addressing
• fast interrupt-prioritization mechanism
• a new instruction-pointer relative-addressing mode

An Intel 64 architecture processor supports existing IA-32 software because it is able to run all non-64-bit legacy modes supported by IA-32 architecture.
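
To put the address widths quoted at the start of Section 2.2.7 into concrete terms, the following minimal Python sketch converts an address width in bits into the number of addressable bytes. The two widths (64-bit linear, up to 40-bit physical) come from the text; the function and constant names are hypothetical and chosen only for illustration, and the sketch assumes byte-granular addressing.

```python
"""Sketch: address-space sizes implied by the widths quoted in Section 2.2.7."""

# Widths taken from the text; the names are illustrative assumptions.
LINEAR_ADDRESS_BITS = 64    # linear address space available to software
PHYSICAL_ADDRESS_BITS = 40  # upper bound on physical address space


def addressable_bytes(bits: int) -> int:
    """Return the number of distinct byte addresses for an address width."""
    if bits < 0:
        raise ValueError("address width must be non-negative")
    return 1 << bits


def human_readable(n_bytes: int) -> str:
    """Format a byte count in binary (IEC) units while it divides evenly by 1024."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB"]
    index = 0
    while n_bytes >= 1024 and n_bytes % 1024 == 0 and index < len(units) - 1:
        n_bytes //= 1024
        index += 1
    return f"{n_bytes} {units[index]}"


if __name__ == "__main__":
    for label, bits in (
        ("linear", LINEAR_ADDRESS_BITS),
        ("physical", PHYSICAL_ADDRESS_BITS),
    ):
        size = human_readable(addressable_bytes(bits))
        print(f"{label}: {bits} bits -> {size}")
```

Under these assumptions, a 40-bit physical address space corresponds to 1 TiB and a 64-bit linear address space to 16 EiB.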