Volume 1 Basic Architecture (794100), page 98
INDEX

SSE extensions (continued)
    64-bit mode, 10-4
    64-bit SIMD integer instructions, 10-16
    branching on arithmetic operations, 11-36
    cacheability control instructions, 10-18
    cacheability hint instructions, 11-36
    caller-save requirement for procedure and function calls, 11-35
    checking for SSE and SSE2 support, 11-28
    comparison instructions, 10-13
    compatibility mode, 10-4
    compatibility of SIMD and x87 FPU floating-point data types, 11-32
    conversion instructions, 10-15
    data movement instructions, 10-11
    data types, 10-8, 12-1
    denormal operand exception (#D), 11-21
    denormals-are-zeros mode, 10-7
    divide by zero exception (#Z), 11-22
    exceptions, 11-18
    floating-point format, 4-13, 4-14
    flush-to-zero mode, 10-7
    generating SIMD FP exceptions, 11-23
    guidelines for using, 11-27
    handling combinations of masked and unmasked exceptions, 11-26
    handling masked exceptions, 11-23
    handling SIMD floating-point exceptions in software, 11-26
    handling unmasked exceptions, 11-25, 11-26
    inexact result exception (#P), 11-23
    instruction prefixes, effect on SSE and SSE2 instructions, 11-37
    instruction set, 5-16, 10-9
    interaction of SIMD and x87 FPU floating-point exceptions, 11-26
    interaction of SSE and SSE2 instructions with x87 FPU and MMX instructions, 11-31
    interfacing with SSE and SSE2 procedures and functions, 11-34
    intermixing packed and scalar floating-point and 128-bit SIMD integer instructions and data, 11-32
    introduction, 2-4
    invalid operation exception (#I), 11-20
    logical instructions, 10-13
    masked responses to invalid arithmetic operations, 11-20
    memory ordering instruction, 10-20
    MMX technology compatibility, 10-8
    MXCSR register, 10-5
    MXCSR state management instructions, 10-17
    non-temporal data, operating on, 10-18
    numeric overflow exception (#O), 11-22
    numeric underflow exception (#U), 11-22
    overview, 10-1
    packed 128-Bit SIMD data types, 4-10
    packed and scalar floating-point instructions, 10-9
    programming environment, 10-3
    QNaN floating-point indefinite, 4-22
    restoring SSE and SSE2 state, 11-30
    REX prefixes, 10-4
    saving SSE and SSE2 state, 11-30
    saving XMM register state on a procedure or function call, 11-34
    shuffle instructions, 10-14
    SIMD floating-point exception conditions, 11-19
    SIMD floating-point exception cross reference, C-4
    SIMD floating-point exception (#XM), 11-25, 11-26
    SIMD floating-point exceptions, 11-19
    SIMD floating-point mask and flag bits, 10-6
    SIMD floating-point rounding control field, 10-7
    SSE and SSE2 conversion instruction chart, 11-13
    SSE feature flag, CPUID instruction, 11-28
    SSE2 compatibility, 10-8
    unpack instructions, 10-14
    updating MMX technology routines using 128-bit SIMD integer instructions, 11-35
    x87 FPU compatibility, 10-8
    XMM registers, 10-4
SSE feature flag, CPUID instruction, 11-28, 12-7
SSE instructions
    descriptions of, 10-9
    SIMD floating-point exception cross-reference, C-4
    summary of, 5-16
SSE2 extensions
    128-bit packed single-precision data type, 11-4
    128-bit packed single-precision data type, 12-2
    128-bit SIMD integer instruction extensions, 11-16
    64-bit and 128-bit SIMD integer instructions, 11-15
    64-bit mode, 11-4
    arithmetic instructions, 11-8
    branch hints, 11-18
    branching on arithmetic operations, 11-36
    cacheability control instructions, 11-17
    cacheability hint instructions, 11-36
    caller-save requirement for procedure and function calls, 11-35
    checking for SSE and SSE2 support, 11-28
    comparison instructions, 11-9
    compatibility mode, 11-4
    compatibility of SIMD and x87 FPU floating-point data types, 11-32
    conversion instructions, 11-12
    data movement instructions, 11-7
    data types, 11-4, 11-5, 12-2
    denormal operand exception (#D), 11-21
    denormals-are-zero mode, 11-4
    divide by zero exception (#Z), 11-22
    exceptions, 11-18
    floating-point format, 4-13, 4-14
    generating SIMD floating-point exceptions, 11-23
    guidelines for using, 11-27
    handling combinations of masked and unmasked exceptions, 11-26
    handling masked exceptions, 11-23
    handling SIMD floating-point exceptions in software, 11-26
    handling unmasked exceptions, 11-25, 11-26
    inexact result exception (#P), 11-23
    initialization of, 11-29
    instruction prefixes, effect on SSE and SSE2 instructions, 11-37
    instruction set, 5-20
    instructions, 11-6, 12-3, 12-9
    interaction of SIMD and x87 FPU floating-point exceptions, 11-26
    interaction of SSE and SSE2 instructions with x87 FPU and MMX instructions, 11-31
    interfacing with SSE and SSE2 procedures and functions, 11-34
    intermixing packed and scalar floating-point and 128-bit SIMD integer instructions and data, 11-32
    invalid operation exception (#I), 11-20
    logical instructions, 11-9
    masked responses to invalid arithmetic operations, 11-20
    memory ordering instructions, 11-17
    MMX technology compatibility, 11-4
    numeric overflow exception (#O), 11-22
    numeric underflow exception (#U), 11-22
    overview of, 11-1
    packed 128-Bit SIMD data types, 4-10
    packed and scalar floating-point instructions, 11-6
    programming environment, 11-3
    QNaN floating-point indefinite, 4-22
    restoring SSE and SSE2 state, 11-30
    REX prefixes, 11-4
    saving SSE and SSE2 state, 11-30
    saving XMM register state on a procedure or function call, 11-34
    shuffle instructions, 11-10
    SIMD floating-point exception conditions, 11-19
    SIMD floating-point exception cross reference, C-7
    SIMD floating-point exception (#XM), 11-25, 11-26
    SIMD floating-point exceptions, 11-19
    SSE and SSE2 conversion instruction chart, 11-13
    SSE compatibility, 11-4
    SSE2 feature flag, CPUID instruction, 11-28
    unpack instructions, 11-10
    updating MMX technology routines using 128-bit SIMD integer instructions, 11-35
    writing applications with, 11-27
    x87 FPU compatibility, 11-4
SSE2 feature flag, CPUID instruction, 11-28, 12-7
SSE2 instructions
    descriptions of, 11-6, 12-3, 12-9
    SIMD floating-point exception cross-reference, C-7
    summary of, 5-20
SSE3 extensions
    64-bit mode, 12-1
    asymmetric processing, 12-2
    compatibility mode, 12-1
    DNA exceptions, 12-13
    emulation, 12-14
    enabling support in a system executive, 12-7
    exceptions, 12-13
    guideline for packed addition/subtraction instructions, 12-8
    horizontal addition/subtraction instructions, 12-5
    horizontal processing, 12-2
    instruction that addresses cache line splits, 5-25
    instruction that improves X87-FP integer conversion, 5-25
    instructions for horizontal addition/subtraction, 5-26
    instructions for packed addition/subtraction, 5-26
    instructions that enhance LOAD/MOVE/DUPLICATE, 5-26
    instructions that improve synchronization between agents, 5-27
    LOAD/MOVE/DUPLICATE enhancement instructions, 12-4
    MMX technology compatibility, 12-2
    numeric error flag and IGNNE#, 12-13
    packed addition/subtraction instructions, 12-5
    programming environment, 12-1
    REX prefixes, 12-1
    SIMD floating-point exception cross reference, C-11
    specialized 120-bit load instruction, 12-4
    SSE compatibility, 12-2
    SSE2 compatibility, 12-2
    x87 FPU compatibility, 12-2
SSE3 instructions
    descriptions of, 12-3
    SIMD floating-point exception cross-reference, C-11
    summary of, 5-25
SSSE3 extensions
    64-bit mode, 12-1
    asymmetric processing, 12-2
    checking for support, 12-13
    compatibility, 12-2
    compatibility mode, 12-1
    data types, 12-1
    DNA exceptions, 12-13
    emulation, 12-14
    enabling support in a system executive, 12-12
    exceptions, 12-13
    horizontal add/subtract instructions, 12-9
    horizontal processing, 12-2
    MMX technology compatibility, 12-2
    multiply and add packed instructions, 12-11
    numeric error flag and IGNNE#, 12-13
    packed absolute value instructions, 12-11
    packed align instruction, 12-12
    packed multiply high instructions, 12-11
    packed shuffle instruction, 12-12
    programming environment, 12-1
    SSSE2 compatibility, 12-2
    x87 FPU compatibility, 12-2
SSSE3 instructions
    descriptions of, 12-8
    summary of, 5-27
Stack
    64-bit mode, 3-6, 6-5
    64-bit mode behavior, 6-19
    address-size attribute, 6-3
    alignment, 6-3
    alignment of stack pointer, 6-3
    current stack, 6-2, 6-4
    description of, 6-1
    EIP register (return instruction pointer), 6-4
    maximum size, 6-1
    number allowed, 6-1
    overview of, 3-5
    passing parameters on, 6-7
    popping values from, 6-1
    procedure linking information, 6-4
    pushing values on, 6-1
    return instruction pointer, 6-4
    SS register, 6-1
    stack segment, 3-19, 6-1
    stack-frame base pointer, EBP register, 6-4
    switching
        on calls to interrupt and exception handlers, 6-15
        on inter-privilege level calls, 6-11, 6-16
        privilege levels, 6-10
    width, 6-3
Stack, x87 FPU
    stack fault, 8-9
    stack overflow and underflow exception (#IS), 8-7, 8-36, 8-37
Status flags
    EFLAGS register, 3-21, 8-9, 8-10, 8-28
STC instruction, 3-22, 7-29
STD instruction, 3-22, 7-30
STI instruction, 7-31, 13-5
Sticky bits, 8-7
STMXCSR instruction, 10-17, 11-34
STOS instruction, 3-22, 7-27
Streaming SIMD extensions 2 (see SSE2 extensions)
Streaming SIMD extensions (see SSE extensions)
String data type, 4-9
ST(0), top-of-stack register, 8-4
SUB instruction, 7-12
Superscalar microarchitecture
    P6 family microarchitecture, 2-3
    P6 family processors, 2-7
    Pentium 4 processor, 2-10
    Pentium Pro processor, 2-3
    Pentium processor, 2-2
System management mode (see SMM)

T

Tangent, x87 FPU operation, 8-29
Task gate, 6-17
Task register, 3-5
Task state segment (see TSS)
Tasks
    exception handler, 6-17
    interrupt handler, 6-17
Temporal data, 10-18
TEST instruction, 7-21
TF (trap) flag, EFLAGS register, 3-23, A-1
Thermal Monitor, 2-6
Tiny number, 4-18
TOP (stack TOP) field
    x87 FPU status word, 8-3, 9-12
TR register, 3-6
Trace cache, 2-10
Transcendental instruction accuracy, 8-31
Trap gate, 6-14
Truncation
    description of, 4-24
    with SSE-SSE2 conversion instructions, 4-24
TSS
    I/O map base, 13-5
    I/O permission bit map, 13-5
    saving state of EFLAGS register, 3-20

U

UCOMISD instruction, 11-10
UCOMISS instruction, 10-14
UD2 instruction, 7-33
UE (numeric underflow exception) flag
    MXCSR register, 11-22
    x87 FPU status word, 8-7, 8-42
UM (numeric underflow exception) mask bit
    MXCSR register, 11-22
    x87 FPU control word, 8-11, 8-42
Underflow
    FPU exception (see Numeric underflow exception)
    numeric, floating-point, 4-18
    x87 FPU stack, 8-36, 8-37
Underflow, x87 FPU stack, 8-37
Unpack instructions
    SSE extensions, 10-14
    SSE2 extensions, 11-10
UNPCKHPD instruction, 11-11
UNPCKHPS instruction, 10-15
UNPCKLPD instruction, 11-11
UNPCKLPS instruction, 10-15
Unsigned integers
    description of, 4-4
    range of, 4-4
    types, 4-4
Unsupported, 8-20
    floating-point formats, x87 FPU, 8-20
    x87 FPU instructions, 8-34

V

Vector (see Interrupt vector)
VIF (virtual interrupt) flag, EFLAGS register, 3-23
VIP (virtual interrupt pending) flag
    EFLAGS register, 3-23
Virtual 8086 mode
    description of, 3-23
    memory model, 3-9, 3-10
VM (virtual 8086 mode) flag, EFLAGS register, 3-23
VMCALL instruction, 5-31
VMCLEAR instruction, 5-31
VMLAUNCH instruction, 5-31
VMPTRLD instruction, 5-31
VMPTRST instruction, 5-31
VMREAD instruction, 5-31
VMRESUME instruction, 5-31
VMWRITE instruction, 5-31
VMX
    instruction set, 5-31
    introduction, 2-22
    Virtual machine monitor (VMM), 2-22
    virtualization, 2-22
VMXOFF instruction, 5-31
VMXON instruction, 5-31

W

Waiting instructions, x87 FPU, 8-33
WAIT/FWAIT instructions, 8-33, 8-44
WC memory type, 10-18
wide dynamic execution, 2-6
Word, 4-1
Wraparound mode (MMX instructions), 9-5

X

x87 FPU
    64-bit mode, 8-2
    compatibility mode, 8-2
    control word, 8-10
    data pointer, 8-13
    data registers, 8-2
    execution environment, 8-1
    floating-point data types, 8-17
    floating-point format, 4-13, 4-14
    fopcode compatibility mode, 8-14
    FXSAVE and FXRSTOR instructions, 11-34
    IEEE Standard 754, 8-1
    instruction pointer, 8-13
    instruction set, 8-21
    last instruction opcode, 8-14
    overview of registers, 3-3
    programming, 8-1
    QNaN floating-point indefinite, 4-22
    register stack, 8-2
    register stack, parameter passing, 8-5
    registers, 8-1
    save and restore state instructions, 5-13
    saving registers, 11-34
    state, 8-15
    state, image, 8-16, 8-17
    state, saving, 8-15, 8-17
    status register, 8-6
    tag word, 8-12
    transcendental instruction accuracy, 8-31
x87 FPU control word
    description of, 8-10
    exception-flag mask bits, 8-11
    infinity control flag, 8-12
    precision control (PC) field, 8-11
    rounding control (RC) field, 4-23, 8-12
x87 FPU exception handling
    description of, 8-45
    floating-point exception summary, C-2
    MS-DOS compatibility mode, 8-45
    native mode, 8-45
x87 FPU floating-point exceptions
    denormal operand exception, 8-39
    division-by-zero, 8-40
    exception conditions, 8-36
    exception summary, C-2
    guidelines for writing exception handlers, D-1
    inexact-result (precision), 8-42
    interaction of SIMD and x87 FPU floating-point exceptions, 11-26
    invalid arithmetic operand, 8-36, 8-38
    MS-DOS compatibility mode, D-1
    numeric overflow, 8-40
    numeric underflow, 8-41
    software handling, 8-45
    stack overflow, 8-7, 8-37
    stack underflow, 8-7, 8-36, 8-37
    summary of, 8-34
    synchronization, 8-43
x87 FPU instructions
    arithmetic vs.