Volume 4 128-Bit Media Instructions (794098), page 6
3.09—July 2007

TPR
Task priority register (CR8), a new register introduced in the AMD64 architecture to speed interrupt management.

TR
Task register.

Endian Order

The x86 and AMD64 architectures address memory using little-endian byte-ordering. Multibyte values are stored with their least-significant byte at the lowest byte address, and they are illustrated with their least-significant byte at the right side. Strings are illustrated in reverse order, because the addresses of their bytes increase from right to left.

1 128-Bit Media Instruction Reference

This chapter describes the function, mnemonic syntax, opcodes, and affected flags of the 128-bit media instructions and the possible exceptions they generate.
These instructions load, store, or operate on data located in 128-bit XMM registers. Most of the instructions operate in parallel on sets of packed elements called vectors, although a few operate on scalars. These instructions define both integer and floating-point operations. They include the SSE, SSE2, and SSE3 instructions.

Each instruction that performs a vector (packed) operation is illustrated with a diagram. Figure 1-1 shows the conventions used in these diagrams. The particular diagram shows the PSLLW (packed shift left logical words) instruction.

[Figure 1-1. Diagram Conventions for 128-Bit Media Instructions. The diagram shows the first source operand (xmm1, also the destination operand) and the second source operand (xmm2/mem128), each divided into eight 16-bit words at bits 127-112, 111-96, 95-80, 79-64, 63-48, 47-32, 31-16, and 15-0, with a shift-left operation applied to each word. Figure file name: 513-323.eps.]

The callouts in Figure 1-1 explain the conventions:

• Arrowheads going to a source operand indicate the writing of the result. In this case, the result is written to the first source operand, which is also the destination operand.
• Arrowheads coming from a source operand indicate that the source operand provides a control function. In this case, the second source operand specifies the number of bits to shift, and the first source operand specifies the data to be shifted.
• The operation label names the operation performed. In this case, a bitwise shift-left.
• Ellipses indicate that the operation is repeated for each element of the source vectors. In this case, there are 8 elements in each source vector, so the operation is performed 8 times, in parallel.
• The file name of the figure is shown for documentation control.

Gray areas in diagrams indicate unmodified operand bits.
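The conventions in Figure 1-1 can be made concrete with a scalar model of PSLLW. The following is a minimal sketch in portable C, not the instruction itself. The type vec128_words and the function psllw_model are hypothetical names introduced for illustration. The sketch assumes that element 0 holds bits 15-0 of the operand and, following the PSLLW instruction description, that a shift count greater than 15 clears every word.

```c
#include <stdint.h>

/* Hypothetical model type: one 128-bit operand viewed as eight 16-bit words.
   w[0] models bits 15-0 and w[7] models bits 127-112. */
typedef struct {
    uint16_t w[8];
} vec128_words;

/* Scalar model of PSLLW: the same shift-left is applied to each of the
   eight words, which is what the ellipses in Figure 1-1 stand for.
   src models the first source operand (also the destination);
   count models the shift count supplied by the second source operand. */
vec128_words psllw_model(vec128_words src, uint64_t count)
{
    vec128_words result;
    for (int i = 0; i < 8; i++) {
        if (count > 15) {
            /* A count larger than the word width clears the word. */
            result.w[i] = 0;
        } else {
            /* Widen before shifting so bits shifted out are simply dropped. */
            result.w[i] = (uint16_t)((uint32_t)src.w[i] << count);
        }
    }
    return result;
}
```

On real hardware the eight shifts happen in one instruction. The loop here only spells out the per-element behavior that the diagram abbreviates.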
The 128-bit media instructions are useful in high-performance applications that operate on blocks of data. Because each instruction can independently and simultaneously perform a single operation on multiple elements of a vector, the instructions are classified as single-instruction, multiple-data (SIMD) instructions. A few 128-bit media instructions convert operands in XMM registers to operands in GPR, MMX™, or x87 registers (or vice versa), or save or restore XMM state.

Hardware support for a specific 128-bit media instruction depends on the presence of at least one of the following CPUID functions:

• FXSAVE and FXRSTOR, indicated by EDX bit 24 returned by CPUID function 0000_0001h and function 8000_0001h.
• SSE, indicated by EDX bit 25 returned by CPUID function 0000_0001h.
• SSE2, indicated by EDX bit 26 returned by CPUID function 0000_0001h.
• SSE3, indicated by ECX bit 0 returned by CPUID function 0000_0001h.

The 128-bit media instructions can be used in legacy mode or long mode. Their use in long mode is available if the following CPUID function is set:

• Long Mode, indicated by EDX bit 29 returned by CPUID function 8000_0001h.

Compilation of 128-bit media programs for execution in 64-bit mode offers four primary advantages: access to the eight extended XMM registers (for a register set consisting of XMM0–XMM15), access to the eight extended, 64-bit general-purpose registers (for a register set consisting of GPR0–GPR15), access to the 64-bit virtual address space, and access to the RIP-relative addressing mode.

For further information, see:

• “128-Bit Media and Scientific Programming” in Volume 1.
• “Summary of Registers and Data Types” in Volume 3.
• “Notation” in Volume 3.
• “Instruction Prefixes” in Volume 3.
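The CPUID feature bits listed in this section can be decoded with a few bit tests. The following is a minimal sketch in C that takes register values already returned by CPUID and reports which features they indicate. The type media_features and the functions bit_set and decode_media_features are hypothetical names introduced for illustration. How the register values are obtained (for example, through a compiler-provided CPUID helper) is deliberately left out, and the sketch reads the FXSAVE/FXRSTOR bit from function 0000_0001h only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: one flag per feature named in the text. */
typedef struct {
    bool fxsr;      /* FXSAVE and FXRSTOR: EDX bit 24, function 0000_0001h */
    bool sse;       /* SSE:  EDX bit 25, function 0000_0001h */
    bool sse2;      /* SSE2: EDX bit 26, function 0000_0001h */
    bool sse3;      /* SSE3: ECX bit 0,  function 0000_0001h */
    bool long_mode; /* Long Mode: EDX bit 29, function 8000_0001h */
} media_features;

/* Returns true if the given bit position is set in reg. */
static bool bit_set(uint32_t reg, unsigned bit)
{
    return ((reg >> bit) & 1u) != 0;
}

/* std_edx and std_ecx are the EDX and ECX values returned by CPUID
   function 0000_0001h; ext_edx is the EDX value returned by CPUID
   function 8000_0001h. The caller is assumed to have executed CPUID. */
media_features decode_media_features(uint32_t std_edx, uint32_t std_ecx,
                                     uint32_t ext_edx)
{
    media_features f;
    f.fxsr      = bit_set(std_edx, 24);
    f.sse       = bit_set(std_edx, 25);
    f.sse2      = bit_set(std_edx, 26);
    f.sse3      = bit_set(std_ecx, 0);
    f.long_mode = bit_set(ext_edx, 29);
    return f;
}
```

Keeping the decoding separate from the CPUID call means the bit positions can be checked on any machine, including one that does not implement these instructions.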
ADDPD    Add Packed Double-Precision Floating-Point

Adds each packed double-precision floating-point value in the first source operand to the corresponding packed double-precision floating-point value in the second source operand and writes the result of each addition in the corresponding quadword of the destination (first source). The first source/destination operand is an XMM register. The second source operand is another XMM register or 128-bit memory location.

The ADDPD instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                   Opcode        Description
ADDPD xmm1, xmm2/mem128    66 0F 58 /r   Adds two packed double-precision floating-point values
                                         in an XMM register and another XMM register or 128-bit
                                         memory location and writes the result in the destination
                                         XMM register.

[Figure: ADDPD operation (addpd.eps). Each quadword of xmm1 (bits 127-64 and 63-0) is added to the corresponding quadword of xmm2/mem128, and each sum is written to the same quadword of xmm1.]

Related Instructions

ADDPS, ADDSD, ADDSS

rFLAGS Affected

None

MXCSR Flags Affected

MM   FZ   RC      PM   UM   OM   ZM   DM   IM   DAZ   PE   UE   OE   ZE   DE   IE
                                                      M    M    M         M    M
17   15   14-13   12   11   10   9    8    7    6     5    4    3    2    1    0

Note: A flag that may be set to one or cleared to zero is M (modified).
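The ADDPD operation described above can be modeled with two ordinary double additions. The following is a minimal sketch in portable C, not the instruction itself. The type vec128_pd and the function addpd_model are hypothetical names introduced for illustration. The sketch assumes element 0 holds bits 63-0 of the operand, and it ignores the MXCSR rounding control and exception flags. Compilers that target SSE2 commonly expose the real instruction through the _mm_add_pd intrinsic in <emmintrin.h>, which this sketch does not use so that it stays portable.

```c
/* Hypothetical model type: one 128-bit operand viewed as two doubles.
   q[0] models bits 63-0 and q[1] models bits 127-64. */
typedef struct {
    double q[2];
} vec128_pd;

/* Scalar model of ADDPD: each quadword of the first source is added to the
   corresponding quadword of the second source. The return value models the
   new contents of the first source, which is also the destination. */
vec128_pd addpd_model(vec128_pd first, vec128_pd second)
{
    vec128_pd result;
    result.q[0] = first.q[0] + second.q[0];   /* low quadword  */
    result.q[1] = first.q[1] + second.q[1];   /* high quadword */
    return result;
}
```

The two additions are independent of each other, which is why the hardware can perform them in parallel.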