Volume 1 Application Programming (794095), page 57
The instructions then write the maximum (PMAXUB) or minimum (PMINUB) of the two values for each comparison into the corresponding byte of the destination.

The PMAXSW and PMINSW instructions perform operations analogous to the PMAXUB and PMINUB instructions, except on 16-bit signed integer values.

AMD64 Technology 24592—Rev. 3.13—July 2007

5.6.9 Logical

The vector-logic instructions perform Boolean logic operations, including AND, OR, and exclusive OR.

And
• PAND—Packed Logical Bitwise AND
• PANDN—Packed Logical Bitwise AND NOT

The PAND instruction performs a bitwise logical AND of the values in the first and second operands and writes the result to the destination.

The PANDN instruction inverts the first operand (creating a one’s complement of the operand), ANDs it with the second operand, and writes the result to the destination.
Table 5-5 shows an example.

Table 5-5. Example PANDN Bit Values

Operand1 Bit   Operand1 Bit (Inverted)   Operand2 Bit   PANDN Result Bit
1              0                         1              0
1              0                         0              0
0              1                         1              1
0              1                         0              0

PAND can be used with the value 7FFFFFFF7FFFFFFFh to compute the absolute value of the elements of a 64-bit media floating-point vector operand. This method is equivalent to the x87 FABS (floating-point absolute value) instruction.

Or
• POR—Packed Logical Bitwise OR

The POR instruction performs a bitwise logical OR of the values in the first and second operands and writes the result to the destination.

Exclusive Or
• PXOR—Packed Logical Bitwise Exclusive OR

The PXOR instruction performs a bitwise logical exclusive OR of the values in the first and second operands and writes the result to the destination. PXOR can be used to clear all bits in an MMX register by specifying the same register for both operands.
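
The logical operations described so far can be modeled on plain 64-bit integers. The following Python sketch is an illustration only, not processor code: the function names simply mirror the mnemonics, and `pack_singles` and `unpack_singles` are hypothetical helpers that assume the two single-precision elements are stored with the first element in the low-order doubleword.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF  # an MMX register is 64 bits wide


def pand(op1, op2):
    """Bitwise AND of two 64-bit values."""
    return op1 & op2 & MASK64


def pandn(op1, op2):
    """Invert op1 (one's complement), then AND it with op2."""
    return (~op1 & MASK64) & op2


def por(op1, op2):
    """Bitwise OR of two 64-bit values."""
    return (op1 | op2) & MASK64


def pxor(op1, op2):
    """Bitwise exclusive OR of two 64-bit values."""
    return (op1 ^ op2) & MASK64


def pack_singles(low, high):
    """Hypothetical helper: two single-precision values as one 64-bit integer."""
    return int.from_bytes(struct.pack("<2f", low, high), "little")


def unpack_singles(value):
    """Hypothetical helper: inverse of pack_singles."""
    return struct.unpack("<2f", value.to_bytes(8, "little"))
```

With these definitions, `pandn(0b1100, 0b1010)` reproduces the four rows of Table 5-5 in one call and yields `0b0010`. Masking a packed pair with `7FFFFFFF7FFFFFFFh` through `pand` clears both sign bits, and `pxor(x, x)` is zero for any 64-bit `x`.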
PXOR can also be used with the value 8000000080000000h to change the sign bits of the elements of a 64-bit media floating-point vector operand. This method is equivalent to the x87 floating-point change sign (FCHS) instruction.

5.6.10 Save and Restore State

These instructions save and restore the processor state for 64-bit media instructions.

Save and Restore 64-Bit Media and x87 State
• FSAVE—Save x87 and MMX State
• FNSAVE—Save No-Wait x87 and MMX State
• FRSTOR—Restore x87 and MMX State

These instructions save and restore the entire processor state for x87 floating-point instructions and 64-bit media instructions. The instructions save and restore either 94 or 108 bytes of data, depending on the effective operand size.

Assemblers issue FSAVE as an FWAIT instruction followed by an FNSAVE instruction.
Thus, FSAVE (but not FNSAVE) reports pending unmasked x87 floating-point exceptions before saving the state. After saving the state, the processor initializes the x87 state by performing the equivalent of an FINIT instruction.

Save and Restore 128-Bit, 64-Bit, and x87 State
• FXSAVE—Save XMM, MMX, and x87 State
• FXRSTOR—Restore XMM, MMX, and x87 State

The FXSAVE and FXRSTOR instructions save and restore the entire 512-byte processor state for 128-bit media instructions, 64-bit media instructions, and x87 floating-point instructions.
The architecture supports two memory formats for FXSAVE and FXRSTOR: a 512-byte 32-bit legacy format and a 512-byte 64-bit format. Selection of the 32-bit or 64-bit format is determined by the effective operand size for the FXSAVE and FXRSTOR instructions. For details, see “FXSAVE and FXRSTOR Instructions” in Volume 2.

FXSAVE and FXRSTOR execute faster than FSAVE/FNSAVE and FRSTOR. However, unlike FSAVE and FNSAVE, FXSAVE does not initialize the x87 state, and like FNSAVE it does not report pending unmasked x87 floating-point exceptions. For details, see “Saving and Restoring State” on page 234.

5.7 Instruction Summary—Floating-Point Instructions

This section summarizes the functions of the floating-point (3DNow! and a few SSE and SSE2) instructions in the 64-bit media instruction subset.
These include floating-point instructions that use an MMX register for source or destination and data-conversion instructions that convert from floating-point to integer formats. For a summary of the integer instructions in the 64-bit media instruction subset, including data-conversion instructions that convert from integer to floating-point formats, see “Instruction Summary—Integer Instructions” on page 207.

For a summary of the 128-bit media floating-point instructions, see “Instruction Summary—Floating-Point Instructions” on page 156.
For a summary of the x87 floating-point instructions, see “Instruction Summary” on page 262.

The instructions are organized here by functional group, such as data-transfer, vector arithmetic, and so on. Software running at any privilege level can use any of these instructions, if the CPUID instruction reports support for the instructions (see “Feature Detection” on page 229). More detail on individual instructions is given in the alphabetically organized “64-Bit Media Instruction Reference” in Volume 5.

5.7.1 Syntax

The 64-bit media floating-point instructions have the same syntax rules as those for the 64-bit media integer instructions, described in “Syntax” on page 207, except that the mnemonics of most floating-point instructions begin with the following prefix:
• PF—Packed floating-point

5.7.2 Data Conversion

These data-conversion instructions convert operands from floating-point to integer formats.
The instructions take 32-bit or 64-bit floating-point source operands. For data-conversion instructions that take 64-bit integer source operands, see “Data Conversion” on page 211. For data-conversion instructions that take 128-bit source operands, see “Data Conversion” on page 139 and “Data Conversion” on page 162.

Convert Floating-Point to Integer
• CVTPS2PI—Convert Packed Single-Precision Floating-Point to Packed Doubleword Integers
• CVTTPS2PI—Convert Packed Single-Precision Floating-Point to Packed Doubleword Integers, Truncated
• CVTPD2PI—Convert Packed Double-Precision Floating-Point to Packed Doubleword Integers
• CVTTPD2PI—Convert Packed Double-Precision Floating-Point to Packed Doubleword Integers, Truncated
• PF2IW—Packed Floating-Point to Integer Word Conversion
• PF2ID—Packed Floating-Point to Integer Doubleword Conversion

The CVTPS2PI and CVTTPS2PI instructions convert two single-precision (32-bit) floating-point values in the second operand (the low-order 64 bits of an XMM register or a 64-bit memory location) to two 32-bit signed integers, and write the converted values into the first operand (an MMX register). For the CVTPS2PI instruction, if the conversion result is an inexact value, the value is rounded as specified in the rounding control (RC) field of the MXCSR register (“MXCSR Register” on page 117), but for the CVTTPS2PI instruction such a result is truncated (rounded toward zero).

The CVTPD2PI and CVTTPD2PI instructions perform conversions analogous to CVTPS2PI and CVTTPS2PI but for two double-precision (64-bit) floating-point values.
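
The difference between the rounding and the truncating conversions can be sketched in Python. This is a simplified model under stated assumptions, not a description of the hardware: the function names only mirror the mnemonics, the `RC_*` constants are illustrative names for the four MXCSR rounding-control encodings, and the inputs are assumed to be finite values whose results fit in a 32-bit signed integer. NaNs, infinities, and out-of-range values are not modeled.

```python
import math

# Illustrative names for the MXCSR rounding-control (RC) encodings.
RC_NEAREST = 0  # round to nearest, ties to even
RC_DOWN = 1     # round toward negative infinity
RC_UP = 2       # round toward positive infinity
RC_ZERO = 3     # round toward zero (truncate)

_ROUNDERS = {
    RC_NEAREST: round,      # Python's round() also resolves ties to even
    RC_DOWN: math.floor,
    RC_UP: math.ceil,
    RC_ZERO: math.trunc,
}


def cvtps2pi(src, rc=RC_NEAREST):
    """Convert each element using the rounding mode selected by rc."""
    return tuple(_ROUNDERS[rc](x) for x in src)


def cvttps2pi(src):
    """Convert each element by truncation, regardless of the RC setting."""
    return tuple(math.trunc(x) for x in src)
```

For the pair `(2.5, -1.5)`, the model gives `(2, -2)` under round-to-nearest, `(3, -1)` under round-up, and `(2, -1)` for the truncating form. The same model applies to the double-precision variants, since only the source element width differs.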
The 3DNow! PF2IW instruction converts two single-precision floating-point values in the second operand (an MMX register or a 64-bit memory location) to two 16-bit signed integer values, sign-extended to 32 bits, and writes the converted values into the first operand (an MMX register). The 3DNow! PF2ID instruction converts two single-precision floating-point values in the second operand to two 32-bit signed integer values, and writes the converted values into the first operand. If the result of either conversion is an inexact value, the value is truncated (rounded toward zero).

As described in “Floating-Point Data Types” on page 205, PF2IW and PF2ID do not fully comply with the IEEE-754 standard.
Conversion of some source operands of the C type float (IEEE-754 single precision), specifically NaNs, infinities, and denormals, is not supported. Attempts to convert such source operands produce undefined results, and no exceptions are generated.

5.7.3 Arithmetic

The floating-point vector-arithmetic instructions perform an arithmetic operation on two floating-point operands. For a description of 3DNow! instruction saturation on overflow and underflow conditions, see “Floating-Point Data Types” on page 205.

Addition
• PFADD—Packed Floating-Point Add

The PFADD instruction adds each single-precision floating-point value in the first operand (an MMX register) to the corresponding single-precision floating-point value in the second operand (an MMX register or 64-bit memory location). The instruction then writes the result of each addition into the corresponding doubleword of the destination.

Subtraction
• PFSUB—Packed Floating-Point Subtract
• PFSUBR—Packed Floating-Point Subtract Reverse

The PFSUB instruction subtracts each single-precision floating-point value in the second operand from the corresponding single-precision floating-point value in the first operand.
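
The element-wise behavior of PFADD and PFSUB described above can be sketched in Python. This is a rough model under stated assumptions: each operand is represented as a tuple of two values, `to_single` is a hypothetical helper that rounds a Python float to single precision, and the 3DNow! saturation on overflow and underflow is not modeled (a result too large for single precision raises `OverflowError` here instead).

```python
import struct


def to_single(x):
    """Hypothetical helper: round a Python float to single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def pfadd(op1, op2):
    """Add corresponding single-precision elements of the two operands."""
    return tuple(to_single(a + b) for a, b in zip(op1, op2))


def pfsub(op1, op2):
    """Subtract each element of op2 from the corresponding element of op1."""
    return tuple(to_single(a - b) for a, b in zip(op1, op2))
```

For example, with `op1 = (1.0, 2.5)` and `op2 = (0.5, -2.5)`, `pfadd` gives `(1.5, 0.0)` and `pfsub` gives `(0.5, 5.0)`. Each result element depends only on the pair of elements in the same position.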