Volume 5: 64-Bit Media and x87 Floating-Point Instructions (794099), page 19
The second source operand is another MMX register or 64-bit memory location.

The PFMUL instruction is an AMD 3DNow!™ instruction. The presence of this instruction set is indicated by CPUID feature bits. (See “CPUID” in Volume 3.)

AMD no longer recommends the use of 3DNow! instructions, which have been superseded by their more efficient 128-bit media counterparts. For a complete list of recommended instruction substitutions, see Appendix A, “Recommended Substitutions for 3DNow!™ Instructions” on page 335.

Recommended Instruction Substitution
MULPS

Mnemonic                  Opcode         Description
PFMUL mmx1, mmx2/mem64    0F 0F /r B4    Multiplies packed single-precision floating-point values in an MMX register and another MMX register or 64-bit memory location and writes the result in the destination MMX register.

[Figure pfmul.eps: bits 63-32 and bits 31-0 of mmx1 are each multiplied by the corresponding doubleword of mmx2/mem64.]

AMD64 Technology, 26569, Rev. 3.08, July 2007

Table 1-11. Numeric Range for the PFMUL Instruction

Source 1 and                    Source 2 Operand Value
Destination        0            Normal                Unsupported
0                  +/- 0 (1)    +/- 0 (1)             +/- 0 (1)
Normal             +/- 0 (1)    Normal, +/- 0 (2)     Undefined
Unsupported (3)    +/- 0 (1)    Undefined             Undefined

Note:
1. The sign of the result is the exclusive-OR of the signs of the source operands.
2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the exclusive-OR of the signs of the source operands. If the absolute value of the product is greater than or equal to 2^128, the result is the largest normal number with the sign being the exclusive-OR of the signs of the source operands.
3. “Unsupported” means that the exponent is all ones (1s).

Related Instructions
None

rFLAGS Affected
None

Exceptions

Exception                                    Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                          X     X             X          The emulate bit (EM) of CR0 was set to 1.
                                             X     X             X          The AMD 3DNow!™ instructions are not supported, as indicated by EDX bit 31 in CPUID function 8000_0001h.
Device not available, #NM                    X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                                   X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             -     -             X          A null data segment was used to reference memory.
Page fault, #PF                              -     X             X          A page fault resulted from the execution of the instruction.
x87 floating-point exception pending, #MF    X     X             X          An unmasked x87 floating-point exception was pending.
Alignment check, #AC                         -     X             X          An unaligned memory reference was performed while alignment checking was enabled.
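Notes 1 and 2 of Table 1-11 can be read as a small per-lane rule. The Python sketch below is an illustration only, not AMD's implementation: the names `pfmul_lane` and `pfmul` are hypothetical, a register is modeled as a `(low, high)` tuple, inputs are assumed to be zero or normal, and round-to-nearest is assumed for in-range products because the text above does not state a rounding mode.

```python
import math
import struct

# Largest normal single-precision value (bit pattern 0x7F7FFFFF) and the
# smallest normal magnitude, 2**-126.
MAX_NORMAL = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]
MIN_NORMAL = 2.0 ** -126


def pfmul_lane(a, b):
    """Model one 32-bit lane of PFMUL for zero or normal inputs (hypothetical helper)."""
    for v in (a, b):
        if math.isinf(v) or math.isnan(v):
            # An all-ones exponent is "Unsupported" in Table 1-11; not modeled here.
            raise ValueError("unsupported operand")
    # Note 1: the result sign is the exclusive-OR of the operand signs.
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    # The product of two single-precision values is exact in double precision.
    magnitude = abs(a) * abs(b)
    if magnitude < MIN_NORMAL:
        # Note 2: results below 2**-126 become a signed zero.
        return math.copysign(0.0, sign)
    if magnitude >= MAX_NORMAL:
        # Note 2: overflow is clamped to the largest normal number.
        return math.copysign(MAX_NORMAL, sign)
    # Assumption: round to nearest, which is what struct does when packing to 32 bits.
    rounded = struct.unpack("<f", struct.pack("<f", magnitude))[0]
    return math.copysign(rounded, sign)


def pfmul(dst, src):
    """Multiply corresponding lanes of two (low, high) pairs."""
    return (pfmul_lane(dst[0], src[0]), pfmul_lane(dst[1], src[1]))
```

For example, `pfmul((2.0, 3.0), (4.0, -0.5))` multiplies the low lanes and the high lanes independently, while a product such as `2.0**100 * 2.0**100` is clamped instead of becoming an infinity.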
PFNACC    Packed Floating-Point Negative Accumulate

Subtracts the first source operand’s high-order single-precision floating-point value from its low-order single-precision floating-point value, subtracts the second source operand’s high-order single-precision floating-point value from its low-order single-precision floating-point value, and writes each result to the low-order or high-order doubleword, respectively, of the destination (first source). The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

The numeric range for operands is shown in Table 1-12 on page 109.

The PFNACC instruction is an extension to the AMD 3DNow!™ instruction set. The presence of this instruction set is indicated by CPUID feature bits. (See “CPUID” in Volume 3.)

AMD no longer recommends the use of 3DNow! instructions, which have been superseded by their more efficient 128-bit media counterparts. For a complete list of recommended instruction substitutions, see Appendix A, “Recommended Substitutions for 3DNow!™ Instructions” on page 335.

Recommended Instruction Substitution
HSUBPS

Mnemonic                   Opcode         Description
PFNACC mmx1, mmx2/mem64    0F 0F /r 8A    Subtracts the packed single-precision floating-point values in an MMX register or 64-bit memory location and another MMX register and writes each value in the destination MMX register.

[Figure pfnacc.eps: operands mmx1 and mmx2/mem64, each split into bits 63-32 and bits 31-0, with one subtract operation per operand.]

Table 1-12. Numeric Range of PFNACC Results

Source Operand                      High Operand (2)
Low Operand (1)    0              Normal               Unsupported
0                  +/- 0 (3)      - High Operand       - High Operand
Normal             Low Operand    Normal, +/- 0 (4)    Undefined
Unsupported (5)    Low Operand    Undefined            Undefined

Note:
1. Least-significant floating-point value in first or second source operand.
2. Most-significant floating-point value in first or second source operand.
3. The sign is the logical AND of the sign of the low operand and the inverse of the sign of the high operand.
4. If the absolute value of the infinitely precise result is less than 2^-126 (but not zero), the result is a zero. If the low operand is larger in magnitude than the high operand, the sign of this zero is the same as the sign of the low operand, else it is the inverse of the sign of the high operand. If the infinitely precise result is exactly zero, the result is zero with the sign of the low operand. If the absolute value of the infinitely precise result is greater than or equal to 2^128, the result is the largest normal number with the sign of the low operand.
5. “Unsupported” means that the exponent is all ones (1s).

Related Instructions
PFSUB, PFACC, PFPNACC

rFLAGS Affected
None

Exceptions

Exception                                    Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                          X     X             X          The emulate bit (EM) of CR0 was set to 1.
                                             X     X             X          The AMD extensions to 3DNow!™ are not supported, as indicated by EDX bit 30 in CPUID function 8000_0001h.
Device not available, #NM                    X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                                   X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             -     -             X          A null data segment was used to reference memory.
Page fault, #PF                              -     X             X          A page fault resulted from the execution of the instruction.
x87 floating-point exception pending, #MF    X     X             X          An unmasked x87 floating-point exception was pending.
Alignment check, #AC                         -     X             X          An unaligned memory reference was performed while alignment checking was enabled.

PFPNACC    Packed Floating-Point Positive-Negative Accumulate

Subtracts the first source operand’s high-order single-precision floating-point value from its low-order single-precision floating-point value, adds the two single-precision values in the second source operand, and writes each result to the low-order or high-order doubleword, respectively, of the destination (first source).
The first source/destination operand is an MMX register. The second source operand is another MMX register or 64-bit memory location.

The numeric range for operands is shown in Table 1-13 (for the low result) and Table 1-14 (for the high result), both on page 112.

The PFPNACC instruction is an extension to the AMD 3DNow!™ instruction set. The presence of this instruction set is indicated by CPUID feature bits. (See “CPUID” in Volume 3.)

AMD no longer recommends the use of 3DNow! instructions, which have been superseded by their more efficient 128-bit media counterparts. For a complete list of recommended instruction substitutions, see Appendix A, “Recommended Substitutions for 3DNow!™ Instructions” on page 335.

Recommended Instruction Substitution
ADDSUBPS

Mnemonic                    Opcode         Description
PFPNACC mmx1, mmx2/mem64    0F 0F /r 8E    Subtracts the packed single-precision floating-point values in an MMX register, adds the packed single-precision floating-point values in another MMX register or 64-bit memory location, and writes each value in the destination MMX register.

[Figure pfpnacc.eps: operands mmx1 and mmx2/mem64, each split into bits 63-32 and bits 31-0, with one subtract operation and one add operation.]

Table 1-13. Numeric Range of PFPNACC Result (Low Result)

Source Operand                      High Operand (2)
Low Operand (1)    0              Normal               Unsupported
0                  +/- 0 (3)      - High Operand       - High Operand
Normal             Low Operand    Normal, +/- 0 (4)    Undefined
Unsupported (5)    Low Operand    Undefined            Undefined

Note:
1. Least-significant floating-point value in first or second source operand.
2. Most-significant floating-point value in first or second source operand.
3. The sign is the logical AND of the sign of the low operand and the inverse of the sign of the high operand.
4. If the absolute value of the infinitely precise result is less than 2^-126 (but not zero), the result is a zero. If the low operand is larger in magnitude than the high operand, the sign of this zero is the same as the sign of the low operand, else it is the inverse of the sign of the high operand. If the infinitely precise result is exactly zero, the result is zero with the sign of the low operand. If the absolute value of the infinitely precise result is greater than or equal to 2^128, the result is the largest normal number with the sign of the low operand.
5. “Unsupported” means that the exponent is all ones (1s).

Table 1-14. Numeric Range of PFPNACC Result (High Result)

Source Operand                      High Operand (2)
Low Operand (1)    0              Normal               Unsupported
0                  +/- 0 (3)      High Operand         High Operand
Normal             Low Operand    Normal, +/- 0 (4)    Undefined
Unsupported (5)    Low Operand    Undefined            Undefined

Note:
1. Least-significant floating-point value in first or second source operand.
2. Most-significant floating-point value in first or second source operand.
3. The sign is the logical AND of the signs of the low and high operands.
4. If the absolute value of the infinitely precise result is less than 2^-126 (but not zero), the result is zero with the sign of the operand (low or high) that is larger in magnitude. If the infinitely precise result is exactly zero, the result is zero with the sign of the low operand. If the absolute value of the infinitely precise result is greater than or equal to 2^128, the result is the largest normal number with the sign of the low operand.
5. “Unsupported” means that the exponent is all ones (1s).

Related Instructions
PFADD, PFSUB, PFACC, PFNACC

rFLAGS Affected
None

Exceptions

Exception                                    Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                          X     X             X          The emulate bit (EM) of CR0 was set to 1.
                                             X     X             X          The AMD extensions to 3DNow!™ are not supported, as indicated by EDX bit 30 in CPUID function 8000_0001h.
Device not available, #NM                    X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                                   X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP                      X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                             -     -             X          A null data segment was used to reference memory.
Page fault, #PF                              -     X             X          A page fault resulted from the execution of the instruction.
x87 floating-point exception pending, #MF    X     X             X          An unmasked x87 floating-point exception was pending.
Alignment check, #AC                         -     X             X          An unaligned memory reference was performed while alignment checking was enabled.
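The lane arrangement described for PFNACC and PFPNACC can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `to_f32`, `pfnacc`, and `pfpnacc` are hypothetical names, each 64-bit operand is modeled as a `(low, high)` tuple, and the out-of-range and signed-zero rules of Tables 1-12 through 1-14 are deliberately not modeled (inputs and results are assumed to be in the normal range or exactly zero).

```python
import struct


def to_f32(x):
    """Round a Python float to single precision (assumes x is within float32 range)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def pfnacc(dst, src):
    """Low result: dst.low - dst.high. High result: src.low - src.high."""
    dst_low, dst_high = dst
    src_low, src_high = src
    return (to_f32(dst_low - dst_high), to_f32(src_low - src_high))


def pfpnacc(dst, src):
    """Low result: dst.low - dst.high. High result: src.low + src.high."""
    dst_low, dst_high = dst
    src_low, src_high = src
    return (to_f32(dst_low - dst_high), to_f32(src_low + src_high))
```

Both functions take the low result from the first source alone and the high result from the second source alone; the only difference is whether the second source's two values are subtracted or added.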
PFRCP    Floating-Point Reciprocal Approximation

Computes the approximate reciprocal of the single-precision floating-point value in the low-order 32 bits of an MMX register or 64-bit memory location and writes the result in both doublewords of another MMX register. The result is accurate to 14 bits.

The PFRCP result can be forwarded to the Newton-Raphson iteration step 1 (PFRCPIT1) and Newton-Raphson iteration step 2 (PFRCPIT2) instructions to increase the accuracy of the reciprocal. The first stage of this refinement in accuracy (PFRCPIT1) requires that the input and output of the previously executed PFRCP instruction be used as input to the PFRCPIT1 instruction.

The estimate contains the correct round-to-nearest value for approximately 99% of all arguments.
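The idea of refining a coarse reciprocal estimate with Newton-Raphson can be illustrated with the textbook iteration x1 = x0 * (2 - b * x0). The Python sketch below is only an illustration of that general technique: `rough_reciprocal` is a hypothetical stand-in that truncates 1/b to a 14-bit significand, it does not reproduce the actual PFRCP lookup, and it makes no claim about how the hardware divides the work between PFRCPIT1 and PFRCPIT2.

```python
import math


def rough_reciprocal(b, bits=14):
    """Hypothetical stand-in for a coarse estimate: 1/b with its significand cut to `bits` bits."""
    m, e = math.frexp(1.0 / b)          # 1/b == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(math.trunc(m * scale) / scale, e)


def newton_step(b, x):
    """One Newton-Raphson refinement of x as an approximation of 1/b."""
    return x * (2.0 - b * x)


b = 3.0
x0 = rough_reciprocal(b)                # coarse estimate
x1 = newton_step(b, x0)                 # refined estimate
err0 = abs(b * x0 - 1.0)                # relative error before refinement
err1 = abs(b * x1 - 1.0)                # relative error after refinement
```

Because each step roughly squares the relative error, a single step applied to an estimate that is good to about 14 bits is enough to reach single-precision accuracy, which is consistent with the refinement sequence described above.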