Volume 4 128-Bit Media Instructions (794098), page 21
AMD64 Technology, publication 26568, Rev. 3.09, July 2007

HADDPD (continued)

Exception                             Real  Virtual-8086  Protected  Cause of Exception
Underflow exception (UE)              X     X             X          A rounded result was too small to fit into the format of the destination operand.
Precision exception (PE)              X     X             X          A result could not be represented exactly in the destination format.

HADDPS    Horizontal Add Packed Single

Adds pairs of packed single-precision floating-point values simultaneously.
The sum of the values in the first and second doublewords of the destination operand is stored in the first doubleword of the destination operand; the sum of the values in the third and fourth doublewords of the destination operand is stored in the second doubleword of the destination operand; the sum of the values in the first and second doublewords of the source operand is stored in the third doubleword of the destination operand; and the sum of the values in the third and fourth doublewords of the source operand is stored in the fourth doubleword of the destination operand.

The HADDPS instruction is an SSE3 instruction.
The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic:     HADDPS xmm1, xmm2/mem128
Opcode:       F2 0F 7C /r
Description:  Adds the first and second packed single-precision values in xmm1 and stores the sum in xmm1[0–31]; adds the third and fourth single-precision values in xmm1 and stores the sum in xmm1[32–63]; adds the first and second packed single-precision values in xmm2 or a 128-bit memory operand and stores the sum in xmm1[64–95]; adds the third and fourth packed single-precision values in xmm2 or a 128-bit memory operand and stores the result in xmm1[96–127].

[Figure: xmm1 and xmm2/mem128, each divided into four doublewords (bits 127–96, 95–64, 63–32, 31–0); four add operations combine adjacent doubleword pairs.]

Related Instructions
HADDPD, HSUBPD, HSUBPS

rFLAGS Affected
None
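The lane mapping described for HADDPS can be sketched as a scalar reference model in C. This is an illustrative sketch only, not the hardware implementation: the function name `haddps_model` is hypothetical, index 0 is assumed to hold bits 31–0, and MXCSR rounding control and exception reporting are not modeled.

```c
/* Scalar model of HADDPS (hypothetical helper, not part of any library).
 * dst[0] holds bits 31-0 of xmm1, dst[3] holds bits 127-96.
 * The destination is overwritten, as with the instruction itself. */
void haddps_model(float dst[4], const float src[4])
{
    /* Compute all four sums before writing, because dst is both
     * an input and the output. */
    const float r0 = dst[0] + dst[1];  /* first + second of destination */
    const float r1 = dst[2] + dst[3];  /* third + fourth of destination */
    const float r2 = src[0] + src[1];  /* first + second of source */
    const float r3 = src[2] + src[3];  /* third + fourth of source */

    dst[0] = r0;
    dst[1] = r1;
    dst[2] = r2;
    dst[3] = r3;
}
```

Calling the model with a destination of {1, 2, 3, 4} and a source of {10, 20, 30, 40} leaves {3, 7, 30, 70} in the destination.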

MXCSR Flags Affected

Flag  MM  FZ  RC     PM  UM  OM  ZM  DM  IM  DAZ  PE  UE  OE  ZE  DE  IE
Bit   17  15  14-13  12  11  10  9   8   7   6    5   4   3   2   1   0
Mod                                               M   M   M       M   M

Note: A flag that may be set to one or cleared to zero is M (modified). Unaffected flags are blank.

Exceptions

Exception                             Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                   X     X             X          The SSE3 instructions are not supported, as indicated by ECX bit 0 of CPUID function 0000_0001h.
                                      X     X             X          The emulate bit (EM) of CR0 was set to 1.
                                      X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.
                                      X     X             X          There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT was cleared to 0. See SIMD Floating-Point Exceptions, below, for details.
Device not available, #NM             X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                            X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP               X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                          X          A null data segment was used to reference memory.
                                      X     X             X          The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                             X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                        X             X          An unaligned memory reference was performed while alignment checking was enabled and MXCSR.MM was set to 1.
SIMD Floating-Point Exception, #XF    X     X             X          There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT was set to 1. See SIMD Floating-Point Exceptions, below, for details.

SIMD Floating-Point Exceptions

Exception                             Real  Virtual-8086  Protected  Cause of Exception
Invalid-operation exception (IE)      X     X             X          A source operand was an SNaN value.
                                      X     X             X          +infinity was added to –infinity.
Denormalized-operand exception (DE)   X     X             X          A source operand was a denormal value.
Overflow exception (OE)               X     X             X          A rounded result was too large to fit into the format of the destination operand.
Underflow exception (UE)              X     X             X          A rounded result was too small to fit into the format of the destination operand.
Precision exception (PE)              X     X             X          A result could not be represented exactly in the destination format.

HSUBPD    Horizontal Subtract Packed Double

Subtracts the packed double-precision floating-point value in the upper quadword of the destination XMM register operand from the lower quadword of the destination operand and stores the result in the lower quadword of the destination operand; subtracts the value in the upper quadword of the source XMM register or 128-bit memory operand from the value in the lower quadword of the source operand and stores the result in the upper quadword of the destination XMM register.

The HSUBPD instruction is an SSE3 instruction.
The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic:     HSUBPD xmm1, xmm2/mem128
Opcode:       66 0F 7D /r
Description:  Subtracts the packed double-precision value in the upper 64 bits of the source register or 128-bit memory operand from the value in the lower 64 bits of the source operand and stores the difference in the upper 64 bits of the destination XMM register; subtracts the upper 64 bits of the destination register from the lower 64 bits of the destination register and stores the result in the lower 64 bits of the destination XMM register.

[Figure: xmm1 and xmm2/mem128, each divided into two quadwords (bits 127–64, 63–0); two sub operations, one per operand.]

Related Instructions
HSUBPS, HADDPD, HADDPS

rFLAGS Affected
None
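The HSUBPD data flow can be sketched as a scalar reference model in C. This is a sketch under stated assumptions rather than a definitive implementation: `hsubpd_model` is a hypothetical name, index 0 is assumed to hold the lower quadword (bits 63–0), and MXCSR behavior is not modeled.

```c
/* Scalar model of HSUBPD (hypothetical helper, not part of any library).
 * dst[0] is the lower quadword of xmm1, dst[1] the upper quadword. */
void hsubpd_model(double dst[2], const double src[2])
{
    /* Lower result: destination lower minus destination upper. */
    const double lo = dst[0] - dst[1];
    /* Upper result: source lower minus source upper. */
    const double hi = src[0] - src[1];

    dst[0] = lo;
    dst[1] = hi;
}
```

With a destination of {5, 3} and a source of {10, 4}, the model leaves {2, 6} in the destination.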

MXCSR Flags Affected

Flag  MM  FZ  RC     PM  UM  OM  ZM  DM  IM  DAZ  PE  UE  OE  ZE  DE  IE
Bit   17  15  14-13  12  11  10  9   8   7   6    5   4   3   2   1   0
Mod                                               M   M   M       M   M

Note: A flag that may be set to one or cleared to zero is M (modified). Unaffected flags are blank.

Exceptions

Exception                             Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                   X     X             X          The SSE3 instructions are not supported, as indicated by ECX bit 0 of CPUID function 0000_0001h.
                                      X     X             X          The emulate bit (EM) of CR0 was set to 1.
                                      X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.
                                      X     X             X          There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT was cleared to 0. See SIMD Floating-Point Exceptions, below, for details.
Device not available, #NM             X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                            X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP               X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                          X          A null data segment was used to reference memory.
                                      X     X             X          The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                             X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                        X             X          An unaligned memory reference was performed while alignment checking was enabled and MXCSR.MM was set to 1.
SIMD Floating-Point Exception, #XF    X     X             X          There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT was set to 1. See SIMD Floating-Point Exceptions, below, for details.

SIMD Floating-Point Exceptions

Exception                             Real  Virtual-8086  Protected  Cause of Exception
Invalid-operation exception (IE)      X     X             X          A source operand was an SNaN value.
                                      X     X             X          +infinity was subtracted from +infinity.
                                      X     X             X          –infinity was subtracted from –infinity.
Denormalized-operand exception (DE)   X     X             X          A source operand was a denormal value.
Overflow exception (OE)               X     X             X          A rounded result was too large to fit into the format of the destination operand.
Underflow exception (UE)              X     X             X          A rounded result was too small to fit into the format of the destination operand.
Precision exception (PE)              X     X             X          A result could not be represented exactly in the destination format.

HSUBPS    Horizontal Subtract Packed Single

Subtracts the packed single-precision floating-point value in the second doubleword of the destination XMM register from that in the first doubleword of the destination register and stores the difference in the first doubleword of the destination register; subtracts the value in the fourth doubleword of the destination register from that in the third doubleword of the destination register and stores the result in the second doubleword of the destination register; subtracts the value in the second doubleword of the source XMM register or 128-bit memory operand from the first doubleword of the source operand and stores the result in the third doubleword of the destination XMM register; subtracts the single-precision floating-point value in the fourth doubleword of the source operand from the third doubleword of the source operand and stores the result in the fourth doubleword of the destination XMM register.

The HSUBPS instruction is an SSE3 instruction.
The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic:     HSUBPS xmm1, xmm2/mem128
Opcode:       F2 0F 7D /r
Description:  Subtracts the second 32 bits of the destination operand from the first 32 bits of the destination operand and stores the difference in the first doubleword of the destination operand; subtracts the fourth 32 bits of the destination operand from the third 32 bits of the destination operand and stores the difference in the second doubleword of the destination operand; subtracts the second 32 bits of the source operand from the first 32 bits of the source operand and stores the difference in the third doubleword of the destination operand; subtracts the fourth 32 bits of the source operand from the third 32 bits of the source operand and stores the difference in the fourth doubleword of the destination operand.

[Figure: xmm1 and xmm2/mem128, each divided into four doublewords (bits 127–96, 95–64, 63–32, 31–0); four sub operations combine adjacent doubleword pairs.]

Related Instructions
HSUBPD, HADDPD, HADDPS
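The four subtractions described for HSUBPS can be sketched as a scalar reference model in C. This is illustrative only: `hsubps_model` is a hypothetical name, index 0 is assumed to hold the first doubleword (bits 31–0), and MXCSR rounding and exception reporting are not modeled.

```c
/* Scalar model of HSUBPS (hypothetical helper, not part of any library).
 * dst[0] holds the first doubleword of xmm1, dst[3] the fourth.
 * In every pair the higher-numbered element is subtracted from the
 * lower-numbered one. */
void hsubps_model(float dst[4], const float src[4])
{
    const float r0 = dst[0] - dst[1];  /* first - second of destination */
    const float r1 = dst[2] - dst[3];  /* third - fourth of destination */
    const float r2 = src[0] - src[1];  /* first - second of source */
    const float r3 = src[2] - src[3];  /* third - fourth of source */

    dst[0] = r0;
    dst[1] = r1;
    dst[2] = r2;
    dst[3] = r3;
}
```

With a destination of {5, 1, 9, 2} and a source of {10, 4, 8, 8}, the model leaves {4, 7, 6, 0} in the destination.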

rFLAGS Affected
None

MXCSR Flags Affected

Flag  MM  FZ  RC     PM  UM  OM  ZM  DM  IM  DAZ  PE  UE  OE  ZE  DE  IE
Bit   17  15  14-13  12  11  10  9   8   7   6    5   4   3   2   1   0
Mod                                               M   M   M       M   M

Note: A flag that may be set to one or cleared to zero is M (modified). Unaffected flags are blank.

Exceptions

Exception                             Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD                   X     X             X          The SSE3 instructions are not supported, as indicated by ECX bit 0 of CPUID function 0000_0001h.
                                      X     X             X          The emulate bit (EM) of CR0 was set to 1.
                                      X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.
                                      X     X             X          There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT was cleared to 0. See SIMD Floating-Point Exceptions, below, for details.
Device not available, #NM             X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                            X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP               X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                          X          A null data segment was used to reference memory.
                                      X     X             X          The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                             X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC                        X             X          An unaligned memory reference was performed while alignment checking was enabled and MXCSR.MM was set to 1.
SIMD Floating-Point Exception, #XF    X     X             X          There was an unmasked SIMD floating-point exception while CR4.OSXMMEXCPT was set to 1. See SIMD Floating-Point Exceptions, below, for details.

SIMD Floating-Point Exceptions

Exception                             Real  Virtual-8086  Protected  Cause of Exception
Invalid-operation exception (IE)      X     X             X          A source operand was an SNaN value.
                                      X     X             X          +infinity was subtracted from +infinity.
                                      X     X             X          –infinity was subtracted from –infinity.
Denormalized-operand exception (DE)   X     X             X          A source operand was a denormal value.
Overflow exception (OE)               X     X             X          A rounded result was too large to fit into the format of the destination operand.
Underflow exception (UE)              X     X             X          A rounded result was too small to fit into the format of the destination operand.
Precision exception (PE)              X     X             X          A result could not be represented exactly in the destination format.
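
The invalid-opcode (#UD) conditions listed for these SSE3 instructions can be summarized as a small predicate in C. This is a sketch under stated assumptions: the struct `sse3_env` and the functions `sse3_supported` and `raises_ud` are hypothetical names, the caller is assumed to have already obtained the ECX value returned by CPUID function 0000_0001h and the relevant CR0/CR4 bits, and the priority of #UD relative to other exceptions such as #NM is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the processor state that the #UD rows refer to. */
struct sse3_env {
    uint32_t cpuid1_ecx;     /* ECX returned by CPUID function 0000_0001h */
    bool     cr0_em;         /* CR0.EM, the emulate bit */
    bool     cr4_osfxsr;     /* CR4.OSFXSR, FXSAVE/FXRSTOR support bit */
    bool     cr4_osxmmexcpt; /* CR4.OSXMMEXCPT */
};

/* SSE3 support is indicated by ECX bit 0 of CPUID function 0000_0001h. */
bool sse3_supported(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & 1u) != 0;
}

/* Returns true if any of the four listed #UD causes applies. */
bool raises_ud(const struct sse3_env *env, bool unmasked_simd_fp_exception)
{
    if (!sse3_supported(env->cpuid1_ecx))
        return true;   /* SSE3 instructions not supported */
    if (env->cr0_em)
        return true;   /* CR0.EM was set to 1 */
    if (!env->cr4_osfxsr)
        return true;   /* CR4.OSFXSR was cleared to 0 */
    if (unmasked_simd_fp_exception && !env->cr4_osxmmexcpt)
        return true;   /* unmasked SIMD FP exception with OSXMMEXCPT = 0 */
    return false;
}
```

When `raises_ud` returns false and an unmasked SIMD floating-point exception occurs, the tables indicate that #XF is reported instead, because CR4.OSXMMEXCPT is set to 1 in that case.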