Volume 1 Basic Architecture (794100), page 35

Intel and AMD manuals

Text from the file (page 35)

INSTRUCTION SET SUMMARY

ADDSD       Add scalar double-precision floating-point values
SUBPD       Subtract packed double-precision floating-point values
SUBSD       Subtract scalar double-precision floating-point values
MULPD       Multiply packed double-precision floating-point values
MULSD       Multiply scalar double-precision floating-point values
DIVPD       Divide packed double-precision floating-point values
DIVSD       Divide scalar double-precision floating-point values
SQRTPD      Compute packed square roots of packed double-precision floating-point values
SQRTSD      Compute scalar square root of scalar double-precision floating-point values
MAXPD       Return maximum packed double-precision floating-point values
MAXSD       Return maximum scalar double-precision floating-point values
MINPD       Return minimum packed double-precision floating-point values
MINSD       Return minimum scalar double-precision floating-point values

5.6.1.3 SSE2 Logical Instructions

SSE2 logical instructions perform AND, AND NOT, OR, and XOR operations on packed double-precision floating-point values.

ANDPD       Perform bitwise logical AND of packed double-precision floating-point values
ANDNPD      Perform bitwise logical AND NOT of packed double-precision floating-point values
ORPD        Perform bitwise logical OR of packed double-precision floating-point values
XORPD       Perform bitwise logical XOR of packed double-precision floating-point values

5.6.1.4 SSE2 Compare Instructions

SSE2 compare instructions compare packed and scalar double-precision floating-point values and return the results of the comparison either to the destination operand or to the EFLAGS register.

CMPPD       Compare packed double-precision floating-point values
CMPSD       Compare scalar double-precision floating-point values
COMISD      Perform ordered comparison of scalar double-precision floating-point values and set flags in EFLAGS register
UCOMISD     Perform unordered comparison of scalar double-precision floating-point values and set flags in EFLAGS register
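When CMPPD writes its result to the destination operand, each 64-bit element becomes an all-ones mask if the predicate holds and all zeros otherwise. The sketch below models, in plain Python, how such a mask might be combined with ANDPD, ANDNPD, and ORPD to pick elements without branching. It is only an illustration of the semantics: the function names (`cmplt_pd`, `and_pd`, `andn_pd`, `or_pd`, `select_min`) are hypothetical helpers, registers are modeled as two-element lists, and only the less-than predicate is shown.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF  # all-ones 64-bit element


def bits(x: float) -> int:
    """Reinterpret a double as its raw 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def to_double(b: int) -> float:
    """Reinterpret a raw 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def cmplt_pd(a, b):
    """Model of CMPPD with the less-than predicate: one mask per element."""
    return [MASK64 if x < y else 0 for x, y in zip(a, b)]


def and_pd(m, v):
    """Model of ANDPD: bitwise m AND v."""
    return [x & y for x, y in zip(m, v)]


def andn_pd(m, v):
    """Model of ANDNPD: bitwise (NOT m) AND v."""
    return [(~x & MASK64) & y for x, y in zip(m, v)]


def or_pd(a, b):
    """Model of ORPD: bitwise a OR b."""
    return [x | y for x, y in zip(a, b)]


def select_min(a, b):
    """Branchless per-element selection: a[i] where a[i] < b[i], else b[i]."""
    mask = cmplt_pd(a, b)
    a_bits = [bits(x) for x in a]
    b_bits = [bits(x) for x in b]
    picked = or_pd(and_pd(mask, a_bits), andn_pd(mask, b_bits))
    return [to_double(p) for p in picked]


print(select_min([1.0, 8.0], [4.0, 2.0]))  # [1.0, 2.0]
```

In real code MINPD would do this directly; the point here is only to show what the mask returned in the destination operand looks like and why the AND NOT form is useful alongside it.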

5.6.1.5 SSE2 Shuffle and Unpack Instructions

SSE2 shuffle and unpack instructions shuffle or interleave double-precision floating-point values in packed double-precision floating-point operands.

SHUFPD      Shuffles values in packed double-precision floating-point operands
UNPCKHPD    Unpacks and interleaves the high values from two packed double-precision floating-point operands
UNPCKLPD    Unpacks and interleaves the low values from two packed double-precision floating-point operands

5.6.1.6 SSE2 Conversion Instructions

SSE2 conversion instructions convert packed and individual doubleword integers into packed and scalar double-precision floating-point values and vice versa. They also convert between packed and scalar single-precision and double-precision floating-point values.

CVTPD2PI    Convert packed double-precision floating-point values to packed doubleword integers
CVTTPD2PI   Convert with truncation packed double-precision floating-point values to packed doubleword integers
CVTPI2PD    Convert packed doubleword integers to packed double-precision floating-point values
CVTPD2DQ    Convert packed double-precision floating-point values to packed doubleword integers
CVTTPD2DQ   Convert with truncation packed double-precision floating-point values to packed doubleword integers
CVTDQ2PD    Convert packed doubleword integers to packed double-precision floating-point values
CVTPS2PD    Convert packed single-precision floating-point values to packed double-precision floating-point values
CVTPD2PS    Convert packed double-precision floating-point values to packed single-precision floating-point values
CVTSS2SD    Convert scalar single-precision floating-point values to scalar double-precision floating-point values
CVTSD2SS    Convert scalar double-precision floating-point values to scalar single-precision floating-point values
CVTSD2SI    Convert scalar double-precision floating-point values to a doubleword integer
CVTTSD2SI   Convert with truncation scalar double-precision floating-point values to scalar doubleword integers
CVTSI2SD    Convert doubleword integer to scalar double-precision floating-point value
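The difference between the plain and "with truncation" conversions above is easiest to see on values that sit between two integers. The sketch below is a rough Python model, assuming the default MXCSR rounding mode (round to nearest, ties to even) for CVTSD2SI. The function names are hypothetical, and out-of-range inputs and NaNs are deliberately not modeled.

```python
import math


def cvtsd2si_model(x: float) -> int:
    """Model of CVTSD2SI under the default rounding mode (nearest, ties to even)."""
    return round(x)  # Python's round() also rounds ties to even


def cvttsd2si_model(x: float) -> int:
    """Model of CVTTSD2SI: always truncates toward zero."""
    return math.trunc(x)


for value in (2.5, 3.5, -2.7):
    print(value, cvtsd2si_model(value), cvttsd2si_model(value))
# 2.5 2 2
# 3.5 4 3
# -2.7 -3 -2
```

The truncating forms match the float-to-integer conversion rule of languages such as C, which is why compilers can use them without changing the rounding mode first.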

5.6.2 SSE2 Packed Single-Precision Floating-Point Instructions

SSE2 packed single-precision floating-point instructions perform conversion operations on single-precision floating-point and integer operands. These instructions represent enhancements to the SSE single-precision floating-point instructions.

CVTDQ2PS    Convert packed doubleword integers to packed single-precision floating-point values
CVTPS2DQ    Convert packed single-precision floating-point values to packed doubleword integers
CVTTPS2DQ   Convert with truncation packed single-precision floating-point values to packed doubleword integers

5.6.3 SSE2 128-Bit SIMD Integer Instructions

SSE2 SIMD integer instructions perform additional operations on packed words, doublewords, and quadwords contained in XMM and MMX registers.

MOVDQA      Move aligned double quadword
MOVDQU      Move unaligned double quadword
MOVQ2DQ     Move quadword integer from MMX to XMM registers
MOVDQ2Q     Move quadword integer from XMM to MMX registers
PMULUDQ     Multiply packed unsigned doubleword integers
PADDQ       Add packed quadword integers
PSUBQ       Subtract packed quadword integers
PSHUFLW     Shuffle packed low words
PSHUFHW     Shuffle packed high words
PSHUFD      Shuffle packed doublewords
PSLLDQ      Shift double quadword left logical
PSRLDQ      Shift double quadword right logical
PUNPCKHQDQ  Unpack high quadwords
PUNPCKLQDQ  Unpack low quadwords

5.6.4 SSE2 Cacheability Control and Ordering Instructions

SSE2 cacheability control instructions provide additional operations for caching of non-temporal data when storing data from XMM registers to memory. LFENCE and MFENCE provide additional control of instruction ordering on store operations.

CLFLUSH     Flushes and invalidates a memory operand and its associated cache line from all levels of the processor's cache hierarchy
LFENCE      Serializes load operations
MFENCE      Serializes load and store operations
PAUSE       Improves the performance of "spin-wait loops"
MASKMOVDQU  Non-temporal store of selected bytes from an XMM register into memory
MOVNTPD     Non-temporal store of two packed double-precision floating-point values from an XMM register into memory
MOVNTDQ     Non-temporal store of double quadword from an XMM register into memory
MOVNTI      Non-temporal store of a doubleword from a general-purpose register into memory

5.7 SSE3 INSTRUCTIONS

The SSE3 extensions offer 13 instructions that accelerate performance of Streaming SIMD Extensions technology, Streaming SIMD Extensions 2 technology, and x87-FP math capabilities.

These instructions can be grouped into the following categories:

• One x87 FPU instruction used in integer conversion
• One SIMD integer instruction that addresses unaligned data loads
• Two SIMD floating-point packed ADD/SUB instructions
• Four SIMD floating-point horizontal ADD/SUB instructions
• Three SIMD floating-point LOAD/MOVE/DUPLICATE instructions
• Two thread synchronization instructions

SSE3 instructions can only be executed on Intel 64 and IA-32 processors that support SSE3 extensions. Support for these instructions can be detected with the CPUID instruction. See the description of the CPUID instruction in Chapter 3, "Instruction Set Reference, A-M," of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2A.

The sections that follow describe each subgroup.

5.7.1 SSE3 x87-FP Integer Conversion Instruction

FISTTP      Behaves like the FISTP instruction but uses truncation, irrespective of the rounding mode specified in the floating-point control word (FCW)

5.7.2 SSE3 Specialized 128-bit Unaligned Data Load Instruction

LDDQU       Special 128-bit unaligned load designed to avoid cache line splits

5.7.3 SSE3 SIMD Floating-Point Packed ADD/SUB Instructions

ADDSUBPS    Performs single-precision addition on the second and fourth pairs of 32-bit data elements within the operands; single-precision subtraction on the first and third pairs
ADDSUBPD    Performs double-precision addition on the second pair of quadwords, and double-precision subtraction on the first pair

5.7.4 SSE3 SIMD Floating-Point Horizontal ADD/SUB Instructions

HADDPS      Performs a single-precision addition on contiguous data elements. The first data element of the result is obtained by adding the first and second elements of the first operand; the second element by adding the third and fourth elements of the first operand; the third by adding the first and second elements of the second operand; and the fourth by adding the third and fourth elements of the second operand.
HSUBPS      Performs a single-precision subtraction on contiguous data elements. The first data element of the result is obtained by subtracting the second element of the first operand from the first element of the first operand; the second element by subtracting the fourth element of the first operand from the third element of the first operand; the third by subtracting the second element of the second operand from the first element of the second operand; and the fourth by subtracting the fourth element of the second operand from the third element of the second operand.
HADDPD      Performs a double-precision addition on contiguous data elements. The first data element of the result is obtained by adding the first and second elements of the first operand; the second element by adding the first and second elements of the second operand.
HSUBPD      Performs a double-precision subtraction on contiguous data elements. The first data element of the result is obtained by subtracting the second element of the first operand from the first element of the first operand; the second element by subtracting the second element of the second operand from the first element of the second operand.

5.7.5 SSE3 SIMD Floating-Point LOAD/MOVE/DUPLICATE Instructions

MOVSHDUP    Loads/moves 128 bits; duplicating the second and fourth 32-bit data elements
MOVSLDUP    Loads/moves 128 bits; duplicating the first and third 32-bit data elements
MOVDDUP     Loads/moves 64 bits (bits[63:0] if the source is a register) and returns the same 64 bits in both the lower and upper halves of the 128-bit result register; duplicates the 64 bits from the source

5.7.6 SSE3 Agent Synchronization Instructions

MONITOR     Sets up an address range used to monitor write-back stores
MWAIT       Enables a logical processor to enter into an optimized state while waiting for a write-back store to the address range set up by the MONITOR instruction

5.8 SUPPLEMENTAL STREAMING SIMD EXTENSIONS 3 (SSSE3) INSTRUCTIONS

SSSE3 provides 32 instructions (represented by 14 mnemonics) to accelerate computations on packed integers.

These include:

• Twelve instructions that perform horizontal addition or subtraction operations.
• Six instructions that evaluate absolute values.
• Two instructions that perform multiply and add operations and speed up the evaluation of dot products.
• Two instructions that accelerate packed-integer multiply operations and produce integer values with scaling.
• Two instructions that perform a byte-wise, in-place shuffle according to the second shuffle control operand.
• Six instructions that negate packed integers in the destination operand if the corresponding element in the source operand is less than zero.
• Two instructions that align data from the composite of two operands.

SSSE3 instructions can only be executed on Intel 64 and IA-32 processors that support SSSE3 extensions. Support for these instructions can be detected with the CPUID instruction.
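As an illustration of the CPUID-based detection mentioned above: SSE3, MONITOR/MWAIT, and SSSE3 support are reported in the ECX register returned by CPUID leaf 01H (bits 0, 3, and 9 respectively). Python cannot execute CPUID itself, so the sketch below only decodes an ECX value obtained elsewhere. The function name `decode_features` and the sample value are hypothetical and are not taken from the manual.

```python
# Bit positions in ECX as returned by CPUID with EAX = 01H.
SSE3_BIT = 0      # SSE3 extensions
MONITOR_BIT = 3   # MONITOR/MWAIT
SSSE3_BIT = 9     # Supplemental SSE3 extensions


def decode_features(ecx: int) -> dict:
    """Decode selected feature flags from a CPUID.01H:ECX value."""
    return {
        "sse3": bool((ecx >> SSE3_BIT) & 1),
        "monitor": bool((ecx >> MONITOR_BIT) & 1),
        "ssse3": bool((ecx >> SSSE3_BIT) & 1),
    }


sample_ecx = 0x00000201  # hypothetical value: bits 0 and 9 set
print(decode_features(sample_ecx))
# {'sse3': True, 'monitor': False, 'ssse3': True}
```

Note that MONITOR/MWAIT has its own feature bit, so the presence of the other SSE3 instructions should not be taken to imply it.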
