Volume 1 Application Programming (794095), page 61

Intel and AMD manuals

Organize frequently accessed constants and coefficients into cache-line-size blocks and prefetch them. Procedures that access data arranged in memory-bus-sized blocks, or memory-burst-sized blocks, can make optimum use of the available memory bandwidth.

For data that will be used only once in a procedure, consider using non-cacheable memory. Accesses to such memory are not burdened by the overhead of cache protocols.

5.15.6 Prefetch Data

Media applications typically operate on large data sets. Because of this, they make intensive use of the memory bus. Memory latency can be substantially reduced, especially for data that will be used only once, by prefetching such data into various levels of the cache hierarchy. Software can use the PREFETCHx instructions very effectively in such cases, as described in “Cache and Memory Management” on page 66.

Some of the best places to use prefetch instructions are inside loops that process large amounts of data. If the loop goes through less than one cache line of data per iteration, partially unroll the loop to obtain multiple iterations of the loop within a cache line. Try to use virtually all of the prefetched data.
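As a rough illustration of the loop advice above, the following C sketch sums an array one cache line at a time and issues one prefetch hint per line. It is not taken from the manual: the 64-byte line size, the prefetch distance of four lines, the function name sum_floats_prefetch, and the use of the GCC/Clang builtin __builtin_prefetch (which the compiler typically lowers to a PREFETCHx instruction on x86 targets) are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; tune them for the actual target. */
#define CACHE_LINE_BYTES  64u
#define FLOATS_PER_LINE   (CACHE_LINE_BYTES / sizeof(float))
#define PREFETCH_DISTANCE (4u * FLOATS_PER_LINE)  /* hypothetical: 4 lines ahead */

/* Sums n floats with unit-stride accesses, one cache line per outer pass. */
float sum_floats_prefetch(const float *data, size_t n)
{
    assert(data != NULL || n == 0);

    float sum = 0.0f;
    size_t i = 0;

    for (; i + FLOATS_PER_LINE <= n; i += FLOATS_PER_LINE) {
        /* Hint only: read access (0), no temporal locality (0), which
           suits data that is used once. Skipped near the end so the
           address always lies inside the array. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);

        /* Consume every element of the current line. */
        for (size_t j = 0; j < FLOATS_PER_LINE; j++)
            sum += data[i + j];
    }

    /* Remaining elements that do not fill a whole line. */
    for (; i < n; i++)
        sum += data[i];

    return sum;
}
```

The inner fixed-length loop stands in for the partial unrolling described above: each outer iteration touches a full line, so one prefetch per iteration is enough. The sketch does not align the array to a line boundary, and whether the hint helps at all depends on the processor and the access pattern, so it should be measured rather than assumed.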

Using all of the prefetched data usually requires unit-stride memory accesses, that is, accesses to contiguous memory locations.

5.15.7 Retain Intermediate Results in MMX™ Registers

Keep intermediate results in the MMX registers as much as possible, especially if the intermediate results are used shortly after they have been produced. Avoid spilling intermediate results to memory and reusing them shortly thereafter.

AMD64 Technology, 24592, Rev. 3.13, July 2007

6 x87 Floating-Point Programming

This chapter describes the x87 floating-point programming model. This model supports all aspects of the legacy x87 floating-point model and complies with the IEEE 754 and 854 standards for binary floating-point arithmetic. In hardware implementations of the AMD64 architecture, support for specific features of the x87 programming model is indicated by the CPUID feature bits, as described in “Feature Detection” on page 279.

6.1 Overview

Floating-point software is typically written to manipulate numbers that are very large or very small, that require a high degree of precision, or that result from complex mathematical operations, such as transcendentals.

Applications that take advantage of floating-point operations include geometric calculations for graphics acceleration, scientific, statistical, and engineering applications, and process control.

6.1.1 Capabilities

The advantages of using x87 floating-point instructions include:

• Representation of all numbers in common IEEE-754/854 formats, ensuring replicability of results across all platforms that conform to IEEE-754/854 standards.
• Availability of separate floating-point registers. Depending on the hardware implementation of the architecture, this may allow execution of x87 floating-point instructions in parallel with execution of general-purpose and 128-bit media instructions.
• Availability of instructions that compute absolute value, change-of-sign, round-to-integer, partial remainder, and square root.
• Availability of instructions that compute transcendental values, including 2^x - 1, cosine, partial arctangent, partial tangent, sine, sine with cosine, y*log2(x), and y*log2(x+1). The cosine, partial arctangent, sine, and sine with cosine instructions use angular values expressed in radians for operands and results.
• Availability of instructions that load common constants, such as log2(e), log2(10), log10(2), loge(2), Pi, 1, and 0.

x87 instructions operate on data in three floating-point formats (32-bit single-precision, 64-bit double-precision, and 80-bit double-extended-precision, sometimes called extended precision) as well as integer and 80-bit packed-BCD formats.

x87 instructions carry out all computations using the 80-bit double-extended-precision format.
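The practical difference between the formats can be sketched in C. The example below is an illustration under assumptions, not part of the manual: it assumes a toolchain where long double maps to the 80-bit double-extended format (typical for GCC and Clang on x86, but not guaranteed elsewhere), it assumes the x87 precision control is left at extended precision, and the function names are invented for the example. It adds 2^-60 to 1.0. That sum needs 61 significand bits, so it survives in the 64-bit significand of the extended format but is rounded away in the 53-bit significand of double precision.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if 1 + 2^-60 is distinguishable from 1 in double precision. */
int double_keeps_small_term(void)
{
    /* The sketch assumes IEEE double precision with a 53-bit significand. */
    assert(DBL_MANT_DIG == 53);

    /* volatile forces every value through a 64-bit memory slot, so no
       wider register precision can leak into the comparison. */
    volatile double one  = 1.0;
    volatile double tiny = 0x1p-60;      /* 2^-60 as a C99 hex literal */
    volatile double sum  = one + tiny;   /* rounds back to 1.0 */
    return sum != one;
}

/* Returns 1 if 1 + 2^-60 is distinguishable from 1 in long double.
   Only meaningful where LDBL_MANT_DIG is at least 61 (64 for the
   80-bit double-extended format). */
int extended_keeps_small_term(void)
{
    volatile long double one  = 1.0L;
    volatile long double tiny = 0x1p-60L;
    volatile long double sum  = one + tiny;
    return sum != one;
}
```

On a target where LDBL_MANT_DIG is 64, double_keeps_small_term() returns 0 and extended_keeps_small_term() returns 1. Where long double is only as wide as double, both return 0, which is why LDBL_MANT_DIG should be checked before drawing conclusions.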

When an x87 instruction reads a number from memory in 80-bit double-extended-precision format, the number can be used directly in computations, without conversion. When an x87 instruction reads a number in a format other than double-extended-precision format, the processor first converts the number into double-extended-precision format. The processor can convert numbers back to specific formats, or leave them in double-extended-precision format when writing them to memory.

Most x87 operations for addition, subtraction, multiplication, and division specify two source operands, the first of which is replaced by the result. Instructions for subtraction and division have reverse forms which swap the ordering of operands.

6.1.2 Origins

In 1979, AMD introduced the first floating-point coprocessor for microprocessors, the AM9511 arithmetic circuit.

This coprocessor performed 32-bit floating-point operations under microprocessor control. In 1980, AMD introduced the AM9512, which performed 64-bit floating-point operations. These coprocessors were second-sourced as the 8231 and 8232 coprocessors. Before then, programmers working with general-purpose microprocessors had to use much slower, vendor-supplied software libraries for their floating-point needs.

In 1985, the Institute of Electrical and Electronics Engineers published the IEEE Standard for Binary Floating-Point Arithmetic, also referred to as the ANSI/IEEE Std 754-1985 standard, or IEEE 754. This standard defines the data types, operations, and exception-handling methods that are the basis for the x87 floating-point technology implemented in the legacy x86 architecture. In 1987, the IEEE published a more general radix-independent version of that standard, called the ANSI/IEEE Std 854-1987 standard, or IEEE 854 for short. The AMD64 architecture complies with both the IEEE 754 and IEEE 854 standards.

6.1.3 Compatibility

x87 floating-point instructions can be executed in any of the architecture’s operating modes.

Existing x87 binary programs run in legacy and compatibility modes without modification. The support provided by the AMD64 architecture for such binaries is identical to that provided by legacy x86 architectures.

To run in 64-bit mode, x87 floating-point programs must be recompiled. The recompilation has no side effects on such programs, other than to make available the extended general-purpose registers and 64-bit virtual address space.

6.2 Registers

Operands for the x87 instructions are located in x87 registers or memory. Figure 6-1 on page 239 shows an overview of the x87 registers.

[Figure 6-1. x87 Registers: x87 data registers FPR0–FPR7 (bits 79–0); Instruction Pointer (rIP) and Data Pointer (rDP) (bits 63–0); Control Word, Status Word, and Tag Word (bits 15–0); Opcode (bits 10–0)]

These registers include eight 80-bit data registers, three 16-bit registers that hold the x87 control word, status word, and tag word, two 64-bit registers that hold instruction and data pointers, and an 11-bit register that holds a permutation of an x87 opcode.

6.2.1 x87 Data Registers

Figure 6-2 on page 240 shows the eight 80-bit data registers in more detail.

Typically, x87 instructions reference these registers as a stack. x87 instructions store operands only in these 80-bit registers or in memory. They do not (with two exceptions) access the GPR registers, and they do not access the XMM registers.

[Figure 6-2. x87 Physical and Stack Registers: with TOP = 2 in bits 13–11 of the x87 status word, ST(0)–ST(5) map to FPR2–FPR7, ST(6) maps to FPR0, and ST(7) maps to FPR1; each register holds bits 79–0]

Stack Organization. The bank of eight physical data registers, FPR0–FPR7, is organized internally as a stack, ST(0)–ST(7). The stack functions like a circular modulo-8 buffer. The stack top can be set by software to start at any register position in the bank. Many instructions access the top of stack as well as individual registers relative to the top of stack.

Stack Pointer. Bits 13–11 of the x87 status word (“x87 Status Word Register (FSW)” on page 241) are the top-of-stack pointer (TOP). The TOP specifies the mapping of the stack registers onto the physical registers. The TOP contains the physical-register index of the location of the top of stack, ST(0). Instructions that load operands from memory into an x87 register first decrement the stack pointer and then copy the operand (often with conversion to the double-extended-precision format) from memory into the decremented top-of-stack register. Instructions that store operands from an x87 register to memory copy the operand (often with conversion from the double-extended-precision format) in the top-of-stack register to memory and then increment the stack pointer.

Figure 6-2 shows the mapping between stack registers and physical registers when the TOP has the value 2. Modulo-8 wraparound addressing is used. Pushing a new element onto this stack, for example with the FLDZ (floating-point load +0.0) instruction, decrements the TOP to 1, so that ST(0) refers to FPR1, and the new top-of-stack is loaded with +0.0.

The architecture provides alternative versions of many instructions that either modify or do not modify the TOP as a side effect.
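The TOP arithmetic described above can be sketched as a small software model in C. This is a simplified illustration, not the hardware behavior in full: the type X87Model and the x87_* function names are hypothetical, only the TOP field of the status word is modelled, and the tag word, stack-overflow and stack-underflow exceptions, and format conversion are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: eight physical registers plus a status word in
   which only the TOP field (bits 13-11) is used. */
typedef struct {
    long double fpr[8];   /* FPR0..FPR7 */
    uint16_t    fsw;      /* x87 status word */
} X87Model;

/* Extract TOP from bits 13-11 of the status word. */
unsigned x87_top(const X87Model *m)
{
    return (m->fsw >> 11) & 7u;
}

/* Write TOP back into bits 13-11, leaving the other bits untouched. */
void x87_set_top(X87Model *m, unsigned top)
{
    m->fsw = (uint16_t)((m->fsw & ~(7u << 11)) | ((top & 7u) << 11));
}

/* Physical register index that ST(i) refers to, with modulo-8 wraparound. */
unsigned x87_phys(const X87Model *m, unsigned i)
{
    assert(i < 8);
    return (x87_top(m) + i) & 7u;
}

/* Load: decrement TOP first, then write the new top-of-stack register. */
void x87_push(X87Model *m, long double value)
{
    x87_set_top(m, x87_top(m) - 1u);   /* 0 wraps to 7 after masking */
    m->fpr[x87_top(m)] = value;
}

/* Store-and-pop: read the top-of-stack register, then increment TOP. */
long double x87_pop(X87Model *m)
{
    long double value = m->fpr[x87_top(m)];
    x87_set_top(m, x87_top(m) + 1u);   /* 7 wraps to 0 after masking */
    return value;
}
```

With TOP set to 2, x87_phys maps ST(0) to FPR2, ST(6) to FPR0, and ST(7) to FPR1, matching Figure 6-2. Calling x87_push(&m, 0.0L) then plays the role of FLDZ in the example: TOP becomes 1 and ST(0) refers to FPR1.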
