Volume 3B System Programming Guide_ Part 2 (794104), page 75
Table A-3. Non-Architectural Performance Events in Processors Based on Intel Core Microarchitecture (Contd.)

Requests can be rejected when the L2 cache is busy and resubmitted later or lost. All requests are counted, including those that are rejected.

Event Num: 60H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_REQUEST_OUTSTANDING.(Core and Bus Agents)
Definition: Outstanding cacheable data read bus requests duration
Description and Comment:
This event counts the number of pending full cache line read transactions on the bus occurring in each cycle. A read transaction is pending from the cycle it is sent on the bus until the full cache line is received by the processor.
The event counts only full-line cacheable read requests from either the L1 data cache or the L2 prefetchers. It does not count Read for Ownership transactions, instruction byte fetch transactions, or any other bus transaction.

Event Num: 61H
Umask Value: See Table 18-8
Event Name: BUS_BNR_DRV.(Bus Agents)
Definition: Number of Bus Not Ready signals asserted
Description and Comment:
This event counts the number of Bus Not Ready (BNR) signals that the processor asserts on the bus to suspend additional bus requests by other bus agents. A bus agent asserts the BNR signal when the number of data and snoop transactions is close to the maximum that the bus can handle. To obtain the number of bus cycles during which the BNR signal is asserted, multiply the event count by two.
While this signal is asserted, new transactions cannot be submitted on the bus.
As a result, transaction latency may have higher impact on program performance.

Event Num: 62H
Umask Value: See Table 18-8
Event Name: BUS_DRDY_CLOCKS.(Bus Agents)
Definition: Bus cycles when data is sent on the bus
Description and Comment:
This event counts the number of bus cycles during which the DRDY (Data Ready) signal is asserted on the bus. The DRDY signal is asserted when data is sent on the bus. With the 'THIS_AGENT' mask this event counts the number of bus cycles during which this agent (the processor) writes data on the bus back to memory or to other bus agents. This includes all explicit and implicit data writebacks, as well as partial writes.
With the 'ALL_AGENTS' mask, this event counts the number of bus cycles during which any bus agent sends data on the bus. This includes all data reads and writes on the bus.

Event Num: 63H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_LOCK_CLOCKS.(Core and Bus Agents)
Definition: Bus cycles when a LOCK signal asserted
Description and Comment:
This event counts the number of bus cycles during which the LOCK signal is asserted on the bus.
A LOCK signal is asserted when there is a locked memory access, due to:
• uncacheable memory
• locked operation that spans two cache lines
• page-walk from an uncacheable page table
Bus locks have a very high performance penalty and it is highly recommended to avoid such accesses.

Event Num: 64H
Umask Value: See Table 18-7
Event Name: BUS_DATA_RCV.(Core)
Definition: Bus cycles while processor receives data
Description and Comment:
This event counts the number of bus cycles during which the processor is busy receiving data.

Event Num: 65H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_BRD.(Core and Bus Agents)
Definition: Burst read bus transactions
Description and Comment:
This event counts the number of burst read transactions including:
• L1 data cache read misses (and L1 data cache hardware prefetches)
• L2 hardware prefetches by the DPL and L2 streamer
• IFU read misses of cacheable lines.
It does not include RFO transactions.

Event Num: 66H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_RFO.(Core and Bus Agents)
Definition: RFO bus transactions
Description and Comment:
This event counts the number of Read For Ownership (RFO) bus transactions, due to store operations that miss the L1 data cache and the L2 cache. It also counts RFO bus transactions due to locked operations.

Event Num: 67H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_WB.(Core and Bus Agents)
Definition: Explicit writeback bus transactions
Description and Comment:
This event counts all explicit writeback bus transactions due to dirty line evictions. It does not count implicit writebacks due to invalidation by a snoop request.

Event Num: 68H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_IFETCH.(Core and Bus Agents)
Definition: Instruction fetch bus transactions
Description and Comment:
This event counts all instruction fetch full cache line bus transactions.

Event Num: 69H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_INVAL.(Core and Bus Agents)
Definition: Invalidate bus transactions
Description and Comment:
This event counts all invalidate transactions. Invalidate transactions are generated when:
• A store operation hits a shared line in the L2 cache.
• A full cache line write misses the L2 cache or hits a shared line in the L2 cache.

Event Num: 6AH
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_PWR.(Core and Bus Agents)
Definition: Partial write bus transaction
Description and Comment:
This event counts partial write bus transactions.

Event Num: 6BH
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_P.(Core and Bus Agents)
Definition: Partial bus transactions
Description and Comment:
This event counts all (read and write) partial bus transactions.

Event Num: 6CH
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_IO.(Core and Bus Agents)
Definition: IO bus transactions
Description and Comment:
This event counts the number of completed I/O bus transactions as a result of IN and OUT instructions.
The count does not include memory mapped IO.

Event Num: 6DH
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_DEF.(Core and Bus Agents)
Definition: Deferred bus transactions
Description and Comment:
This event counts the number of deferred transactions.

Event Num: 6EH
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_BURST.(Core and Bus Agents)
Definition: Burst (full cache-line) bus transactions
Description and Comment:
This event counts burst (full cache line) transactions including:
• Burst reads
• RFOs
• Explicit writebacks
• Write combine lines

Event Num: 6FH
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_MEM.(Core and Bus Agents)
Definition: Memory bus transactions
Description and Comment:
This event counts all memory bus transactions including:
• Burst transactions
• Partial reads and writes
• Invalidate transactions
The BUS_TRANS_MEM count is the sum of BUS_TRANS_BURST, BUS_TRANS_P and BUS_TRANS_INVAL.

Event Num: 70H
Umask Value: See Table 18-7 and Table 18-8
Event Name: BUS_TRANS_ANY.(Core and Bus Agents)
Definition: All bus transactions
Description and Comment:
This event counts all bus transactions. This includes:
• Memory transactions
• IO transactions (non memory-mapped)
• Deferred transaction completion
• Other less frequent transactions, such as interrupts

Event Num: 77H
Umask Value: See Table 18-7 and Table 18-11
Event Name: EXT_SNOOP.(Bus Agents, Snoop Response)
Definition: External snoops
Description and Comment:
This event counts the snoop responses to bus transactions. Responses can be counted separately by type and by bus agent.
With the 'THIS_AGENT' mask, the event counts snoop responses from this processor to bus transactions sent by this processor. With the 'ALL_AGENTS' mask the event counts all snoop responses seen on the bus.

Event Num: 78H
Umask Value: See Table 18-7 and Table 18-12
Event Name: CMP_SNOOP.(Core, Snoop Type)
Definition: L1 data cache snooped by other core
Description and Comment:
This event counts the number of times the L1 data cache is snooped for a cache line that is needed by the other core in the same processor. The cache line is either missing in the L1 instruction or data caches of the other core, or is available for reading only and the other core wishes to write the cache line.
The snoop operation may change the cache line state.
If the other core issued a read request that hit this core in E state, typically the state changes to S state in this core. If the other core issued a read for ownership request (due to a write miss or hit to S state) that hits this core's cache line in E or S state, this typically results in invalidation of the cache line in this core. If the snoop hits a line in M state, the state is changed at a later opportunity.
These snoops are performed through the L1 data cache store port. Therefore, frequent snoops may conflict with extensive stores to the L1 data cache, which may increase store latency and impact performance.

Event Num: 7AH
Umask Value: See Table 18-8
Event Name: BUS_HIT_DRV.(Bus Agents)
Definition: HIT signal asserted
Description and Comment:
This event counts the number of bus cycles during which the processor drives the HIT# pin to signal HIT snoop response.

Event Num: 7BH
Umask Value: See Table 18-8
Event Name: BUS_HITM_DRV.(Bus Agents)
Definition: HITM signal asserted
Description and Comment:
This event counts the number of bus cycles during which the processor drives the HITM# pin to signal HITM snoop response.

Event Num: 7DH
Umask Value: See Table 18-7
Event Name: BUSQ_EMPTY.(Core)
Definition: Bus queue empty
Description and Comment:
This event counts the number of cycles during which the core did not have any pending transactions in the bus queue.
It also counts when the core is halted and the other core is not halted.
This event can count occurrences for this core or both cores.

Event Num: 7EH
Umask Value: See Table 18-7 and Table 18-8
Event Name: SNOOP_STALL_DRV.(Core and Bus Agents)
Definition: Bus stalled for snoops
Description and Comment:
This event counts the number of times that the bus snoop stall signal is asserted. To obtain the number of bus cycles during which snoops on the bus are prohibited, multiply the event count by two.
During the snoop stall cycles, no new bus transactions requiring a snoop response can be initiated on the bus. A bus agent asserts a snoop stall signal if it cannot respond to a snoop request within three bus cycles.

Event Num: 7FH
Umask Value: See Table 18-7
Event Name: BUS_IO_WAIT.(Core)
Definition: IO requests waiting in the bus queue
Description and Comment:
This event counts the number of core cycles during which IO requests wait in the bus queue.
With the SELF modifier this event counts IO requests per core.
With the BOTH_CORE modifier, this event increments by one for any cycle for which there is a request from either core.

Event Num: 80H
Umask Value: 00H
Event Name: L1I_READS
Definition: Instruction fetches
Description and Comment:
This event counts all instruction fetches, including uncacheable fetches that bypass the Instruction Fetch Unit (IFU).

Event Num: 81H
Umask Value: 00H
Event Name: L1I_MISSES
Definition: Instruction Fetch Unit misses
Description and Comment:
This event counts all instruction fetches that miss the Instruction Fetch Unit (IFU) or produce memory requests.
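
Several descriptions in this part of the table give arithmetic for turning raw counts into other quantities: the BUS_BNR_DRV and SNOOP_STALL_DRV counts are multiplied by two to obtain bus cycles, and BUS_TRANS_MEM is described as the sum of three other events. The following Python sketch illustrates only that arithmetic. The function names and the sample counts are hypothetical and not part of the manual; the event numbers are copied from the Event Num column. It assumes that the counts being combined were collected over the same interval with the same umask settings.

```python
"""Illustrative arithmetic on raw event counts (a sketch, not a tool)."""

# Event numbers copied from the Event Num column of the table.
EVENT_NUM = {
    "BUS_BNR_DRV": 0x61,
    "BUS_TRANS_INVAL": 0x69,
    "BUS_TRANS_P": 0x6B,
    "BUS_TRANS_BURST": 0x6E,
    "BUS_TRANS_MEM": 0x6F,
    "SNOOP_STALL_DRV": 0x7E,
}


def bnr_asserted_bus_cycles(bus_bnr_drv_count: int) -> int:
    """Bus cycles with BNR asserted: the event count multiplied by two."""
    return 2 * bus_bnr_drv_count


def snoop_stall_bus_cycles(snoop_stall_drv_count: int) -> int:
    """Bus cycles with snoops prohibited: the event count multiplied by two."""
    return 2 * snoop_stall_drv_count


def bus_trans_mem_from_parts(burst: int, partial: int, inval: int) -> int:
    """BUS_TRANS_MEM as the sum of BUS_TRANS_BURST, BUS_TRANS_P and
    BUS_TRANS_INVAL, per the table's description."""
    return burst + partial + inval


if __name__ == "__main__":
    # Hypothetical raw counts, invented purely for illustration.
    sample = {
        "BUS_BNR_DRV": 1500,
        "SNOOP_STALL_DRV": 40,
        "BUS_TRANS_BURST": 10_000,
        "BUS_TRANS_P": 250,
        "BUS_TRANS_INVAL": 75,
    }
    print("BNR bus cycles:", bnr_asserted_bus_cycles(sample["BUS_BNR_DRV"]))
    print("Snoop stall bus cycles:",
          snoop_stall_bus_cycles(sample["SNOOP_STALL_DRV"]))
    print("Expected BUS_TRANS_MEM:",
          bus_trans_mem_from_parts(sample["BUS_TRANS_BURST"],
                                   sample["BUS_TRANS_P"],
                                   sample["BUS_TRANS_INVAL"]))
```

Comparing a measured BUS_TRANS_MEM count against the value computed from its three parts can serve as a rough sanity check on a measurement setup. Small differences are plausible when the counters are not sampled at exactly the same moment.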