TPC Benchmark™ H Standard Specification, Revision 2.17.1
These parameters have to be set once before any query or refresh function is run and left in that setting for the duration of the performance test.

The configuration and initialization of the SUT, the database, or the session, including any relevant parameter, switch or option settings, must be based only on externally documented capabilities of the system that can be reasonably interpreted as useful for an ad-hoc decision support workload. This workload is characterized by:

- Sequential scans of large amounts of data;
- Aggregation of large amounts of data;
- Multi-table joins;
- Possibly extensive sorting.

While the configuration and initialization can reflect the general nature of this expected workload, it shall not take special advantage of the limited functions actually exercised by the benchmark.
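As a concrete illustration, the four workload traits listed above (sequential scan, aggregation, multi-table join, sort) can be sketched with a toy two-table schema in SQLite. The table and column names below are simplified stand-ins chosen for this example; they are not the actual TPC-H schema, data, or queries.

```python
import sqlite3

# Toy two-table schema (hypothetical; not the real TPC-H layout) used to
# illustrate the workload traits: a scan of the larger table, aggregation,
# a multi-table join, and a final sort.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders   (o_orderkey INTEGER PRIMARY KEY, o_orderdate TEXT);
    CREATE TABLE lineitem (l_orderkey INTEGER, l_extendedprice REAL);
""")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, "1995-01-01"), (2, "1995-02-01"), (3, "1996-01-01")])
con.executemany("INSERT INTO lineitem VALUES (?, ?)",
                [(1, 100.0), (1, 50.0), (2, 75.0), (3, 10.0)])

# Ad-hoc style query: join, aggregate, sort -- no per-query tuning or hints.
rows = con.execute("""
    SELECT o.o_orderdate, SUM(l.l_extendedprice) AS revenue
    FROM orders o JOIN lineitem l ON l.l_orderkey = o.o_orderkey
    GROUP BY o.o_orderdate
    ORDER BY revenue DESC
""").fetchall()
print(rows)  # [('1995-01-01', 150.0), ('1995-02-01', 75.0), ('1996-01-01', 10.0)]
```

Note that nothing in this sketch references a specific table or index to hint the optimizer, which is the spirit of the configuration rules that follow.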
The queries actually chosen in the benchmark are merely examples of the types of queries that might be used in such an environment, not necessarily the actual user queries. Due to this limit in the number and scope of the queries and test environment, TPC-H has chosen to restrict the use of some database technologies (see Clause 1.5). In general, the effect of the configuration on benchmark performance should be representative of its expected effect on the performance of the class of applications modeled by the benchmark.

Furthermore, the features, switches or parameter settings that comprise the configuration of the operating system, the DBMS or the session must be such that it would be reasonable to expect a database administrator with the following characteristics to be able to decide to use them:

- Knowledge of the general characteristics of the workload as defined above;
- Knowledge of the logical and physical database layout;
- Access to operating system and database documentation;
- No knowledge of product internals beyond what is externally documented.

Each feature, switch or parameter setting used in the configuration and initialization of the operating system, the DBMS or the session must meet the following criteria:

- It shall remain in effect without change throughout the performance test;
- It shall not make reference to specific tables, indices or queries for the purpose of providing hints to the query optimizer.

5.2.8 The gathering of statistics is part of the database load (see Clause 4.3) but it also serves as an important configuration vehicle, particularly for the query optimizer.
In order to satisfy the requirements of Clause 5.2.7, it is desirable to collect the same quality of statistics for every column of every table. However, in order to reduce processing requirements, it is permissible to segment columns into distinct classes and base the level of statistics collection for a particular column on class membership. Class definitions must rely solely on schema-related attributes of a column and must be applied consistently across all tables. For example:

- Membership in an index;
- Leading or other position in an index;
- Use in a constraint (including primary key or foreign key constraints).

Statistics that operate in sets, such as distribution statistics, should employ a fixed set appropriate to the scale factor used.
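A minimal sketch of such schema-based classing, assuming three hypothetical statistics levels and made-up column and index names (none of these names or levels is prescribed by the specification), could look like this:

```python
# Hypothetical sketch of Clause 5.2.8-style column classing: the statistics
# level for a column is chosen purely from schema-related attributes
# (index and constraint membership), never from knowledge of the data.
def statistics_class(column, leading_index_cols, index_cols, constraint_cols):
    """Map a column name to a statistics-collection level by class membership."""
    if column in leading_index_cols:
        return "full"      # e.g. distribution statistics / histograms
    if column in index_cols or column in constraint_cols:
        return "standard"  # e.g. cardinality and min/max
    return "basic"         # e.g. row counts only

# Toy schema metadata (illustrative names, not the real TPC-H DDL):
leading = {"l_orderkey"}
indexed = {"l_orderkey", "l_partkey"}
keys    = {"l_orderkey", "l_partkey", "l_suppkey"}

for col in ["l_orderkey", "l_partkey", "l_suppkey", "l_comment"]:
    print(col, "->", statistics_class(col, leading, indexed, keys))
```

The point of the sketch is that the decision function takes only schema metadata as input, so it cannot be tailored to the cardinality, values or distribution of any non-key column.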
Knowledge of the cardinality, values or distribution of a non-key column as specified in Clause 4 cannot be used to tailor statistics gathering.

5.2.9 Special rules apply to the use of so-called profile-directed optimization (PDO), in which binary executables are reordered or otherwise optimized to best suit the needs of a particular workload. These rules do not apply to the routine use of PDO by a database vendor in the course of building commercially available and supported database products; such use is not restricted.
Rather, the rules apply to the use of PDO by a test sponsor to optimize executables of a database product for a particular workload. Such optimization is permissible if all of the following conditions are satisfied:

1. The use of PDO or similar procedures by the test sponsor must be disclosed.
2. The procedure and any scripts used to perform the optimization must be disclosed.
3. The procedure used by the test sponsor could reasonably be used by a customer on a shipped database executable.
4. The optimized database executables resulting from the application of the procedure must be supported by the database software vendor.
5. The workload used to drive the optimization is as described in Clause 5.2.10.
6. The same set of DBMS executables must be used for all phases of the benchmark.

5.2.10 If profile-directed optimization is used under the circumstances described in Clause 5.2.9, the workload used to drive it must be the (possibly repeated) execution of Queries 1, 2, 4 and 5 in any order, against a TPC-H database of any desired Scale Factor with default substitution parameters applied.

5.3 Execution Rules

5.3.1 General Rules

5.3.1.1 The driver must submit queries through one or more sessions on the SUT.
Each session corresponds to one, and only one, query stream on the SUT.

5.3.1.2 Parallel activity within the SUT directed toward the execution of a single query (i.e., intra-query parallelism) is not restricted.

5.3.1.3 To measure the performance of a system using the TPC Benchmark™ H, the test sponsor will execute runs composed of:

- A power test, to measure the raw query execution power of the system when connected with a single active user. In this test, a single pair of refresh functions is executed exclusively by a separate refresh stream and scheduled before and after the execution of the queries (see Clause 5.3.3);
- A throughput test, to measure the ability of the system to process the most queries in the least amount of time.
In this test, several pairs of refresh functions are executed exclusively by a separate refresh stream and scheduled as defined by the test sponsor.

Comment: The throughput test is where test sponsors can demonstrate the performance of their systems against a multi-user workload.

5.3.1.4 The performance test follows the load test. However, any system activity that takes place between the completion of the load test (see Clause 5.1.1.2) and the beginning of the performance test is limited to that which is not likely to improve the results of the subsequent performance test. All such activity must be disclosed (see Clause 8.3.7.1). Examples of acceptable activity include but are not limited to:

- Execution of scripts or queries requested by the auditor;
- Processing or archiving of files or timing data gathered during the load test;
- Configuration of performance monitoring tools;
- Execution of simple queries to verify that the database is correctly loaded;
- Taking database backups (if not needed to meet the ACID requirements);
- Rebooting the SUT or restarting the RDBMS.

5.3.1.5 The power test and the throughput test must both be executed under the same conditions, using the same hardware and software configuration and the same data manager and operating system parameters.
All such parameters must be reported.

Comment: The intent of this Clause is to require that both tests (i.e., the power and throughput tests) be run in identical conditions except for the number of query streams and the scheduling of the refresh functions within the refresh stream.

5.3.1.6 For each query, at least one atomic transaction must be started and completed.

Comment: The intent of this Clause is to specifically prohibit the execution of an entire query stream as a single transaction.

5.3.1.7 Each refresh function must consist of at least one atomic transaction. However, logically consistent portions of the refresh functions may be implemented as separate transactions as defined in Clause 2.5.

Comment: The intent of this Clause is to specifically prohibit the execution of multiple refresh functions as a single transaction.
The splitting of each refresh function into multiple transactions is permitted to encourage "trickle" updates performed concurrently with one or more query streams in the throughput test.

5.3.2 Run Sequencing

The performance test consists of two runs. If Run 1 is a failed run (see Clause 5.1.1.6) the benchmark must be restarted with a new load test. If Run 2 is a failed run, it may be restarted without a reload. The reported performance metric must be for the run with the lower TPC-H Composite Query-Per-Hour Performance Metric.
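The run-sequencing rule can be sketched as a small decision helper. The QphH values below are made-up illustrative numbers and the helper names are hypothetical; only the two rules themselves (report the lower of the two runs; a failed Run 1 forces a reload while a failed Run 2 does not) come from the clause above.

```python
# Sketch of Clause 5.3.2: two runs are performed and the REPORTED metric is
# the lower TPC-H Composite Query-per-Hour value of the two.
def reported_metric(run1_qphh, run2_qphh):
    """Return the performance metric that must be reported (the lower run)."""
    return min(run1_qphh, run2_qphh)

def next_action(run_index, failed):
    """A failed Run 1 forces a new load test; a failed Run 2 may be
    restarted without reloading the database."""
    if not failed:
        return "continue"
    return "restart with new load test" if run_index == 1 else "restart without reload"

print(reported_metric(12840.5, 13105.2))  # the lower run is what gets reported
print(next_action(1, True))
print(next_action(2, True))
```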