TPC BENCHMARK (TM) H (779138), page 16
For example, if the test database employs horizontal partitioning (see Clause 1.5.4), then the qualification database must also employ horizontal partitioning, though the number of partitions may differ in each case. As another example, the qualification database could be configured such that it uses a representative sub-set of the processors/cores/threads, memory and disks used by the test database configuration. If the qualification database configuration differs from the test database configuration in any way, the differences must be disclosed (see Clause 8.3.6.8).

4.1.2.2 The population of the qualification database must be exactly equal to a scale factor, SF, of 1 (see Clause 4.1.3 for a definition of SF).

4.1.3 Database Scaling Requirements

4.1.3.1 Scale factors used for the test database must be chosen from the set of fixed scale factors defined as follows:

1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000

The database size is defined with reference to scale factor 1 (i.e., SF = 1; approximately 1GB as per Clause 4.2.5), the minimum required size for a test database.
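As an illustration only, the fixed set of scale factors in Clause 4.1.3.1 can be written as a simple membership check. This is a minimal sketch in Python and not part of the specification; the names VALID_SCALE_FACTORS and is_valid_scale_factor are hypothetical.

```python
# Fixed scale factors listed in Clause 4.1.3.1.
VALID_SCALE_FACTORS = (1, 10, 30, 100, 300, 1000, 3000, 10000, 30000, 100000)


def is_valid_scale_factor(sf: int) -> bool:
    """Return True if sf is one of the fixed scale factors for a test database."""
    return sf in VALID_SCALE_FACTORS
```

Under this sketch, a value that is not in the set (for example 3) is rejected, while 1 and 100000 are accepted.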
Therefore, the following series of database sizes corresponds to the series of scale factors and must be used in the metric names QphH@Size and Price-per-QphH@Size (see Clause 5.4), as well as in the executive summary statement (see Appendix E):

1GB, 10GB, 30GB, 100GB, 300GB, 1000GB, 3000GB, 10000GB, 30000GB, 100000GB

Where GB stands for gigabyte, defined to be 2^30 bytes.

Comment 1: Although the minimum size of the test database for a valid performance test is 1GB (i.e., SF = 1), a test database of 3GB (i.e., SF = 3) is not permitted. This requirement is intended to encourage comparability of results at the low end and to ensure a substantial actual difference in test database sizes.

TPC Benchmark(TM) H Standard Specification Revision 2.17.1, Page 79

Comment 2: The maximum size of the test database for a valid performance test is currently set at 100000GB (i.e., SF = 100,000).
The TPC recognizes that additional benchmark development work is necessary to allow TPC-H to scale beyond that limit.

4.1.3.2 Test sponsors must choose the database size they want to execute against by selecting a size and corresponding scale factor from the defined series.

4.1.3.3 The ratio of total data storage to database size, r, must be computed by dividing the total durable data storage of the priced configuration (expressed in GB) by the size chosen for the test database as defined in the scale factor used for the test database.
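The computation in Clause 4.1.3.3 can be sketched as follows, including the rounding to the nearest 0.01 that the clause requires for the reported value. This is a minimal Python sketch, not part of the specification. It assumes the durable storage is known in bytes and that GB means 2^30 bytes as in Clause 4.1.3.1. The function name storage_ratio and its parameters are hypothetical, and the half-up tie-breaking is an assumption, since the clause only states v = round(r, 2).

```python
from decimal import Decimal, ROUND_HALF_UP

BYTES_PER_GB = 2 ** 30  # GB as defined in Clause 4.1.3.1


def storage_ratio(durable_storage_bytes: int, scale_factor: int) -> Decimal:
    """Reported ratio v of total durable storage to test database size."""
    storage_gb = Decimal(durable_storage_bytes) / Decimal(BYTES_PER_GB)
    database_size_gb = Decimal(scale_factor)  # SF = 1 corresponds to 1GB
    r = storage_gb / database_size_gb
    # v is r rounded to the nearest 0.01; half-up tie-breaking is an assumption.
    return r.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, a priced configuration with 3500GB of durable storage and a test database at SF = 1000 gives a reported ratio of 3.50.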
The reported value for the ratio, v, must be rounded to the nearest 0.01. That is, v = round(r, 2). The ratio must be included in both the Full Disclosure Report and the Executive Summary.

4.2 DBGEN and Database Population

4.2.1 The DBGEN Program

4.2.1.1 The test database and the qualification database must be populated with data that meets the requirements of Clause 4.2.2 and Clause 4.2.3. DBGen is a TPC provided software package that must be used to produce the data used to populate the database.

4.2.1.2 The data generated by DBGen are meant to be compliant with the specification as per Clause 4.2.2 and Clause 4.2.3. In case of differences between the content of these two clauses and the data generated by DBGen, the specification prevails.

4.2.1.3 The TPC Policies Clause 5.3.1 requires that the version of the specification and DBGen must match.
It is the test sponsor's responsibility to ensure the correct version of DBGen is used.

4.2.1.4 DBGen has been tested on a variety of platforms. Nonetheless, it is impossible to guarantee that DBGen is functionally correct in all aspects or will run correctly on all platforms. It is the Test Sponsor's responsibility to ensure the TPC provided software runs in compliance with the specification in their environment(s).

4.2.1.5 If a Test Sponsor must correct an error in DBGen in order to publish a Result, the following steps must be performed:

a. The error must be reported to the TPC administrator, following the method described in Clause 4.2.1.7, no later than the time when the Result is submitted.
b. The error and the modification (i.e., diff of source files) used to correct the error must be reported in the FDR as described in Clause 8.3.5.5.
c. The modification used to correct the error must be reviewed by a TPC-Certified Auditor as part of the audit process.

Furthermore, any consequences of the modification may be used as the basis for a non-compliance challenge.

4.2.2 Definition Of Terms

4.2.2.1 The term random means independently selected and uniformly distributed over the specified range of values.

4.2.2.2 The term unique within [x] represents any one value within a set of x values between 1 and x, unique within the scope of rows being populated.

4.2.2.3 The notation random value [x .. y] represents a random value between x and y inclusively, with a mean of (x+y)/2, and with the same number of digits of precision as shown. For example, [0.01 .. 100.00] has 10,000 unique values, whereas [1 .. 100] has only 100 unique values.

4.2.2.4 The notation random string [list_name] represents a string selected at random within the list of strings list_name as defined in Clause 4.2.2.13. Each string must be selected with equal probability.

4.2.2.5 The notation text appended with digit [text, x] represents a string generated by concatenating the sub-string text, the character "# ", and the sub-string representation of the number x.

4.2.2.6 This clause intentionally left blank.

4.2.2.7 The notation random v-string [min, max] represents a string comprised of randomly generated alphanumeric characters within a character set of at least 64 symbols.
The length of the string is a random value between min and max inclusive.

4.2.2.8 The term date represents a string of numeric characters separated by hyphens and comprised of a 4 digit year, 2 digit month and 2 digit day of the month.

4.2.2.9 The term phone number represents a string of numeric characters separated by hyphens and generated as follows:

Let i be an index into the list of strings Nations (i.e., ALGERIA is 0, ARGENTINA is 1, etc., see Clause 4.2.3),
Let country_code be the sub-string representation of the number (i + 10),
Let local_number1 be random [100 .. 999],
Let local_number2 be random [100 .. 999],
Let local_number3 be random [1000 .. 9999],

The phone number string is obtained by concatenating the following sub-strings:

country_code, "-", local_number1, "-", local_number2, "-", local_number3

4.2.2.10 The term text string[min, max] represents a substring of a 300 MB string populated according to the pseudo text grammar defined in Clause 4.2.2.14. The length of the substring is a random number between min and max inclusive. The substring offset is randomly chosen.

4.2.2.11 This clause intentionally left blank.

4.2.2.12 All dates must be computed using the following values:

STARTDATE = 1992-01-01
CURRENTDATE = 1995-06-17
ENDDATE = 1998-12-31

4.2.2.13 The following list of strings must be used to populate the database:

List name: Types

Each string is generated by the concatenation of a variable length syllable selected at random from each of the three following lists and separated by a single space (for a total of 150 combinations).

Syllable 1    Syllable 2    Syllable 3
STANDARD      ANODIZED      TIN
SMALL         BURNISHED     NICKEL
MEDIUM        PLATED        BRASS
LARGE         POLISHED      STEEL
ECONOMY       BRUSHED       COPPER
PROMO

List name: Containers

Each string is generated by the concatenation of a variable length syllable selected at random from each of the two following lists and separated by a single space (for a total of 40 combinations).

Syllable 1    Syllable 2
SM            CASE
LG            BOX
MED           BAG
JUMBO         JAR
WRAP          PKG
              PACK
              CAN
              DRUM

List name: Segments

AUTOMOBILE, BUILDING, FURNITURE, MACHINERY, HOUSEHOLD

List name: Priorities

1-URGENT, 2-HIGH, 3-MEDIUM, 4-NOT SPECIFIED, 5-LOW

List name: Instructions

DELIVER IN PERSON, COLLECT COD, NONE, TAKE BACK RETURN

List name: Modes

REG AIR, AIR, RAIL, SHIP, TRUCK, MAIL, FOB

List name: Nouns

foxes, ideas, theodolites, pinto beans, instructions, dependencies, excuses, platelets, asymptotes, courts, dolphins, multipliers, sauternes, warthogs, frets, dinos, attainments, somas, Tiresias', patterns, forges, braids, hockey players, frays, warhorses, dugouts, notornis, epitaphs, pearls, tithes, waters, orbits, gifts, sheaves, depths, sentiments, decoys, realms, pains, grouches, escapades

List name: Verbs

sleep, wake, are, cajole, haggle, nag, use, boost, affix, detect, integrate, maintain, nod, was, lose, sublate, solve, thrash, promise, engage, hinder, print, x-ray, breach, eat, grow, impress, mold, poach, serve, run, dazzle, snooze, doze, unwind, kindle, play, hang, believe, doubt

List name: Adjectives

furious, sly, careful, blithe, quick, fluffy, slow, quiet, ruthless, thin, close, dogged, daring, brave, stealthy, permanent, enticing, idle, busy, regular, final, ironic, even, bold, silent

List name: Adverbs

sometimes, always, never, furiously, slyly, carefully, blithely, quickly, fluffily, slowly, quietly, ruthlessly, thinly, closely, doggedly, daringly, bravely, stealthily, permanently, enticingly, idly, busily, regularly, finally, ironically, evenly, boldly, silently

List name: Prepositions

about, above, according to, across, after, against, along, alongside of, among, around, at, atop, before, behind, beneath, beside, besides, between, beyond, by, despite, during, except, for, from, in place of, inside, instead of, into, near, of, on, outside, over, past, since, through, throughout, to, toward, under, until, up, upon, without, with, within

List name: Auxiliaries

do, may, might, shall, will, would, can, could, should, ought to, must, will have to, shall have to, could have to, should have to, must have to, need to, try to

List name: Terminators

.   ;   !   --   :   ?

4.2.2.14 Pseudo text used in the data population (see Clause 4.2.2.10) must conform to the following grammar:

text: <sentence>
    | <text> <sentence>
    ;

sentence: <noun phrase> <verb phrase> <terminator>
    | <noun phrase> <verb phrase> <prepositional phrase> <terminator>
    | <noun phrase> <verb phrase> <noun phrase> <terminator>
    | <noun phrase> <prepositional phrase> <verb phrase> <noun phrase> <terminator>
    | <noun phrase> <prepositional phrase> <verb phrase> <prepositional phrase> <terminator>
    ;

noun phrase: <noun>
    | <adjective> <noun>
    | <adjective>, <adjective> <noun>
    | <adverb> <adjective> <noun>
    ;

verb phrase: <verb>
    | <auxiliary> <verb>
    | <verb> <adverb>
    | <auxiliary> <verb> <adverb>
    ;

prepositional phrase: <preposition> the <noun phrase>
    ;

noun: selected from Nouns (as defined in Clause 4.2.2.13)
verb: selected from Verbs (as defined in Clause 4.2.2.13)
adjective: selected from Adjectives (as defined in Clause 4.2.2.13)
adverb: selected from Adverbs (as defined in Clause 4.2.2.13)
preposition: selected from Prepositions (as defined in Clause 4.2.2.13)
terminator: selected from Terminators (as defined in Clause 4.2.2.13)
auxiliary: selected from Auxiliaries (as defined in Clause 4.2.2.13)

4.2.2.15 The grammar defined in Clause 4.2.2.14 relies on the weighted, non-uniform distribution of its constituent distributions (nouns, verbs, auxiliaries, etc.).

4.2.3 Test Database Data Generation

The data generated by DBGEN (see Clause 4.2.1) must be used to populate the database as follows (where SF is the scale factor, see Clause 4.1.3.1):

SF * 10,000 rows in the SUPPLIER table with:
S_SUPPKEY unique within [SF * 10,000].
S_NAME text appended with minimum 9 digits with leading zeros ["Supplier#", S_SUPPKEY].
S_ADDRESS random v-string[10,40].
S_NATIONKEY random value [0 ..