The model shown in Figure 3.11 specifies three general cases: one successful case, in which all stakeholders accept the proposals (17 events), and two rejection cases (12 and 14 events). Other cases lead to deadlocks. The number of actual case variants in the log is larger because of choices and the interleaving of concurrent events. For example, functional requirements can be proposed either before or after non-functional requirements.

Figure 3.18: Distribution of traces

Figure 3.18 shows the distribution of the number of traces by case variant.
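A distribution of this kind can be obtained by grouping traces that share the same activity sequence. The following is a minimal sketch, assuming a log is simply a list of traces and each trace is a list of activity names; the function name `variant_distribution` and the toy log are hypothetical and are not part of the presented tool.

```python
from collections import Counter


def variant_distribution(log):
    """Count how many traces share each activity sequence (case variant)."""
    counts = Counter(tuple(trace) for trace in log)
    # Most frequent variants first, as in a ranked bar chart.
    return counts.most_common()


# Hypothetical toy log: each trace is a list of activity names.
toy_log = [
    ["a", "b", "d"],
    ["a", "c", "d"],
    ["a", "b", "d"],
    ["a", "b", "c", "d"],
]

distribution = variant_distribution(toy_log)
for rank, (variant, count) in enumerate(distribution, start=1):
    print(rank, count, " -> ".join(variant))
```

Ranking the variants by frequency gives the case numbering used on the horizontal axis of such a chart.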
The cases fall into three groups. Approximately 700 traces belong to the four most popular variants (1, 2, 3, and 4). Variants 5, 6, and 7 form a second group of less popular cases. The other variants (case types 8 to 16) are even less frequent. Short traces are the most frequent, because the longer a trace is, the more variations it allows. Thus, the structure of a model together with the generation settings influences the character of the distribution. One can specify preferences that lead to a single-case event log by disabling all branches except the desired ones. The distribution shown in Figure 3.18 is typical for generation with random choices on all gateways.

Table 3.3 contains a one-trace fragment from the generated event log.

Trace  Event  Activity Name                              Group      Role       Timestamp                      Data Variables
21     1      New Requirement Formulated                 Customer   -          2016-02-24T22:02:24.455+03:00  [-;"Functional Reqs";-;-;-]
21     2      Wait for Proposal                          Customer   -          2016-02-24T23:08:16.455+03:00  [-;"Functional Reqs";-;-;-]
21     3      New Requirements Request                   Arch.Team  Analyst    2016-02-25T00:01:00.455+03:00  [-;"Functional Reqs";-;-;-]
21     4      Propose Changes of Internal Non-Func Reqs  Arch.Team  Analyst    2016-02-25T01:49:28.455+03:00  [-;"Functional Reqs";-;-;-]
21     5      Propose Changes of Internal Func Reqs      Arch.Team  Analyst    2016-02-25T01:55:24.455+03:00  [-;"Functional Reqs";-;-;-]
21     6      Adjust Requirement Changes                 Arch.Team  Analyst    2016-02-25T02:58:31.455+03:00  [-;"Functional Reqs";-;-;-]
21     7      Check Requirements Changes                 Arch.Team  Architect  2016-02-25T04:45:48.455+03:00  [-;"Functional Reqs";-;-;-]
21     8      Propose Architectural Changes              Arch.Team  Architect  2016-02-25T05:48:38.455+03:00  [-;"Functional Reqs";203.37;-;-]
21     9      Calculate Cost                             Arch.Team  Analyst    2016-02-25T07:01:46.455+03:00  [-;"Func.Reqs";203.37;426.75;-]
21     10     Send Proposal                              Arch.Team  Analyst    2016-02-25T08:04:16.455+03:00  [-;"Func.Reqs";203.37;426.75;-]
21     11     Receive Proposal                           Customer   -          2016-02-25T09:33:47.455+03:00  [14.1;"Func.Reqs";203.37;426.75;-]
21     12     Reject Proposal                            Customer   -          2016-02-25T11:25:35.455+03:00  [14.1;"Func.Reqs";203.37;426.75;-]
21     13     Analyze Results                            Customer   -          2016-02-25T12:28:49.455+03:00  [14.1;"Func.Reqs";203.37;426.75;-]
21     14     Proposal Rejected                          Arch.Team  Analyst    2016-02-25T12:46:36.455+03:00  [14.1;"Func.Reqs";203.37;426.75;-]

Table 3.3: One-trace fragment of the example generated event log

To test the robustness and effectiveness of the approach, the presented tool has been tested on models selected from the Signavio BPMN Reference Models Collection¹².
This collection contains more than 4700 real-life process models created by experts. Signavio collected these models from many application areas, and they are commonly used to test analysis algorithms in the BPM field.

Approximately 3000 of the models in the collection satisfy the restrictions of the formal framework described in Section 3.3. The most popular construct that our framework does not support is the message event. In general, these events are a kind of syntactic sugar: they add no essential aspects to the modelling. Another unsupported element type is the timer event. However, we plan to support these elements in the next version of our tool.

Model ID          Elements   Activities / Gateways   Pools /   Total Events   Trace Length
                  (Events)   (XOR+AND) / Flows       Lanes     (Classes)      Min   Mean   Max
8914774_rev1      11 (2)     5 / 4+0 / 12            0 / 0     2511 (5)       2     3      3
100193728_rev3    16 (2)     10 / 2+2 / 19           1 / 4     30930 (10)     21    31     64
1073989552_rev1   17 (4)     9 / 2+2 / 17            0 / 0     5072 (9)       3     5      7
1754711371_rev1   18 (2)     10 / 4+2 / 20           1 / 2     5905 (10)      4     6      9
186202353_rev1    30 (2)     16 / 4+8 / 35           0 / 0     18024 (16)     15    18     57
1280191109_rev7   28 (3)     22 / 3+0 / 30           0 / 0     896988 (22)    890   897    904
604624904_rev1    27 (2)     17 / 0+8 / 15           1 / 4     17000 (17)     17    17     17
1391700380_rev1   18 (2)     10 / 4+2 / 20           0 / 0     5962 (10)      4     6      9
1849720729_rev3   12 (4)     6 / 2+0 / 11            1 / 3     3505 (6)       3     4      4
2012957934_rev3   21 (3)     13 / 2+3 / 23           0 / 0     7228 (13)      1     7      35

Trace Variants column (the ten values are run together in the source and cannot be separated reliably): 31000566571000366328

Table 3.4: Simulation statistics for 10 different models selected from Signavio collection

The event log generator works correctly and robustly on 956 of the models that satisfy these restrictions.
For the other models, the tool fails to return traces after 1000 attempts. These models cannot terminate because of deadlocks (process executions leading to states in which no node is enabled) or livelocks (executions that inevitably end up in a loop of repeating tasks with no possibility of reaching the end state). For each model, the test script generated an event log of 500 traces using our tool. The whole test procedure (for all models from the Signavio collection) took 646065 milliseconds (approximately 11 minutes) on a typical desktop computer.

¹² Signavio models collection: http://www.signavio.com/reference-models/
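The retry behaviour described above can be illustrated with a small simulation loop. This is only a sketch under strong assumptions and not the tool's implementation: the model is reduced to a directed graph without concurrency, a livelock is approximated by exceeding a step bound (`max_steps`, a hypothetical parameter), and the attempt limit is applied per trace. All function and parameter names are illustrative.

```python
import random


def simulate_trace(successors, start, end, max_steps, rng):
    """Random walk from start; returns (outcome, trace)."""
    node, trace = start, [start]
    for _ in range(max_steps):
        if node == end:
            return "completed", trace
        options = successors.get(node, [])
        if not options:
            # No node is enabled and the end state was not reached.
            return "deadlock", trace
        node = rng.choice(options)
        trace.append(node)
    # Assumption: running out of steps is treated as a livelock.
    return ("completed", trace) if node == end else ("livelock", trace)


def generate_log(successors, start, end, n_traces,
                 max_attempts=1000, max_steps=100, seed=0):
    """Collect n_traces completed traces, or give up after max_attempts failures."""
    rng = random.Random(seed)
    stats = {"deadlock": 0, "livelock": 0}
    log = []
    for _ in range(n_traces):
        for _ in range(max_attempts):
            outcome, trace = simulate_trace(successors, start, end, max_steps, rng)
            if outcome == "completed":
                log.append(trace)
                break
            stats[outcome] += 1
        else:
            # No attempt completed: the model is reported as failing.
            return None, stats
    return log, stats


# Hypothetical model: "b" is a dead end, "c" loops forever, "a" leads to the end.
model = {"s": ["a", "b", "c"], "a": ["e"], "b": [], "c": ["c"]}
log, stats = generate_log(model, "s", "e", n_traces=20)
print(len(log), stats)
```

Counting the failed attempts separately, as `stats` does here, corresponds to the deadlock and livelock totals reported for the whole test run.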
The total number of generated traces is 478000. During generation, 10520 deadlocks and 198663 livelocks were identified. Our tool successfully simulated the models from the Signavio collection that satisfy the restrictions of the formal framework and contain no deadlocks or livelocks.

Figure 3.19: Model 186202353_rev1 from Signavio BPMN reference models collection

Table 3.4 shows generation statistics for 10 different models that were randomly selected from the Signavio collection. 1000 traces were generated for each model.
The reader can see the relationship between the model structure (number of activities, gateways, and lanes) and the variability of the characteristics of the generated event logs.

Figure 3.20: Event log generated for the model 186202353_rev1

Figure 3.19 shows the 186202353_rev1 model.
This is a typical model for the collection. Note that most models from the Signavio collection are smaller than the one shown in Figure 3.11. For a hypothetical collection of large and intricate models, the simulation time would be much longer. Figure 3.20 shows the view with the characteristics of the event log that was generated for this model.

3.4 Conclusions

This chapter presented an approach to event log generation and the tool that implements it. In particular, Section 3.2 describes the generator for Petri net models, whereas Section 3.3 describes an extension of this tool for BPMN 2.0 models.

The main users of the approach considered in Section 3.2 are process miners and developers of new process discovery and analysis algorithms who need a sample-generation tool. The main features of the presented plug-in are the following:

(1) A user can easily generate a set of event logs with additional noise.
(2) Generation settings allow users to decide how many event logs will be generated and how many traces these logs will include. To prevent non-terminating loops, the user is asked for a maximum number of steps during algorithm execution. All event logs are generated within a single execution of the plug-in. By default, the tool generates 5 event logs, each consisting of 10 traces, and performs at most 100 steps.
(3) When several outputs from one place are available, the tool allows flexible modification of the simulated behaviour, so that the model describes real-world processes more accurately.
(4) The start and the termination of a transition can be recorded as separate events in the log. In that case, users can define the execution time of every transition and how punctually it is executed by specifying deviation bounds.
(5) The tool can create both event logs that completely fit the given model and logs with added noise.

The approach has been implemented as a software plug-in for the ProM 6 Framework [26].
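To make features (2), (3), and (5) concrete, the following minimal sketch plays the token game on a toy Petri net. It is not the ProM plug-in's code. The net, the function names, the weighted-choice preference mechanism, and the noise model (dropping one random event) are all assumptions made for illustration; only the defaults of 5 logs, 10 traces, and 100 steps come from the text.

```python
import random
from collections import Counter

# Hypothetical toy net: transition -> (input places, output places).
TOY_NET = {
    "register": (["start"], ["p1"]),
    "approve": (["p1"], ["end"]),
    "reject": (["p1"], ["end"]),
}


def enabled(net, marking):
    """Transitions whose input places all hold enough tokens."""
    return [
        t for t, (inputs, _) in net.items()
        if all(marking[p] >= n for p, n in Counter(inputs).items())
    ]


def generate_trace(net, initial, final, max_steps, rng, weights=None):
    """Play the token game until the final marking, a dead end, or max_steps."""
    weights = weights or {}
    marking, trace = Counter(initial), []
    for _ in range(max_steps):
        candidates = enabled(net, marking)
        if +marking == Counter(final) or not candidates:
            break
        # Assumed preference mechanism: weighted random choice.
        w = [weights.get(c, 1) for c in candidates]
        t = rng.choices(candidates, weights=w)[0]
        inputs, outputs = net[t]
        marking.subtract(inputs)  # consume tokens
        marking.update(outputs)   # produce tokens
        trace.append(t)
    return trace


def generate_logs(net, initial, final, n_logs=5, n_traces=10,
                  max_steps=100, noise=0.0, weights=None, seed=0):
    """Generate n_logs logs of n_traces traces each in a single run."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n_logs):
        log = []
        for _ in range(n_traces):
            trace = generate_trace(net, initial, final, max_steps, rng, weights)
            # Assumed noise model: drop one random event with probability `noise`.
            if trace and rng.random() < noise:
                del trace[rng.randrange(len(trace))]
            log.append(trace)
        logs.append(log)
    return logs


logs = generate_logs(TOY_NET, ["start"], ["end"])
print(len(logs), len(logs[0]), logs[0][0])
```

Passing `weights={"approve": 1, "reject": 0}` disables the rejection branch in this sketch, and `noise=0.0` yields logs that fit the net exactly.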