Note that this is the first case from the event log discussed above. This example contains no timestamps or resources. However, these and other attributes can be stored in XES files using so-called extensions. A file in XES format begins with the list of extensions applied to the event log. For example, the Lifecycle and Concept extensions are used in the log from Figure 1.3.

³ Official page: http://www.win.tue.nl/ieeetfpm/doku.php?id=start
⁴ MXML web-page: http://www.processmining.org/logs/mxml
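To make the extension list more concrete, the following minimal sketch reads the extension declarations at the top of an XES document and collects the attribute prefixes that actually occur in it. The XES fragment is hand-written for illustration and is not the log from Figure 1.3. The helper names `list_extensions` and `attribute_prefixes` are hypothetical, and the sketch assumes a file without an XML namespace declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written XES fragment (NOT the log from Figure 1.3).
XES_SAMPLE = """<log xes.version="1.0">
  <extension name="Concept" prefix="concept"
             uri="http://www.xes-standard.org/concept.xesext"/>
  <extension name="Lifecycle" prefix="lifecycle"
             uri="http://www.xes-standard.org/lifecycle.xesext"/>
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="register request"/>
      <string key="lifecycle:transition" value="complete"/>
    </event>
  </trace>
</log>
"""


def list_extensions(xes_text):
    """Return (name, prefix) pairs of the extensions declared in the log header."""
    root = ET.fromstring(xes_text)
    # Extension declarations are direct children of the <log> element.
    return [(ext.get("name"), ext.get("prefix")) for ext in root.findall("extension")]


def attribute_prefixes(xes_text):
    """Return the set of prefixes used by string attributes of traces and events."""
    root = ET.fromstring(xes_text)
    prefixes = set()
    for attr in root.iter("string"):
        key = attr.get("key", "")
        if ":" in key:
            prefixes.add(key.split(":", 1)[0])
    return prefixes


if __name__ == "__main__":
    print(list_extensions(XES_SAMPLE))
    print(sorted(attribute_prefixes(XES_SAMPLE)))
```

In this fragment every attribute key such as `concept:name` carries the prefix of one of the declared extensions, which is how an attribute is tied to the extension that defines it.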
Each extension defines possible attributes of events and traces in a log.

Real-life event logs can be rather large. For example, event logs generated by real information systems, which are used for the tasks of the Business Process Intelligence Challenge⁵, range in size from tens of MBs to tens of GBs⁶.
Usually, such event logs are stored in databases of different designs, since logs of this size cannot be processed efficiently in text-based formats.

1.2 Overview of Process Mining Techniques

In the remainder of this section we give an overview of the main process mining techniques. The next section discusses process model repair, which is an important problem in process mining.

Process mining is a fast-growing research area at the intersection of formal methods and data science [5]. Other related fields are business process management [33, 34], information system architecture and design [1, 35, 36], and formal software design and dynamic analysis [37–40].
The term was coined in the early 2000s by Wil van der Aalst⁷ of Technische Universiteit Eindhoven⁸. At the time of writing, the most detailed and comprehensive description of process mining techniques can be found in the second edition of the Process Mining book [5].

Figure 1.4: Process mining (diagram labels: Event Log, Model, Discovery, Conformance, Enhancement)

Process mining provides methods for the analysis of processes in information systems using event logs that are recorded during process execution. Figure 1.4 shows the three main classes of process mining algorithms: process discovery, conformance checking, and process enhancement.

⁵ BPIC page: https://www.win.tue.nl/bpi/doku.php?id=2017:challenge
⁶ See, for example, the following real-life event logs: https://doi.org/10.4121/uuid:5f3067df-f10b-45da-b98b-86ae4c7a310b or https://doi.org/10.4121/uuid:360795c8-1dd6-4a5b-a443-185001076eab
⁷ A related approach was also proposed by J. Cook and A. Wolf [41]. They formulated the idea of discovering a program model from sequences of symbols that represent software-generated events.
⁸ Prof. van der Aalst now holds the Chair of Process and Data Science (PADS) at RWTH Aachen University: http://pads.rwth-aachen.de/

Many algorithms have been introduced for discovering process models from event logs. A process discovery algorithm automatically constructs a process model based on a given event log with recorded process behaviour.
In contrast to manually constructed models, discovered models represent observed process behaviour rather than human experts' assumptions about the modelled process. A discovered model can then be analysed or verified. The strength of process mining is that it provides researchers with automated techniques to reveal causal dependencies between process activities. Section 1.2.1 describes process discovery in detail.

Algorithms for checking conformance between event logs and process models provide the user with techniques to measure how well a model conforms to the observed behaviour and to detect process bottlenecks and inconsistencies. Depending on the algorithm used, the conformance between a model and an event log can be measured as a numerical value or represented as a complex conformance report.
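The two kinds of output mentioned above, a single number and a report, can be illustrated with a deliberately naive sketch. It assumes the model is given as a small deterministic transition system (a simplification chosen here for brevity, not a formalism taken from the text) and measures the fraction of traces the model accepts, together with the position of the first deviation in each non-fitting trace. The names `replay` and `check_conformance`, the model and the log are all hypothetical. Established techniques such as token replay or alignments are considerably more refined.

```python
def replay(model, trace):
    """Return None if the model accepts the trace, else the index of the first deviation."""
    state = model["initial"]
    for i, activity in enumerate(trace):
        next_state = model["transitions"].get(state, {}).get(activity)
        if next_state is None:
            return i  # the activity is not allowed in the current state
        state = next_state
    # A trace that stops before a final state deviates at its end.
    return None if state in model["final"] else len(trace)


def check_conformance(model, log):
    """Return (fraction of fitting traces, {deviating trace: deviation index})."""
    if not log:
        raise ValueError("the event log is empty")
    deviations = {}
    fitting = 0
    for trace in log:
        position = replay(model, trace)
        if position is None:
            fitting += 1
        else:
            deviations[tuple(trace)] = position
    return fitting / len(log), deviations


# Hypothetical model: "a", then either "b" or "c", then "d".
MODEL = {
    "initial": "s0",
    "final": {"s3"},
    "transitions": {
        "s0": {"a": "s1"},
        "s1": {"b": "s2", "c": "s2"},
        "s2": {"d": "s3"},
    },
}

# Hypothetical event log with two fitting and two deviating traces.
LOG = [["a", "b", "d"], ["a", "c", "d"], ["a", "d"], ["a", "b"]]

if __name__ == "__main__":
    fitness, report = check_conformance(MODEL, LOG)
    print(fitness)  # 0.5
    print(report)   # {('a', 'd'): 1, ('a', 'b'): 2}
```

The numerical value summarises the log as a whole, while the report points to where individual traces leave the modelled behaviour.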
We describe commonly used conformance checking techniques and the main conformance criteria in Section 1.2.2.

The third main sub-field of process mining is called process enhancement or extension. Usually, the process owner's main goal is to improve the performance and value of processes based on evidence obtained during the analysis. Process enhancement techniques combine information from an event log with experts' domain knowledge and heuristics, additional process data, and process diagnostic data to achieve better conformance or performance of a process model.

Note that there is a substantial difference between process enhancement and process model enhancement. To enhance processes, business process management (BPM) approaches and methodologies are employed with the help of automated tools which support performance analysis and incremental process re-design.
Process model enhancement algorithms improve formal process models of various types according to specified quality criteria. These are methods for model simplification, repair, adaptation, extension, and transformation. Usually, the results of conformance evaluation and additional data attributes of event logs serve as input for these methods.

1.2.1 Process Discovery

Process discovery is an important part of process mining. A discovery algorithm automatically synthesizes a process model based on an event log (see Figure 1.5). Various discovery algorithms have been developed [23, 24, 42–50]; they are based on different principles and target many modelling formalisms.

Figure 1.5: Process discovery (diagram labels: Event Log, Process Discovery, Model)

The α-algorithm is one of the first and simplest discovery algorithms [42]. This algorithm automatically constructs a workflow net corresponding to a given event log.
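The starting point of the α-algorithm is a set of ordering relations between activities, derived from which activities directly follow each other in the log. The sketch below computes only these relations, not the workflow net itself. The sample log is hypothetical, since the event log (1.1) is not reproduced in this part of the text. The function names and the ASCII symbols `->`, `<-`, `||` and `#`, standing for causality, reversed causality, parallelism and no relation, are illustrative choices.

```python
from itertools import product


def directly_follows(log):
    """Return all pairs (a, b) such that b immediately follows a in some trace."""
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}


def footprint(log):
    """Classify every ordered pair of activities by its ordering relation."""
    follows = directly_follows(log)
    activities = sorted({a for t in log for a in t})
    relations = {}
    for a, b in product(activities, repeat=2):
        ab, ba = (a, b) in follows, (b, a) in follows
        if ab and not ba:
            relations[(a, b)] = "->"   # a causes b
        elif ba and not ab:
            relations[(a, b)] = "<-"   # b causes a
        elif ab and ba:
            relations[(a, b)] = "||"   # observed in both orders: parallel
        else:
            relations[(a, b)] = "#"    # never directly adjacent
    return relations


# Hypothetical event log; each trace is a sequence of activity names.
SAMPLE_LOG = [
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "e", "d"),
]

if __name__ == "__main__":
    rel = footprint(SAMPLE_LOG)
    for pair in [("a", "b"), ("b", "c"), ("b", "e"), ("d", "b")]:
        print(pair, rel[pair])
```

Note that the relations depend only on whether a pair was observed at all, not on how often, so a single noisy trace can change the result.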
The α-algorithm is based on finding basic causal relations between events. This algorithm has been developed by W. van der Aalst and B. van Dongen [51]. Basic sequential and concurrent workflow patterns can be modelled automatically. However, there are complex workflow patterns which cannot be synthesized by the α-algorithm. Besides, the algorithm ignores the frequencies of different behaviour types. Thus, the method is not often used in practice. Nevertheless, more sophisticated algorithms build on the basic ideas of this method. Figure 1.6 shows a model discovered from the event log (1.1) using the α-algorithm.

Figure 1.6: Model discovered using the α-algorithm

J. Carmona, J. Cortadella, M. Kishinevsky et al. [52–55] have developed the fundamental technique for Petri net discovery based on the theory of regions [56, 57]. These algorithms synthesise a Petri net from an intermediate transition system which, in turn, can be constructed from an event log. It is difficult to strike a balance between underfitting and overfitting models when region-based techniques are used [58]. When applied to noisy and incomplete event logs, region-based techniques often synthesise complex and overfitted Petri nets. In addition, M. Solé and J. Carmona have developed region-based techniques which are applicable in real-life cases [59–62].

The ILP miner reduces discovery to an integer linear programming problem [24, 51].
Its improved version, which is based on the theory of regions, has been proposed recently by S. van Zelst et al. [63]. Figure 1.7 shows a model discovered from the event log (1.1) using the ILP miner. Note that this model is not a workflow net. Nevertheless, it is a Petri net that is able to replay all traces of the event log (1.1).

Figure 1.7: Model discovered using ILP miner

C. Günther and W. van der Aalst proposed the Fuzzy miner algorithm [44].
This algorithm constructs process models that are based on the map metaphor. A user of the algorithm can abstract and refine a discovered model in a seamless manner. A fuzzy model can be zoomed in and out to show a more or less detailed process description. Infrequent activities or flow relations can be hidden.
Many commercial process mining tools are based on this algorithm. Figure 1.8 shows a model discovered using the fuzzy algorithm from the event log (1.1).

Figure 1.8: Model discovered using Fuzzy miner

Figure 1.9: Model discovered using Heuristics miner

Another widely used discovery algorithm is the Heuristics miner [45, 64, 65]. This algorithm uses ideas similar to those of the fuzzy algorithm. Both algorithms take frequencies of events and causal relationships between them into account. Figure 1.9 shows a model discovered using the Heuristics miner from the same event log.

Note that these two algorithms use specific formalisms to represent process models: fuzzy maps and heuristics nets [65].
Fortunately, heuristics nets can be easily transformed into Petri nets. For example, Figure 1.10 shows a Petri net corresponding to the heuristics net depicted in Figure 1.9. Again, this model is not a workflow net.

Figure 1.10: Model discovered using Heuristics miner (see Figure 1.9) and transformed into a Petri net

Inductive mining is a relatively new discovery method [23, 47, 66]. It aims at discovering block-structured process models, which can be represented as process trees, workflow nets, or structured BPMN models.
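As an illustration of what a block-structured model looks like, the following sketch represents a process tree as a small data structure and enumerates the traces it allows. It is not the inductive mining algorithm itself. The class names, the operator names `seq`, `xor` and `and` (sequence, exclusive choice, parallel composition) and the example tree are assumptions made for this sketch. The loop operator commonly found in process trees is left out because it yields infinitely many traces.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Activity:
    label: str


@dataclass(frozen=True)
class Node:
    operator: str                    # "seq", "xor", or "and" (names assumed here)
    children: Tuple["Tree", ...]


Tree = Union[Activity, Node]


def interleavings(x, y):
    """All order-preserving merges of two traces (tuples of labels)."""
    if not x:
        return {y}
    if not y:
        return {x}
    return ({(x[0],) + rest for rest in interleavings(x[1:], y)}
            | {(y[0],) + rest for rest in interleavings(x, y[1:])})


def language(tree):
    """Return the set of traces allowed by a loop-free process tree."""
    if isinstance(tree, Activity):
        return {(tree.label,)}
    if tree.operator not in ("seq", "xor", "and"):
        raise ValueError(f"unknown operator: {tree.operator}")
    child_languages = [language(child) for child in tree.children]
    if tree.operator == "xor":
        return set().union(*child_languages)      # exactly one child is executed
    result = {()}
    for lang in child_languages:
        if tree.operator == "seq":
            result = {p + t for p in result for t in lang}   # concatenate in order
        else:
            result = {m for p in result for t in lang for m in interleavings(p, t)}
    return result


# Hypothetical tree: a, then either (b and c in any order) or e, then d.
EXAMPLE_TREE = Node("seq", (
    Activity("a"),
    Node("xor", (
        Node("and", (Activity("b"), Activity("c"))),
        Activity("e"),
    )),
    Activity("d"),
))

if __name__ == "__main__":
    for trace in sorted(language(EXAMPLE_TREE)):
        print(trace)
```

Because every operator node is a self-contained block with a single entry and a single exit, a tree of this kind can be translated into a workflow net or a structured BPMN model block by block.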