Диссертация (1137084), страница 3
Текст из файла (страница 3)
Spring/Summer Young Researchers’ Colloquium on Software Engineering 2015 (SYRCoSE2015). Samara, Povolzhskiy State University of Telecommunications and Informatics, 2830.05.2015. Talk: Iskra: A Tool for Process Model Repair.7. Special Seminar «Процессно-ориентированные информационные системы» (“ProcessAware Information Systems”). Moscow Oblast’, Voronovo, 28-29.11.2015. Talk: Об исправлении моделей процессов (On Process Model Repair).8. Seminar of PAIS Laboratory. Moscow, NRU HSE Faculty of Computer Science, 12.10.2015.Talk: Плохие и хорошие модели бизнес-процессов (Bad and Good Business ProcessModels).9.
Spring/Summer Young Researchers’ Colloquium on Software Engineering 2014 (SYRCoSE2014). St. Petersburg, Peter the Great St. Petersburg Polytechnic University, 29-31.05.2014.Talk: Generation of a Set of Event Logs with Noise.Thesis OutlineThe thesis consists of the four main chapters, the introduction, and the concluding chapter.Introduction discusses the relevance of thesis subject, informally presents its main problem,contains necessary formal sections, and shortly outlines the thesis content.10The first chapter describes the background of this thesis.Firstly, Petri nets are defined. This is the language that is used in order to model processes inthis thesis. Petri net is a directed bipartite graph with two node types.
Transitions model activitiesof a process, and places are used to specify the order of these activities. In particular, it is possibleto model all basic process patterns: sequence, choice, parallel execution, and loops. These nodesare connected with arcs.The current state of a Petri net is denoted using a so-called marking. A marking is a distributionof tokens through net places. A transition of a Petri net may fire if there are enough tokens inall of its input places.
Firing of a transition consumes tokens from its input places and producestokens to its output places. Thus, a transition firing changes the marking of the net.In this thesis, workflow nets are used. They form a special class of Petri nets. A workflow nethas distinguished initial (source) and final (sink) places. The initial marking of a workflow netis a single token in the source, whereas in the final marking all places are clear except the sinkone. All possible runs in such a net start from the initial marking and end in the final marking.Figure 3 shows an example of a workflow net. A black token in the source place denotes the initialmarking.ht8asourcet1bc1t2c2cft3t6dec3t5t4c4gsinkt7Figure 3: Simple workflow netSecondly, event logs are defined.
Let ⊆ be a set of process activities. They are used tolabel model transitions. We assume that an event in a log is a name of an activity, i.e. additionalevent attributes are not used. Thus, an event represents an activity in a trace of an event log.We define a trace as a finite sequence of activities from , i.e. ∈ ∗ . An event log (or log) is a finite multi-set of traces, i.e. ∈ ℬ(∗ ). For example, = (︀∐︀,,,,̃︀3 , ∐︀,,,,̃︀5 , ∐︀,,,,̃︀⌋︀is an event log with the same activities as in the model shown in Figure 3.Thirdly, process mining and its three sub-fields are described.
Process mining is a researcharea at the intersection of formal methods and data science [5]. It provides methods for analysis ofprocesses in information systems based on event logs, which are recorded during their lifetime. Thethree main sub-fields of process mining are process discovery, conformance checking, and processenhancement. Algorithms from each of these sub-fields are employed in this thesis.Process discovery algorithms are able to synthesize a process model automatically from a givenevent log. Conformance checking aims at evaluating the model’s conformance to the observed11behaviour that is recorded in an event log.
Finally, both processes and process models can beenhanced. Process enhancement provides methods to improve real-life processes according tospecified performance criteria. Usually, these methods are less formal, and business-oriented.Process model enhancement methods and techniques to improve models are more formal. Theprocess model repair (or simply model repair ) belongs to this category of process mining methods[5].The four main numerical criteria of a process model are described in this chapter: fitness,precision, generalization, simplicity. These criteria are common in process mining [5].The first two are used for checking the conformance between a model and an event log. Fitnessshows to what extent log traces can be replayed by a model.
Precision determines how well themodel can generate behaviour that is not represented in the log. Generalization and simplicitycharacterize the properties of the model itself. Generalization shows the level of model abstraction,and simplicity shows its compactness.This thesis considers the problem of process model repair according to its fitness. Precisioncriterion is also employed, when repaired models are evaluated.
In this chapter, concrete methodsfor fitness and precision measurement are described in detail. In this thesis, an alignment-basedtechnique to check log-model conformance [22] is used, since this technique is the most conventionalin process mining at the moment.This chapter also describes the different process discovery algorithms [5]. Two of them areselected to be used in this thesis: Inductive miner [23] and ILP miner [24].
The main reason touse these algorithms with carefully specified settings is that both of them can guarantee perfectfitness of a discovered model.Fourthly, this chapter also reviews the model repair methods proposed earlier by otherresearches. Particular problem statements, solutions and their features are considered.The approach proposed in this thesis has its specific features. The closest approach has beenproposed by D. Fahland and W. van der Aalst [25].
However, we employ model decompositiontechniques for the subsequent replacement of non-conforming model fragments. Thus, a repairedmodel differs from the original only by these replaced (repaired) fragments. If discrepanciesbetween an event log and a process model are local, then the change will be insignificant.It is especially important to preserve a model structure in the cases when the original modelhas been built by an expert, and therefore this model is well perceived by a human being. Themodel, re-discovered from scratch using only an event log, can be deprived of such an advantage.Besides, the presented technique is best suited when a set of activities is stable during the repair.The second chapter presents the main contribution of this thesis: the modular technique forprocess model repair.In the first section, we present the specific problem considered in the thesis.12Let ⊆ be a set of process activities. The initial process model is a workflow net =(, , , ) with transition labels taken from .
The net may have silent transitions labelled with . Thus, the labelling function is ∶ → ∪ { }.An event log is a multi-set of traces, and is defined in Section 1.1.3. Event names are takenfrom the same set of activities , i.e. ∈ ℬ(∗ ). Note that the event names of the log andthe transition labels of the model are from the set , i.e. () = . This means, that thereare no new or obsolete activities in the process, and the repair technique does not add/removetransitions to/from the model, except -labelled transitions.Besides, it is assumed that all transitions of a model except silent transitions have uniquelabels. That is, for a model = (, , , ), ∀1 , 2 ∈ : if (1 ) = (2 ) then 1 = 2 .
Note that thisassumption is needed to decompose the process model. In the worst case, if all transitions havethe same label, such a model can not be decomposed using decomposition algorithms which areconsidered in this thesis.The event log represents the normative (correct) behaviour that should be modelled, whenthe model does not perfectly fit this event log, i.e. (, ) < 1.Given a net and an event log , the problem is to automatically construct a Petri net(repaired model ) such that (, ) = 1.
Strictly speaking, perfectly fits . In otherwords, this repaired model can replay all traces of . This is a strict condition of success thatneeds to be satisfied.Besides, there are two soft conditions, which are aspired for but may not be satisfied.Firstly, the precise repaired model is desired. should not allow “too many” executionvariants which are not present in .
At least, the repaired model should be as precise as theinitial model , which means ( ) ≈ ( ). In other words, both of these modelsare approximately equally precise. Ideally, ( ) = ( ), but small fluctuationsare possible.Secondly, models and should have a similar structure. The less we change a model byrepair, the better this repair algorithm is. That is why a number of model elements touched bya repair procedure reflects the acceptability of a repair in Section 4.3, in which an experimentalevaluation of proposed repair technique will be described.Then, this chapter presents the main contribution of this thesis.
Firstly, it presents the modularscheme of the proposed model repair technique. Secondly, algorithms are presented implementingthe general modular scheme for particular repair cases.The modular repair scheme proposed in this thesis employs the idea of patching-up. Thealgorithm searches non-conforming model fragments, and replaces them with conforming ones.This technique is modular. In other words, the general scheme is constructed of several buildingblocks. Various particular algorithms can play this role. Thus, the general scheme can be refinedinto different repair algorithms by choosing appropriate algorithms as its actual proceduralparameters.13Each building block of the modular repair scheme executes one repair step.