1 Introduction
2 Related Work
3 Bayesian Belief Network Construction
3.1 Bayesian Belief Network
Definition 1.
• Probability of evidence $P(X|e)$ can be used to answer questions such as “What is the chance of an insurance claim being declined for someone aged 20–30 years?”
• Most probable explanation $\mathit{MPE}(e)=\arg {\max _{x}}Pr(x,e)$ can be used to answer “What is the probability for the process to end now given the current state?”
• Maximum a posteriori hypothesis $\mathit{MAP}(e,M)=\arg {\max _{m}}Pr(m,e)$ can be used to answer “What is the most probable outcome of a claim check if the claimant is 23 years old and made the claim in Vilnius?”
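The three query types can be illustrated on a toy network by brute-force enumeration of the joint distribution. This is only a sketch: the variables (`age`, `check`, `outcome`), the chain structure and every probability below are hypothetical values chosen for illustration, not figures from the insurance log.

```python
from itertools import product

CHECKS = ("passed", "failed")
OUTCOMES = ("accepted", "declined")

# Hypothetical chain: age -> check -> outcome (all numbers are made up).
P_AGE = {"20-30": 0.4, "30+": 0.6}
P_CHECK = {"20-30": {"passed": 0.6, "failed": 0.4},
           "30+": {"passed": 0.8, "failed": 0.2}}
P_OUTCOME = {"passed": {"accepted": 0.9, "declined": 0.1},
             "failed": {"accepted": 0.2, "declined": 0.8}}


def joint(age, check, outcome):
    """Pr(age, check, outcome) from the chain factorisation."""
    return P_AGE[age] * P_CHECK[age][check] * P_OUTCOME[check][outcome]


def prob_outcome_given_age(outcome, age):
    """P(X | e): marginalise the hidden check, then normalise by Pr(e)."""
    numerator = sum(joint(age, c, outcome) for c in CHECKS)
    evidence = sum(joint(age, c, o) for c, o in product(CHECKS, OUTCOMES))
    return numerator / evidence


def mpe(age):
    """MPE(e): the most probable joint assignment of all unobserved variables."""
    return max(product(CHECKS, OUTCOMES), key=lambda co: joint(age, *co))


def map_outcome(age):
    """MAP(e, M) with M = {outcome}: maximise after summing out the check."""
    return max(OUTCOMES, key=lambda o: sum(joint(age, c, o) for c in CHECKS))
```

Enumeration is exponential in the number of variables, so it only serves to make the definitions concrete; a real network would use a dedicated inference algorithm.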
3.2 Event Log
Table 1
Trace ID | Event | Timestamp | Attribute resource | Attribute claimant | Attribute status |
1 | Incoming_claim | 2014.01.05 08:05 | | 1 | |
1 | Register_claim | 2014.01.05 08:30 | A | 1 | |
1 | End | 2014.01.05 13:57 | B | 1 | Reject |
2 | Incoming_claim | 2014.01.07 13:07 | | 2 | New client |
2 | Register_claim | 2014.01.07 13:37 | A | 2 | |
2 | Initiate_payment | 2014.01.10 11:15 | B | 2 | Payed |
2 | End | 2014.01.10 11:17 | B | 2 | Closed |
Definition 2.
• E is a finite set of events,
• I is a finite set of traces,
• N is a finite set of attribute names,
• V is a value space of attributes,
• $M\subseteq N\times V$ is a finite set of attributes,
• $\mu :E\to M$ is a function assigning attributes and their values to each event,
• $\alpha :E\to A$ is a function assigning each event to an activity,
• $\gamma :E\to TD$ is a function assigning each event to a timestamp,
• $\beta :E\to C$ is a surjective function assigning each event to a case,
• $\mathit{name}:E\to V$ is a function identifying the name of an event, with $\mathit{name}(e)=v$ where $v\in V$, $n\in N$, $(n,v)\in \mu (e)$ and $n=\text{``}\mathit{concept}:\mathit{name}\text{''}$,
• $>\subseteq E\times E$ is the direct succession relation, which imposes a direct ordering of the events in E,
• $\succ \subseteq {>^{+}}$ is the succession relation, which imposes a total ordering of the events in E.
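As a rough illustration of the definition, the rows of Table 1 can be held in a small Python structure. The class and field names below are assumptions made for this sketch; only the `resource` and `status` attributes are carried over, and attribute cells that are ambiguous in the table are left out. Ordering events by timestamp within a case stands in for the succession relation.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    case: str               # beta(e): the case the event belongs to
    activity: str           # alpha(e): the activity of the event
    timestamp: datetime     # gamma(e): the timestamp of the event
    attributes: tuple = ()  # mu(e): (name, value) pairs


def ts(text):
    """Parse the timestamp format used in Table 1."""
    return datetime.strptime(text, "%Y.%m.%d %H:%M")


# Rows of Table 1, deliberately out of order to show the sorting step.
LOG = [
    Event("2", "End", ts("2014.01.10 11:17"), (("resource", "B"), ("status", "Closed"))),
    Event("1", "Register_claim", ts("2014.01.05 08:30"), (("resource", "A"),)),
    Event("1", "Incoming_claim", ts("2014.01.05 08:05")),
    Event("2", "Initiate_payment", ts("2014.01.10 11:15"), (("resource", "B"), ("status", "Payed"))),
    Event("1", "End", ts("2014.01.05 13:57"), (("resource", "B"), ("status", "Reject"))),
    Event("2", "Incoming_claim", ts("2014.01.07 13:07")),
    Event("2", "Register_claim", ts("2014.01.07 13:37"), (("resource", "A"),)),
]


def traces(log):
    """Group events by case and order each group by timestamp."""
    by_case = defaultdict(list)
    for event in log:
        by_case[event.case].append(event)
    return {
        case: [e.activity for e in sorted(events, key=lambda e: e.timestamp)]
        for case, events in by_case.items()
    }
```

Calling `traces(LOG)` yields one ordered activity sequence per trace ID.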
3.3 State Transition System
Definition 3.
Definition 4.
Definition 5.
3.4 Event Log Transformation to Bayesian Belief Network
1. A state transition system ${T_{l}}$ is discovered from the event log l;
2. A DAG G is extracted from the state transition system ${T_{l}}$ by removing any state data, which leaves a directed acyclic graph;
3. Conditional probability tables (CPTs) Θ are constructed;
4. The DAG G and the CPTs Θ are combined into the Bayesian belief network $(G,\Theta )$.
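Steps 1 and 2 might look roughly as follows. This sketch rests on a strong assumption: a state is taken to be the full sequence of activities seen so far, one of several possible state abstractions, and not necessarily the one used in this paper. Under that choice the transitions form a prefix tree, so dropping the transition labels trivially leaves an acyclic graph. All function names are hypothetical.

```python
# Activity sequences of the two traces in Table 1.
TRACES = [
    ["Incoming_claim", "Register_claim", "End"],
    ["Incoming_claim", "Register_claim", "Initiate_payment", "End"],
]


def discover_transitions(traces):
    """Step 1 (assumed abstraction): state = tuple of activities seen so far."""
    transitions = set()
    for trace in traces:
        state = ()
        for activity in trace:
            next_state = state + (activity,)
            transitions.add((state, activity, next_state))
            state = next_state
    return transitions


def to_graph(transitions):
    """Step 2: keep only the structure, as a parent -> children adjacency map."""
    graph = {}
    for source, _label, target in transitions:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())
    return graph


def is_acyclic(graph):
    """Kahn's algorithm: the graph is a DAG iff every node can be removed."""
    indegree = {node: 0 for node in graph}
    for children in graph.values():
        for child in children:
            indegree[child] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in graph[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return removed == len(graph)
```

With a coarser abstraction, such as the last activity only, loops in the process would produce cycles, so the acyclicity check is worth keeping whichever abstraction is chosen.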
Definition 6.
1. For each event e with identical partial traces σ, data attributes and their values ${M_{e}}$ are collected from the event log;
2. For each event ${e_{\mathit{previous}}}$, data attributes and their values ${M_{\mathit{previous}}}$ are collected from the event log;
3. The CPT ${\theta _{e}}$ is constructed, where each row represents a unique set of attribute-value pairs $(n,v)\in {M_{e}}\cup {M_{\mathit{previous}}}$ and has a probability of $\omega =\frac{\mathit{count}\_ \mathit{of}\_ \mathit{times}\_ \mathit{seen}}{\mathit{total}\_ \mathit{count}\_ \mathit{of}\_ \mathit{event}\_ \mathit{occurrences}}$.
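The frequency estimate ω in step 3 amounts to counting how often each distinct attribute combination is seen for an event and dividing by the number of occurrences of that event. A minimal sketch, assuming each occurrence has already been reduced to a dictionary of attribute names and values (the function name and the sample observations are hypothetical):

```python
from collections import Counter


def build_cpt(observations):
    """Map each distinct attribute combination to its relative frequency."""
    rows = Counter(frozenset(obs.items()) for obs in observations)
    total = sum(rows.values())
    return {row: count / total for row, count in rows.items()}


# Hypothetical occurrences of one event, merged with the previous event's attributes.
END_OBSERVATIONS = [
    {"resource": "B", "status": "Closed"},
    {"resource": "B", "status": "Closed"},
    {"resource": "B", "status": "Reject"},
    {"resource": "B", "status": "Closed"},
]

END_CPT = build_cpt(END_OBSERVATIONS)
```

Here the combination `resource=B, status=Closed` is seen 3 times out of 4, so its row gets ω = 0.75, and the remaining row gets 0.25.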
3.5 Business Process Inference
Definition 7.
Definition 8.
4 Evaluation of the Approach
4.1 Experiment Definition
Table 2
Log | Traces | Unique events | Total events | Attributes |
SL | 3512 | 9 | 20339 | 2–6 |
DL | 13087 | 36 | 262200 | 3–4 |
ML | 1156 | 289 | 59083 | 12 |
1. The event log is split into two subsets, 80% and 20%;
2. The 80% subset is used for discovering the belief network;
3. The remaining 20% subset is used for experimental testing;
4. The average probability and standard deviation for the experiment are calculated;
5. The experiment is repeated 4 more times with different subsets of the event log, giving k-fold results with $k=5$.
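The procedure above can be sketched as a plain 5-fold loop. The `score` callback is a placeholder for "discover the network from the training traces and return the probability it assigns to one test trace"; it, the function names, the fixed shuffle seed and the use of the population standard deviation are all assumptions, since those details are not stated here.

```python
import random
from statistics import mean, pstdev


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) pairs; each fold serves once as the test subset."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


def evaluate(items, score, k=5):
    """Return one (mean, standard deviation) pair of probabilities per fold."""
    results = []
    for train, test in k_fold_splits(items, k):
        probabilities = [score(train, trace) for trace in test]
        results.append((mean(probabilities), pstdev(probabilities)))
    return results
```

With k = 5 every test fold holds 20% of the traces and the matching training set the other 80%, as in the steps above.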
4.2 Experimental Environment
4.3 Experimental Results
Table 3
Log | Inferences taken into account | Total inferences/total events in the log | Events observed/events in the log | Precision |
SL | 13262 | 16811/20339 | 8/9 | $0.77\pm 0.26$ |
DL | 95097 | 147070/262200 | 33/36 | $0.52\pm 0.35$ |
ML | 11787 | 43435/59083 | 42/289 | $0.95\pm 0.16$ |
5 Conclusions
• The presented approach allows the construction of a Bayesian belief network from an event log;
• When used for decision support, the approach assigns, on average, probabilities of 52% to 95% to the actual data, which indicates that it can be used for inferences;
• The approach depends on the quantity of data in the event log and on its expressiveness.