Probabilistic Trace Alignment

ICPM 2021, Eindhoven
Giacomo Bergami, Free University of Bozen-Bolzano

Fabrizio M. Maggi, Free University of Bozen-Bolzano

Marco Montali, Free University of Bozen-Bolzano
Rafael Peñaloza, University of Milano Bicocca

Conformance checking via alignments
Match log trace with the “closest” model trace(s) in terms of “moves with disagreement”
[Figure: “Order” process model with activities close, accept, refuse, pay, archive]
Model traces: close, accept, pay, archive; close, refuse, archive
Log trace: close, archive
Alignment with close, accept, pay, archive (d=2):
close accept pay archive
close >> >> archive
Alignment with close, refuse, archive (d=1):
close refuse archive
close >> archive
Log trace is non-conforming. Closest trace: close, refuse, archive
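The alignment distances on this slide can be reproduced with a small dynamic program. The sketch below assumes unit cost for every move with disagreement (a model-only or log-only move) and zero cost for synchronous moves; the function name `align` and the cost choice are illustrative assumptions, not necessarily the aligner used by the authors.

```python
SKIP = ">>"  # placeholder shown in the row that makes no move

def align(log_trace, model_trace):
    """Return (distance, moves), where moves is a list of (log_step, model_step)."""
    n, m = len(log_trace), len(model_trace)
    # cost[i][j]: fewest disagreeing moves to align log_trace[i:] with model_trace[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                cost[i][j] = m - j          # only model moves remain
            elif j == m:
                cost[i][j] = n - i          # only log moves remain
            else:
                best = 1 + min(cost[i + 1][j], cost[i][j + 1])
                if log_trace[i] == model_trace[j]:
                    best = min(best, cost[i + 1][j + 1])  # synchronous move
                cost[i][j] = best
    # Walk forward through the table to recover one optimal alignment.
    moves, i, j = [], 0, 0
    while i < n or j < m:
        if (i < n and j < m and log_trace[i] == model_trace[j]
                and cost[i][j] == cost[i + 1][j + 1]):
            moves.append((log_trace[i], model_trace[j]))
            i, j = i + 1, j + 1
        elif j < m and cost[i][j] == 1 + cost[i][j + 1]:
            moves.append((SKIP, model_trace[j]))   # model moves alone
            j += 1
        else:
            moves.append((log_trace[i], SKIP))     # log moves alone
            i += 1
    return cost[0][0], moves

log = ("close", "archive")
for model in [("close", "accept", "pay", "archive"), ("close", "refuse", "archive")]:
    distance, moves = align(log, model)
    print(distance, " ".join(step for step, _ in moves))
```

For the log trace close, archive this gives distance 2 against close, accept, pay, archive and distance 1 against close, refuse, archive, matching the d values on the slide.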

What if we can “quantify” nondeterminism?
Process model describing stochastic behaviors: likely and unlikely to happen!
[Figure: “Order” process model with activities close, accept, refuse, pay, archive, annotated with branch probabilities 10% and 90%]

State of the art
[Figure: stochastic “Order” process model with branch probabilities 10% and 90%, compared against an event log]
Measure of “stochastic language distance”
(!) Infinite

Our approach: trace by trace…
[Figure: “Order” process model with branch probabilities 10% and 90%]
Log trace: close, archive
Alignment with close, accept, pay, archive: close >> >> archive (d=2, P=0.9)
Alignment with close, refuse, archive: close >> archive (d=1, P=0.1)
Conforming? Closest trace?

Probabilistic trace alignment
Log trace: close, archive
[Figure: “Order” process model with branch probabilities 10% and 90%]
[Figure: model traces placed along two axes. Alignment axis: from “Perfectly aligned” to “Terribly aligned”. Probability axis: from “The only possible” through “Likely” and “Unlikely” to “Impossible”. Plotted model traces: close, refuse, archive; close, accept, pay, archive]

Contributions
• Probabilistic trace alignment as a kNN problem: ranked list
of top “k” model traces combining alignment distance and
probability

• Class of “well-behaved” stochastic Petri nets

• Pipeline to algorithmically attack the problem

• Realization of the pipeline with two ranking strategies: exact
and approximate

• Implementation and experimental evaluation

The model: a special type of…
Bounded workflow stochastic Petri net with repeated labels and silent
transitions under “bounded silence”
Main tasks: (1) calculate the probability of a model trace
(2) retrieve all model traces exceeding a minimum probability
Main issue: a visible trace may have infinitely many
supporting runs (due to silent transitions)

Bounded workflow: well-established notion of what a trace/run is. Runs are
maximal. Adding elements to a run means “looping
more”, which decreases the probability.

Bounded silence: no loop goes only through silent steps;
at least one visible transition is needed to indicate
iteration [good for modeling!]

Bounded silence: between two consecutive visible steps,
there can only be boundedly many invisible ones.

Good properties of the model
1. Execution semantics captured by a finite-state (stochastic)
reachability graph

2. Each trace has finitely many supporting runs

3. Trace probability computable by enumerating those runs and
summing up their probabilities

4. Only finitely many traces exceed a strictly positive probability
threshold

5. For a log trace, there is a maximal length over model traces
after which alignment distance and probability both decrease
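Property 3 can be illustrated with a minimal sketch that sums the probabilities of all runs supporting a trace in a finite stochastic reachability graph. The graph encoding, the state names, and the function `trace_probability` are hypothetical; the 90% weight on the accept branch is read off the alignment slide (P=0.9 for the trace through accept and pay). The recursion terminates only under bounded silence, that is, when no loop consists solely of silent transitions.

```python
from typing import Dict, List, Optional, Tuple

# state -> list of (label, probability, next_state); label None marks a silent transition
Graph = Dict[str, List[Tuple[Optional[str], float, str]]]

def trace_probability(graph: Graph, state: str, final: str, trace: Tuple[str, ...]) -> float:
    """Sum the probabilities of all runs from `state` to `final` whose visible labels spell `trace`.

    Assumes `final` has no outgoing transitions (runs are maximal) and that the
    graph has no silent-only loops, otherwise the recursion would not terminate.
    """
    total = 0.0
    if state == final and not trace:
        total += 1.0  # the empty remainder of the trace is matched at the final state
    for label, prob, nxt in graph.get(state, []):
        if label is None:
            # silent step: consumes no symbol of the trace
            total += prob * trace_probability(graph, nxt, final, trace)
        elif trace and label == trace[0]:
            # visible step: must match the next symbol of the trace
            total += prob * trace_probability(graph, nxt, final, trace[1:])
    return total

# Hypothetical reachability graph for the "Order" example (state names invented here).
ORDER: Graph = {
    "start": [("close", 1.0, "closed")],
    "closed": [("accept", 0.9, "accepted"), ("refuse", 0.1, "decided")],
    "accepted": [("pay", 1.0, "decided")],
    "decided": [("archive", 1.0, "end")],
    "end": [],
}

p_accept = trace_probability(ORDER, "start", "end", ("close", "accept", "pay", "archive"))
p_refuse = trace_probability(ORDER, "start", "end", ("close", "refuse", "archive"))
```

A trace that the model cannot produce, such as close, archive, gets probability 0 because no run spells it.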

Algorithmic pipeline
EXACT ALIGNMENT
For each log trace…

1. Invoke an exact aligner on all extracted model traces

2. Rank according to the probability-alignment distance combination
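The exact pipeline can be sketched as follows for the running example. The scoring formula `prob / (1 + distance)` is only an assumed way of combining probability and alignment distance, chosen for illustration; the slides do not state the actual combination. The helper names are hypothetical as well.

```python
def indel_distance(a, b):
    """Alignment distance with unit cost per log-only or model-only move."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            best = 1 + min(prev[j], cur[j - 1])
            if x == y:
                best = min(best, prev[j - 1])  # synchronous move costs nothing
            cur.append(best)
        prev = cur
    return prev[-1]

def rank_model_traces(log_trace, model_traces, k):
    """Return the top-k (score, trace, distance, probability) tuples, best first.

    `model_traces` maps each extracted model trace to its probability.
    """
    scored = []
    for trace, prob in model_traces.items():
        distance = indel_distance(log_trace, trace)
        score = prob / (1 + distance)  # assumed combination, not taken from the slides
        scored.append((score, trace, distance, prob))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

MODEL_TRACES = {
    ("close", "accept", "pay", "archive"): 0.9,
    ("close", "refuse", "archive"): 0.1,
}
ranking = rank_model_traces(("close", "archive"), MODEL_TRACES, k=2)
```

Under this assumed score the likely but more distant trace close, accept, pay, archive ranks above the closer but unlikely close, refuse, archive, which is the kind of trade-off the ranking is meant to expose.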

Algorithmic pipeline
APPROXIMATE ALIGNMENT via embeddings
1. Pre-embed all extracted model traces

2. For each log trace…

A. Embed the log trace

B. Rank by embedding distance
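The approximate pipeline can be sketched with a deliberately simple embedding: each trace becomes a bag of bi-grams (pairs of consecutive activities), and candidates are ranked by Euclidean distance between those bags. This is a simplified stand-in for illustration, assuming plain bi-gram counts; the embedding actually used by the authors may differ, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt

def embed(trace):
    """Embed a trace as counts of its consecutive activity pairs (bi-grams)."""
    return Counter(zip(trace, trace[1:]))

def embedding_distance(u, v):
    """Euclidean distance between two sparse bi-gram count vectors."""
    keys = set(u) | set(v)
    return sqrt(sum((u[key] - v[key]) ** 2 for key in keys))

def rank_by_embedding(log_trace, embedded_models, k):
    """Return the k model traces whose embeddings are nearest to the log trace."""
    query = embed(log_trace)
    scored = [(embedding_distance(query, vec), trace) for trace, vec in embedded_models.items()]
    scored.sort(key=lambda item: item[0])
    return scored[:k]

# Step 1: pre-embed all extracted model traces once.
EMBEDDED = {
    trace: embed(trace)
    for trace in [("close", "accept", "pay", "archive"), ("close", "refuse", "archive")]
}
# Step 2: embed each log trace and rank by embedding distance.
nearest = rank_by_embedding(("close", "archive"), EMBEDDED, k=2)
```

Because model traces are embedded once up front, each log trace costs one embedding plus one distance computation per candidate, instead of one exact alignment per candidate.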

Experiments (sepsis data set)
Efficiency and suitability of approximation
Results:

• Many model traces → exact approach struggles

• Long model traces → approximation decreases in quality

• Current approximation based on bi-grams (embedding in pairs)

• N-grams improve approximation on long traces, but require more
preprocessing time for building the embedding space

Conclusions
• Stochastic conformance checking at the single trace level

• Suitable underlying formal model

• Algorithmic pipeline: approximated alignments based on well-
established techniques from databases and information retrieval

What’s next?
• Further explore the formal guarantees of the model

• Explore other kinds of models (cf. declarative)

• Further explore the approximation spectrum
