Artificial Intelligence: Utility Theory Chapter 16, AIMA
Utility theory
Chapter 16, AIMA
The utility function U(S)
• An agent’s preferences between different
states S in the world are captured by the
utility function U(S).
A ≻ B A is preferred to B
A ∼ B The agent is indifferent between A and B
A ≿ B The agent prefers A to B, or is indifferent between them
L = [p1, C1; p2, C2; …; pn, Cn] A lottery L with outcomes C1, …, Cn and probabilities p1, …, pn
The six axioms of utility theory
Orderability (A ≻ B) ∨ (B ≻ A) ∨ (A ∼ B) You must make a decision
Transitivity (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)
Continuity (A ≻ B ≻ C) ⇒ ∃p [p,A; 1-p,C] ∼ B
Substitutability (A ∼ B) ⇒ [p,A; 1-p,C] ∼ [p,B; 1-p,C]
Monotonicity (A ≻ B) ⇒ (p ≥ q ⇔ [p,A; 1-p,B] ≿ [q,A; 1-q,B])
Decomposability [p,A; 1-p,[q,B; 1-q,C]] ∼ [p,A; (1-p)q,B; (1-p)(1-q),C]
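The decomposability axiom says a nested lottery can be reduced to a flat one by multiplying probabilities through. A minimal sketch of that reduction follows; the list-of-pairs representation and the name `flatten` are assumptions made for illustration, not part of the slides.

```python
from fractions import Fraction


def flatten(lottery):
    """Reduce a possibly nested lottery to an equivalent flat lottery.

    A lottery is a list of (probability, outcome) pairs. An outcome that is
    itself a list is treated as a nested lottery (assumed representation).
    """
    flat = {}
    for p, outcome in lottery:
        if isinstance(outcome, list):
            # Multiply the outer probability through the inner lottery.
            for q, inner in flatten(outcome):
                flat[inner] = flat.get(inner, 0) + p * q
        else:
            flat[outcome] = flat.get(outcome, 0) + p
    return [(prob, outcome) for outcome, prob in flat.items()]


# [p,A; 1-p,[q,B; 1-q,C]] with p = 1/2 and q = 1/3
p, q = Fraction(1, 2), Fraction(1, 3)
nested = [(p, "A"), (1 - p, [(q, "B"), (1 - q, "C")])]
flat = flatten(nested)
```

Here `flat` holds A with probability p, B with (1-p)q and C with (1-p)(1-q), matching the right-hand side of the axiom. Exact fractions are used so the probabilities can be compared without rounding.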
The St. Petersburg "paradox"
What is the expected winning in this betting
game?
Toss    Winning
H       $2
TH      $4
TTH     $8
TTTH    $16
...     ...
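With P(k) = 2^{-k} and W(k) = 2^k, the expected winning after at most N tosses is
\[ \mathrm{Winning}_N = \sum_{k=1}^{N} P(k)\,W(k) = \sum_{k=1}^{N} \frac{2^k}{2^k} = \sum_{k=1}^{N} 1 = N \]
\[ \lim_{N \to \infty} \mathrm{Winning}_N = \infty \]
With a logarithmic utility of the winnings, U(W) = ln(W):
\[ \mathrm{Utility}_N = \sum_{k=1}^{N} P(k)\,\ln W(k) = \ln(2) \sum_{k=1}^{N} \frac{k}{2^k} = 2\ln(2)\left[1 - \frac{1}{2^N} - \frac{N}{2^{N+1}}\right] \]
\[ \lim_{N \to \infty} \mathrm{Utility}_N = 2\ln(2) \]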
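The two limits can be checked numerically. The sketch below assumes the payoff from the toss table (2^k dollars with probability 2^-k when the first head comes on toss k) and a natural-log utility of the winnings; the log utility is an assumption inferred from the limit 2 ln(2), and the function names are illustrative.

```python
import math


def expected_winning(n):
    """Expected winning when the game is cut off after n tosses."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, n + 1))


def expected_log_utility(n):
    """Expected utility after n tosses, assuming U(W) = ln(W)."""
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, n + 1))


def closed_form_log_utility(n):
    """Closed form of the same sum: 2 ln(2) [1 - 1/2^n - n/2^(n+1)]."""
    return 2 * math.log(2) * (1 - 1 / 2 ** n - n / 2 ** (n + 1))


for n in (1, 10, 100):
    print(n, expected_winning(n), expected_log_utility(n))
```

The expected winning grows linearly with the cutoff and so has no finite limit, while the expected log utility converges to 2 ln(2), roughly 1.386.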
[Figures: Mr Beard’s utility curve; general "human nature" utility curve, with risk-averse and risk-seeking regions marked]
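Lottery game 1
[Figure: decision tree for the first choice; recoverable branch: $0 with probability 0.01]
Lottery game 2
[Figure: decision tree for the second choice; recoverable labels: $5,000,000 with probability 0.1; D; $1,000,000 with probability 0.11; $0 with probability 0.89]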
Lottery preferences
• People should select A and D, or B and C.
Otherwise they are not being consistent...
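The consistency claim can be checked with expected utilities. The sketch below assumes the standard Allais lottery values, labelled so that they agree with the pairing stated above; the full lotteries are not recoverable from the slide text, so the probabilities, the payoffs for A, B and C, and the three utility functions are all assumptions for illustration.

```python
import math

# Assumed lotteries (standard Allais values), as (probability, payoff) pairs.
A = [(1.00, 1_000_000)]
B = [(0.10, 5_000_000), (0.89, 1_000_000), (0.01, 0)]
C = [(0.10, 5_000_000), (0.90, 0)]
D = [(0.11, 1_000_000), (0.89, 0)]


def expected_utility(lottery, u):
    """Sum of probability times utility over the lottery's outcomes."""
    return sum(p * u(x) for p, x in lottery)


def preference_gaps(u):
    """Return (EU(A) - EU(B), EU(D) - EU(C)) for utility function u."""
    gap_ab = expected_utility(A, u) - expected_utility(B, u)
    gap_dc = expected_utility(D, u) - expected_utility(C, u)
    return gap_ab, gap_dc


# Hypothetical utility functions, from risk neutral to strongly risk averse.
utilities = {
    "linear": lambda x: x,
    "square root": math.sqrt,
    "saturating": lambda x: 1 - math.exp(-x / 250_000),
}

for name, u in utilities.items():
    gap_ab, gap_dc = preference_gaps(u)
    print(name, "A and D" if gap_ab > 0 else "B and C")
```

Under these assumed lotteries both gaps reduce to 0.11 U($1M) - 0.10 U($5M) - 0.01 U($0), so any expected-utility maximizer who prefers A must also prefer D, and one who prefers B must also prefer C.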
V(X1, X2) = V1(X1) + V2(X2)
Where V(X) is a value function (expressing the [monetary] value)
Example: The party problem
We are about to give a wedding party. It will be held during summertime.
Should we be outdoors or indoors?
The party is such that we can’t change our minds on the day of the party (different locations for indoors and outdoors).
What is the rational decision?
[Decision tree: In, Rain → Relieved; In, ¬Rain → Regret; Out, Rain → Disaster!; Out, ¬Rain → Perfect!]
[Figure: decision tree annotated with utilities; the branch ¬Rain → Perfect! is labelled U = 2.00]
The rational decision changes from outdoors to indoors
when P(Rain) > 7/30.
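A threshold like this comes from equating the expected utilities of the two actions. The sketch below shows the calculation with hypothetical utilities for the four outcomes; the slide's own utility values are not in the text, so with these numbers the threshold comes out at 7/13 rather than 7/30.

```python
from fractions import Fraction

# Hypothetical utilities for (action, weather); not the slide's values.
U = {
    ("in", "rain"): Fraction(6, 10),   # Relieved
    ("in", "dry"): Fraction(3, 10),    # Regret
    ("out", "rain"): Fraction(0),      # Disaster!
    ("out", "dry"): Fraction(1),       # Perfect!
}


def expected_utility(action, p_rain):
    """Expected utility of an action given the probability of rain."""
    return p_rain * U[(action, "rain")] + (1 - p_rain) * U[(action, "dry")]


def best_action(p_rain):
    """Action with the highest expected utility (ties go to 'in')."""
    return max(("in", "out"), key=lambda a: expected_utility(a, p_rain))


def rain_threshold():
    """P(Rain) at which EU(in) equals EU(out)."""
    gain_dry = U[("out", "dry")] - U[("in", "dry")]      # outdoors wins when dry
    gain_rain = U[("in", "rain")] - U[("out", "rain")]   # indoors wins when raining
    return gain_dry / (gain_dry + gain_rain)


print(rain_threshold(), best_action(Fraction(1, 2)), best_action(Fraction(3, 5)))
```

Below the threshold the outdoor party has the higher expected utility; above it the indoor party does. Substituting the utilities from the original decision tree into `U` would give the slide's threshold.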
The utility function is represented by a diamond.
The value of information
• The value of a given piece of information
is the difference in expected utility value
between best actions before and after
information is obtained.
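As an illustration of this definition, the sketch below computes the value of learning the weather before choosing a party location. It assumes the information is perfect (it reveals whether it will rain) and uses hypothetical utilities for the four outcomes; neither assumption comes from the slides.

```python
from fractions import Fraction

ACTIONS = ("in", "out")

# Hypothetical utilities for (action, weather); not the slide's values.
U = {
    ("in", "rain"): Fraction(6, 10),
    ("in", "dry"): Fraction(3, 10),
    ("out", "rain"): Fraction(0),
    ("out", "dry"): Fraction(1),
}


def eu_without_information(p_rain):
    """Best expected utility when the action is chosen before the weather is known."""
    return max(
        p_rain * U[(a, "rain")] + (1 - p_rain) * U[(a, "dry")] for a in ACTIONS
    )


def eu_with_perfect_information(p_rain):
    """Expected utility when the best action is chosen after learning the weather."""
    best_if_rain = max(U[(a, "rain")] for a in ACTIONS)
    best_if_dry = max(U[(a, "dry")] for a in ACTIONS)
    return p_rain * best_if_rain + (1 - p_rain) * best_if_dry


def value_of_information(p_rain):
    """Difference in expected utility of the best actions, after minus before."""
    return eu_with_perfect_information(p_rain) - eu_without_information(p_rain)


print(value_of_information(Fraction(1, 2)))
```

With P(Rain) = 1/2 and these utilities the value is 3/10. It falls to zero when P(Rain) is 0 or 1, since the information can no longer change the decision.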