Reinforcement Learning (2)
Recall that we can solve the Bellman equations for the utilities of states…
U(s) = R(s) + \gamma \max_{a \in A(s)} \sum_{s'} P(s' \mid s, a) U(s')
…to find the optimal policy 𝜋 ∗ from the utilities of states 𝑈 𝑠 .
\pi^*(s) = \operatorname{argmax}_{a \in A(s)} \sum_{s'} P(s' \mid s, a) U(s')
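As a minimal sketch of these two steps (assuming a transition model stored as P[s][a] = list of (next_state, probability) pairs and a reward table R[s], which are representation choices not given in the slides):

# Sketch: one round of Bellman backups, then greedy policy extraction.
# Assumes every state has at least one available action.
def bellman_backup(states, actions, P, R, U, gamma=1.0):
    # U(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) U(s')
    return {s: R[s] + gamma * max(sum(p * U[s2] for s2, p in P[s][a])
                                  for a in actions(s))
            for s in states}

def greedy_policy(states, actions, P, U):
    # pi*(s) = argmax_a sum_s' P(s'|s,a) U(s')
    return {s: max(actions(s), key=lambda a: sum(p * U[s2] for s2, p in P[s][a]))
            for s in states}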
Recap: Grid World: Utilities of States
(Figure: utilities of the grid-world states, computed for γ = 1.)
Recap: Policies and Utilities of States
Recall that we used the Bellman equation…
U(s) = R(s) + \gamma \max_{a \in A(s)} \sum_{s'} P(s' \mid s, a) U(s')
…to find the optimal policy 𝜋 ∗ from the utilities of states 𝑈 𝑠 .
…and that, even when the action from the optimal policy is
taken, it may not be optimal in the context of the rest of the
agent’s policy.
Recall that, in the context of the multi-armed bandit problem, we said that:
As we get information, we don’t want to keep taking seemingly bad
options frequently…
…but a finite number of trials is never enough to be certain about the
result of a stochastic process – we are never done with exploration.
Our two previous algorithms explore, but don’t prioritise their exploration.
We can encourage exploration of unvisited states by optimistically
estimating their utility, for example with an exploration function
f(u, n) = R⁺ if n < 𝑁𝑒 , and u otherwise,
applied with n = 𝑁(𝑠, 𝑎), where R⁺ is an optimistic estimate of the best possible
reward, 𝑁(𝑠, 𝑎) is the count of times the agent has taken action 𝑎 in state 𝑠,
and 𝑁𝑒 is a parameter of the algorithm.
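A minimal sketch of such an exploration function; the optimistic value R_PLUS and threshold N_E below are assumed illustrative values, not taken from the slides:

# Sketch: optimistic exploration function f(u, n).
R_PLUS = 2.0   # assumed optimistic estimate of the best possible reward
N_E = 5        # assumed exploration threshold N_e

def exploration_f(u, n, r_plus=R_PLUS, n_e=N_E):
    # Pretend a (state, action) pair is worth r_plus until it has been
    # tried at least n_e times; afterwards use the learned estimate u.
    return r_plus if n < n_e else u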
Grid World: Exploration Function
SARSA and Q-learning
TDL For Active Reinforcement Learning
Recall the TD update for the utility of a state:
U(s) = U(s) + \alpha(N_s(s)) [ R(s) + \gamma U(s') - U(s) ]
…and the greedy policy derived from those utilities:
\pi^*(s) = \operatorname{argmax}_{a \in A(s)} \sum_{s'} P(s' \mid s, a) U(s')
Note that choosing actions this way requires a model of the transition probabilities P(s′ | s, a).
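A minimal sketch of this TD update, assuming utilities and visit counts kept in dictionaries and a decaying learning rate α(n) = 60/(59 + n) (an assumed schedule, not fixed by the slides):

# Sketch: one TD update of U(s) after observing reward R(s) and successor s'.
def alpha(n):
    return 60.0 / (59.0 + n)   # assumed decaying learning-rate schedule

def td_update(U, Ns, s, s_next, reward, gamma=1.0):
    Ns[s] = Ns.get(s, 0) + 1
    u_s, u_next = U.get(s, 0.0), U.get(s_next, 0.0)
    # U(s) = U(s) + alpha(N_s(s)) * (R(s) + gamma * U(s') - U(s))
    U[s] = u_s + alpha(Ns[s]) * (reward + gamma * u_next - u_s)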
One approach is not to learn the utilities of states 𝑈(𝑠), but instead to
learn Q-values 𝑄(𝑠, 𝑎) – the expected utility of taking action 𝑎 in state 𝑠.
U(s) = \max_{a} Q(s, a)
and to greedily choose actions:
\pi^*(s) = \operatorname{argmax}_{a} Q(s, a)
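As a sketch, with the Q-values kept in a dictionary keyed by (state, action) pairs (an assumed representation):

# Sketch: utility of a state and greedy action choice from a Q-table.
def utility(Q, s, actions):
    return max(Q.get((s, a), 0.0) for a in actions)        # U(s) = max_a Q(s, a)

def greedy_action(Q, s, actions):
    return max(actions, key=lambda a: Q.get((s, a), 0.0))  # argmax_a Q(s, a)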
SARSA
In state 𝑠 it takes action 𝑎, resulting in reward 𝑅(𝑠) and new state 𝑠′, in which it then chooses its next action 𝑎′:
Q(s, a) = Q(s, a) + \alpha [ R(s) + \gamma Q(s', a') - Q(s, a) ]
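A minimal sketch of one SARSA update under the same assumed Q-table representation; α and γ are assumed parameters:

# Sketch: one SARSA update, using the action a_next actually chosen in s_next.
def sarsa_update(Q, s, a, reward, s_next, a_next, alpha=0.1, gamma=1.0):
    q_sa = Q.get((s, a), 0.0)
    target = reward + gamma * Q.get((s_next, a_next), 0.0)
    Q[(s, a)] = q_sa + alpha * (target - q_sa)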
Q-learning
In state 𝑠 it takes action 𝑎, resulting in reward 𝑅(𝑠) and new state 𝑠′:
Q(s, a) = Q(s, a) + \alpha [ R(s) + \gamma \max_{a'} Q(s', a') - Q(s, a) ]
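And a corresponding sketch of one Q-learning update, which maximises over the actions available in the next state rather than using the action actually taken:

# Sketch: one Q-learning update; the target maximises over next actions.
def q_learning_update(Q, s, a, reward, s_next, actions, alpha=0.1, gamma=1.0):
    q_sa = Q.get((s, a), 0.0)
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = q_sa + alpha * (reward + gamma * best_next - q_sa)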
SARSA and Q-Learning Compared
(Figure: a three-state MDP with states 𝑠0, 𝑠1, 𝑠2 and actions 𝑎0, 𝑎1; arcs are labelled with transition probabilities 0.1, 0.5, 0.9 and 1.0; rewards are 𝑅(𝑠0) = −0.50, 𝑅(𝑠1) = −0.75, 𝑅(𝑠2) = −0.10.)
SARSA and Q-learning Illustrated
Imagine that we have run one of SARSA or Q-learning with an 𝜖-greedy policy
and obtained the following estimates of Q-values:
𝑄(𝑠, 𝑎)     𝑠0       𝑠1
𝑎0         −0.8     −1.35
𝑎1         −0.7     −0.85
Let us assume that the agent starts in state 𝑠0 and chooses to take action 𝑎1 ,
receiving reward -0.5 and transitioning to state 𝑠1 .
For the purposes of the SARSA update, we will assume that, in this state, it
chooses action 𝑎0 (exploration).
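As a small numeric sketch of the update this sets up (α = 0.1 and γ = 1.0 are assumed values, not given above), SARSA and Q-learning would move Q(𝑠0, 𝑎1) to different places:

# Worked example with the Q-values from the table above; alpha and gamma assumed.
alpha, gamma = 0.1, 1.0
Q = {("s0", "a0"): -0.80, ("s0", "a1"): -0.70,
     ("s1", "a0"): -1.35, ("s1", "a1"): -0.85}
reward = -0.5                                        # R(s0)

# SARSA: target uses the action actually chosen in s1, here a0 (exploration).
sarsa_target = reward + gamma * Q[("s1", "a0")]      # -0.5 + -1.35 = -1.85
print(Q[("s0", "a1")] + alpha * (sarsa_target - Q[("s0", "a1")]))   # ≈ -0.815

# Q-learning: target uses the best action available in s1, here a1.
q_target = reward + gamma * max(Q[("s1", "a0")], Q[("s1", "a1")])   # -0.5 + -0.85 = -1.35
print(Q[("s0", "a1")] + alpha * (q_target - Q[("s0", "a1")]))       # ≈ -0.765

Because SARSA backs up the exploratory action 𝑎0 while Q-learning backs up the best available action 𝑎1, SARSA’s estimate of Q(𝑠0, 𝑎1) drops further here.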
SARSA and Q-learning Illustrated