Change Point Detection
IES 601 SEMINAR
SALBANI CHAKRABORTTY (18I190011)
UNDER THE GUIDANCE OF PROF. N HEMACHANDRA
Contents
Background
Sample time series and change points
Classifications of change point detection algorithms and methods of detecting change points
Maximum Likelihood Estimation Of Change Point Location
Hypothesis test for the Change Point Problem – Single and Multiple Change point detection
Binary Segmentation Method in Multiple change point detection
CUSUM Procedure
Change Finder Method
Performance Measures of CPD Algorithms
References
Background
Real-world data often do not retain the same statistical properties over time.
Detecting when these properties change is the change point problem.
Detection of change points is useful in modelling and prediction of time series, medical
condition monitoring, climate change detection, speech and image analysis, and human
activity analysis.
The most commonly investigated changes in behaviour are:
Change in mean
Change in Variance
Change in Regression Model/parameter
Sample time series and change points
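A series of this kind can be simulated. The following is a minimal sketch, assuming Gaussian segments; the helper name `make_series` and the segment parameters are illustrative choices, not taken from the slides. It produces one change in mean followed by one change in variance.

```python
import random


def make_series(segments, seed=0):
    """Concatenate Gaussian segments given as (length, mean, std) tuples.

    Returns the series and the change point locations, where a change
    point is the number of observations that precede the change.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    series, boundaries, t = [], [], 0
    for length, mean, std in segments:
        series.extend(rng.gauss(mean, std) for _ in range(length))
        t += length
        boundaries.append(t)
    return series, boundaries[:-1]  # the last boundary is just the series end


# Hypothetical example: mean changes after t = 100, variance after t = 200.
series, true_cps = make_series([(100, 0.0, 1.0), (100, 3.0, 1.0), (100, 3.0, 3.0)])
```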
Classifications of Change Point Detection
Change point detection algorithms are classified as “online” or “offline”.
The goal of off-line detection is generally to identify all of a sequence’s change points in
batch mode.
The goal of on-line detection is to detect change as soon as possible after it occurs, ideally
before the next data point arrives.
Both Parametric and Non-parametric methods are used in change point detection.
Basic Algorithms Used In CPD Problems
The techniques used in CPD include both supervised and unsupervised methods.
Supervised learning algorithms learn a mapping from input data to a target attribute of
the data, usually a class label.
This presentation focuses mostly on unsupervised methods of CPD.
Unsupervised learning algorithms are typically used to discover patterns in unlabeled
data.
Unsupervised methods include:
Likelihood Ratio Method
Change Finder Method
CUSUM Method
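Maximum Likelihood Estimation Of The Change Point Location
Given a dataset $x_1, x_2, \ldots, x_n$ (normally distributed), the estimated change point location $\hat{k}$ is
$$\hat{k} = \arg\max_{2 \le k \le n-1} V_k$$
$$V_k = \sum_{t=1}^{n} (x_t - \hat{\mu})^2 - \left[ \sum_{t=1}^{k} (x_t - \hat{\mu}_1)^2 + \sum_{t=k+1}^{n} (x_t - \hat{\mu}_n)^2 \right]$$
$$\hat{\mu} = \frac{1}{n} \sum_{t=1}^{n} x_t, \qquad \hat{\mu}_1 = \frac{1}{k} \sum_{t=1}^{k} x_t, \qquad \hat{\mu}_n = \frac{1}{n-k} \sum_{t=k+1}^{n} x_t$$
From this $\hat{k}$ value we get the statistic $U_{\hat{k}} = V_{\hat{k}}$, equivalent to the likelihood ratio test statistic.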
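The estimator above can be sketched directly in Python. This is a minimal illustration, assuming a plain list of floats as input; the function name `estimate_change_point` is hypothetical, and the quadratic-time loop is chosen for readability rather than speed.

```python
def estimate_change_point(x):
    """Return (k_hat, V_k_hat), maximising V_k over 2 <= k <= n-1.

    k is 1-based: the first segment is x_1..x_k, the second x_{k+1}..x_n.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mu = sum(x) / n
    total_ss = sum((v - mu) ** 2 for v in x)  # sum of squares with no change
    best_k, best_v = None, float("-inf")
    for k in range(2, n):  # k = 2, ..., n-1
        left, right = x[:k], x[k:]
        mu_1 = sum(left) / k
        mu_n = sum(right) / (n - k)
        within_ss = (sum((v - mu_1) ** 2 for v in left)
                     + sum((v - mu_n) ** 2 for v in right))
        v_k = total_ss - within_ss  # reduction in sum of squares from splitting at k
        if v_k > best_v:
            best_k, best_v = k, v_k
    return best_k, best_v
```

For a series of five zeros followed by five fives, this returns $\hat{k} = 5$, since splitting there removes all within-segment variation.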
Hypothesis test for the Change Point Problem
A natural approach to detecting a single changepoint is to perform a hypothesis test.
The hypotheses for a change in the mean of the distribution at point $k$ are defined as
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_n; \qquad H_1: \mu_1 = \cdots = \mu_k \ne \mu_{k+1} = \cdots = \mu_n$$
Here the data are assumed to be normally distributed.
$H_0$ is rejected when $U_k > c_\alpha$,
where $U_k$ is the test statistic at $k$, $c_\alpha$ is the critical value and $\alpha$ is the chosen significance level.
Likelihood Ratio Test For Single Change
We can view detecting a single changepoint as a hypothesis test
𝑯𝟎 : No changepoint, m = 0
𝑯𝟏 : A single changepoint, m = 1
One approach is to find τ, the position of change which maximises the log likelihood
$$L(\tau) = \log p(y_{1:\tau} \mid \hat{\theta}_1) + \log p(y_{\tau+1:n} \mid \hat{\theta}_2)$$
Then, calculate the test statistic
$$\lambda = 2\left[ \max_{\tau} L(\tau) - \log p(y_{1:n} \mid \hat{\theta}) \right]$$
We then choose a threshold, c, such that we reject the null hypothesis if λ > c.
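As an illustration of this test, here is a minimal sketch that assumes Gaussian segments whose mean and variance are both replaced by their maximum likelihood estimates. The function names, the `min_len` guard and the variance floor are assumptions made for the example; the threshold `c` is left to the user, as on the slide.

```python
import math


def gaussian_loglik(seg):
    """Maximised Gaussian log-likelihood of a segment (MLE mean and variance)."""
    m = len(seg)
    mean = sum(seg) / m
    var = max(sum((v - mean) ** 2 for v in seg) / m, 1e-12)  # floor avoids log(0)
    return -0.5 * m * (math.log(2 * math.pi * var) + 1.0)


def likelihood_ratio_test(y, c, min_len=2):
    """Return (tau_hat, lambda, reject) for a single-change likelihood ratio test.

    tau is the number of observations in the first segment.
    """
    n = len(y)
    if n < 2 * min_len:
        raise ValueError("series too short for the chosen min_len")
    null_ll = gaussian_loglik(y)  # log p(y_{1:n} | theta_hat), no change
    best_tau, best_ll = None, float("-inf")
    for tau in range(min_len, n - min_len + 1):
        ll = gaussian_loglik(y[:tau]) + gaussian_loglik(y[tau:])  # L(tau)
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    lam = 2.0 * (best_ll - null_ll)
    return best_tau, lam, lam > c
```

The `min_len` guard keeps every candidate segment long enough for a variance estimate; very short segments would otherwise produce artificially large likelihoods.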
Multiple Change Point Detection
In practice the assumption of only one change may be unrealistic.
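Search methods for multiple change points aim to minimise
$$\sum_{i=1}^{m+1} \left[ C\left( y_{(\tau_{i-1}+1):\tau_i} \right) \right] + \beta f(m)$$
where $m$ is the number of change points, located at $\tau_1 < \cdots < \tau_m$, with $\tau_0 = 0$ and $\tau_{m+1} = n$.
$C$ is a cost function for a segment and $\beta f(m)$ is a penalty to guard against overfitting.
Here $C$ is the negative log-likelihood and $\beta f(m)$ may be $cm$.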
An approximate method for minimizing the above is Binary Segmentation.
Binary Segmentation
Input: A time series of the form $\{y_1, y_2, \ldots, y_n\}$
A test statistic $\Lambda(\cdot)$ dependent on the time series
An estimator of change point position $\hat{\tau}(\cdot)$
A rejection threshold $C$
Initialise: Let $L = \emptyset$ and $S = \{[1, n]\}$
Iterate: While $S \ne \emptyset$
1. Choose an element of $S$; denote this element as $[s, t]$.
2. If $\Lambda(y_{s:t}) < C$, remove $[s, t]$ from $S$.
3. If $\Lambda(y_{s:t}) \ge C$ then:
a. remove $[s, t]$ from $S$;
b. calculate $r = \hat{\tau}(y_{s:t}) + s - 1$, and add $r$ to $L$;
c. if $r \ne s$ add $[s, r]$ to $S$;
d. if $r \ne t - 1$ add $[r + 1, t]$ to $S$;
Output: The set of change points recorded, $L$.
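The steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the test statistic Λ is taken to be the maximal reduction in the sum of squares from a single split (a change-in-mean statistic), the estimator is the split that attains it, and the threshold value is a user choice. The names `best_split` and `binary_segmentation` are hypothetical.

```python
def best_split(seg):
    """Return (k, stat): split after the k-th point (1-based) that maximises
    the reduction in the sum of squares, and that reduction."""
    m = len(seg)
    if m < 2:
        return None, 0.0
    total = sum(seg)
    mean = total / m
    best_k, best_v, left_sum = None, -1.0, 0.0
    for k in range(1, m):
        left_sum += seg[k - 1]
        mu_left = left_sum / k
        mu_right = (total - left_sum) / (m - k)
        # between-segment sum of squares equals total SS minus within SS
        v = k * (mu_left - mean) ** 2 + (m - k) * (mu_right - mean) ** 2
        if v > best_v:
            best_k, best_v = k, v
    return best_k, best_v


def binary_segmentation(y, threshold):
    """Return the sorted change points L; intervals [s, t] are 1-based, inclusive."""
    found = []               # the set L
    todo = [(1, len(y))]     # the set S
    while todo:
        s, t = todo.pop()                    # steps 1, 2 and 3a: take [s, t] out of S
        k, stat = best_split(y[s - 1:t])
        if k is None or stat < threshold:
            continue                         # no significant change in [s, t]
        r = k + s - 1                        # step 3b: map back to global index
        found.append(r)
        if r != s:
            todo.append((s, r))              # step 3c
        if r != t - 1:
            todo.append((r + 1, t))          # step 3d
    return sorted(found)
```

On a series of twenty zeros, twenty fives and twenty zeros with a threshold of 10, the sketch recovers change points at 20 and 40.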
Binary Segmentation: An Approximate Method
The Binary Segmentation is computationally efficient.
It results in an 𝚶(𝐧 𝐥𝐨𝐠 𝐧) calculation.
However computational efficiency comes at the cost of exactness.
The location of a changepoint is conditional on the locations of previous changepoints.
The method does not search the entire solution space and is an approximation.
Exact Methods Of Detecting Change Points
Exact methods detect all changepoints simultaneously using a goodness-of-fit measure, for example
by minimising
$$-\sum_{i=0}^{m} \log p\left( y_{\tau_i:\tau_{i+1}} \mid \theta \right)$$
They can be inefficient, $O(Qn^2)$, where $Q$ is the maximum number of change points considered, but pruning can reduce the cost to $O(n)$ under suitable conditions.
Exact methods include
Segment Neighbourhood Search
Pruned Exact Linear Time Method
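To make the pruning idea concrete, here is a minimal sketch of the pruned dynamic program behind the Pruned Exact Linear Time method. It assumes a Gaussian change-in-mean cost (the sum of squared deviations from the segment mean) and a linear penalty of `beta` per change point; both choices, and the function name `pelt`, are assumptions made for the example.

```python
def pelt(y, beta):
    """Return change points minimising sum of segment costs + beta * (number of changes).

    A change point tau means the segments are y_1..y_tau and y_{tau+1}.. (1-based).
    """
    n = len(y)
    # Prefix sums give each segment cost in O(1).
    s1, s2 = [0.0], [0.0]
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(a, b):
        """Sum of squared deviations from the mean for y[a:b] (0-based, b > a)."""
        seg_sum = s1[b] - s1[a]
        return (s2[b] - s2[a]) - seg_sum * seg_sum / (b - a)

    best = [-beta] + [0.0] * n   # best[t]: optimal penalised cost of y[:t]
    last = [0] * (n + 1)         # last[t]: start of the final segment in that optimum
    candidates = [0]
    for t in range(1, n + 1):
        vals = [(best[s] + cost(s, t) + beta, s) for s in candidates]
        best[t], last[t] = min(vals)
        # Pruning: s stays a candidate only if best[s] + cost(s, t) <= best[t].
        candidates = [s for v, s in vals if v - beta <= best[t]] + [t]

    cps, t = [], n
    while last[t] > 0:           # backtrack through the optimal segmentation
        cps.append(last[t])
        t = last[t]
    return sorted(cps)
```

Without the pruning line this is the unpruned exact search over all last-change positions; pruning discards positions that can never again be optimal.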
CUSUM Procedure
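It is a method for detecting a change in the distribution of sequentially observed data.
At the $k$th stage, the likelihood ratio test statistic is
$$T_k = \max\left( \max_{1 \le j \le k} \sum_{i=j}^{k} Z_i,\ 0 \right), \qquad \text{where } Z_i = \log \frac{f_1(X_i)}{f_0(X_i)}$$
with $f_0$ and $f_1$ the pre-change and post-change densities. It is calculated by the recursive formula
$$T_k = (T_{k-1} + Z_k)^+, \qquad T_0 = 0$$
where $(a)^+ = \max(a, 0)$.
The Page-CUSUM procedure stops at $N_p = \min\{k : T_k \ge \gamma\}$, where $\gamma$ is a prescribed boundary.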
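The recursion is short enough to write out. The sketch below assumes, for illustration, that the pre-change and post-change densities are Gaussian with a common known standard deviation and different means, which gives a closed form for the log-likelihood ratio; the function names and parameter values are hypothetical.

```python
def gaussian_log_lr(mu0, mu1, sigma):
    """Return Z(x) = log f1(x) / f0(x) for N(mu1, sigma^2) against N(mu0, sigma^2)."""
    def log_lr(x):
        return (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2.0)
    return log_lr


def page_cusum(xs, log_lr, gamma):
    """Run T_k = max(T_{k-1} + Z_k, 0) with T_0 = 0.

    Returns (N_p, path): the first 1-based index k with T_k >= gamma
    (None if the boundary is never reached) and the T values computed so far.
    """
    t, path = 0.0, []
    for k, x in enumerate(xs, start=1):
        t = max(t + log_lr(x), 0.0)
        path.append(t)
        if t >= gamma:
            return k, path
    return None, path
```

With `mu0 = 0`, `mu1 = 1`, `sigma = 1` and `gamma = 2`, ten observations at 0 keep the statistic at zero, and a subsequent run of observations at 1 raises it by 0.5 per step, so the procedure stops at the fourteenth observation.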
CUSUM Method In Statistical Quality Control
The earliest techniques in this area are the Shewhart control charts.
These charts use only the current sample.
Hence, they fail to use accumulated evidence of change.
To remedy this, a CUSUM chart can be used.
The CUSUM-chart typically signals an out-of-control process by an upward or downward drift
of the cumulative sum until it crosses the boundary.
Change Finder Method
Change Finder is mainly used in time series analysis.
It reduces the problem of change point detection into time series-based outlier detection.
It fits an autoregressive (AR) model to the data and updates its parameter estimates
incrementally.
We can model the time series $\{x_t\}$ using an AR model of order $k$ by
$$x_t = \omega\, x_{t-k}^{t-1} + \epsilon$$
where $x_{t-k}^{t-1} = (x_{t-1}, x_{t-2}, \ldots, x_{t-k})$ are the previous observations, $\omega = (\omega_1, \ldots, \omega_k)$ are
constants and $\epsilon$ is a normal random variable.
Change Finder Method
After updating the model parameters, the probability density function $p_t$ at time $t$ is calculated.
An auxiliary time series $\{y_t\}$ is generated by giving a score to each data point:
Score($y_t$) = $d(p_{t-1}, p_t)$
where $d$ is any distance function.
To detect change points, we need to identify abrupt changes in this difference.
The change-point score is defined using the score function.
A higher score indicates a higher possibility of a change point.
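The following is a simplified sketch of this two-stage idea, not a faithful implementation of the published method. It makes several assumptions that are not on the slides: the AR model is of order 1, parameters are updated with exponential discounting, each point is scored by its logarithmic loss under the previous model (used here in place of a general distance $d$), and scores are smoothed with a moving average before a second model scores them again. All names and default values are illustrative.

```python
import math
import random
from collections import deque


class DiscountedAR1:
    """Online AR(1) model whose estimates are updated with exponential discounting."""

    def __init__(self, discount=0.05):
        self.r = discount
        self.mu, self.c0, self.c1, self.sigma2 = 0.0, 1.0, 0.0, 1.0
        self.prev = None

    def score_and_update(self, x):
        """Return the log loss of x under the current model, then update the model."""
        pred = self.mu
        if self.prev is not None:
            omega = self.c1 / max(self.c0, 1e-12)
            pred += omega * (self.prev - self.mu)
        resid = x - pred
        var = max(self.sigma2, 1e-12)  # floor avoids log(0)
        loss = 0.5 * math.log(2 * math.pi * var) + resid ** 2 / (2 * var)
        r = self.r
        self.mu = (1 - r) * self.mu + r * x
        if self.prev is not None:
            self.c1 = (1 - r) * self.c1 + r * (x - self.mu) * (self.prev - self.mu)
        self.c0 = (1 - r) * self.c0 + r * (x - self.mu) ** 2
        self.sigma2 = (1 - r) * self.sigma2 + r * resid ** 2
        self.prev = x
        return loss


def moving_average(values, window):
    """Trailing moving average; early entries average the points available so far."""
    out, buf, total = [], deque(), 0.0
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()
        out.append(total / len(buf))
    return out


def change_scores(xs, discount=0.05, window=5):
    """Two-stage scoring: score points, smooth, score the smoothed series, smooth."""
    stage1 = DiscountedAR1(discount)
    y = moving_average([stage1.score_and_update(x) for x in xs], window)
    stage2 = DiscountedAR1(discount)
    return moving_average([stage2.score_and_update(v) for v in y], window)


# Hypothetical data: the mean jumps from 0 to 10 after 150 observations.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(150)] + [rng.gauss(10, 1) for _ in range(150)]
scores = change_scores(xs)
peak = max(range(len(scores)), key=scores.__getitem__)  # 0-based index of top score
```

In this example the highest score falls within a few observations after the jump, the lag coming from the smoothing window.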
Performance Evaluation Of Change Point Algorithms
Choosing an appropriate algorithm for change point detection is important.
Some of the useful performance metrics that we can employ to evaluate CPD algorithms are
Accuracy
Sensitivity
When the difference in time between the detected CP and the actual CP is the measure of
performance, the measures used are
MAE
MSE
MSD
RMSE
Measures Of Comparison Between Various Change Point Algorithms
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
TP = True Positive, TN = True Negative, FN = False Negative, FP = False Positive
                          Classified as Change Point    Classified as Non-change Point
True Change Point         TP                            FN
True Non-change Point     FP                            TN
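These three measures follow directly from the four confusion-matrix counts. A minimal sketch, with a hypothetical function name and made-up counts in the usage note:

```python
def detection_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and precision from confusion-matrix counts.

    Assumes at least one true change point and at least one detection,
    so that no denominator is zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fn + fp),
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }
```

For example, with TP = 8, TN = 85, FP = 2 and FN = 5, accuracy is 93/100, sensitivity is 8/13 and precision is 8/10.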
Other Methods Of Evaluation
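$$\text{Mean absolute error (MAE)} = \frac{\sum_{i=1}^{\#CP} \left| \text{Predicted CP}_i - \text{Actual CP}_i \right|}{\#CP}$$
$$\text{Mean squared error (MSE)} = \frac{\sum_{i=1}^{\#CP} \left( \text{Predicted CP}_i - \text{Actual CP}_i \right)^2}{\#CP}$$
$$\text{Mean signed difference (MSD)} = \frac{\sum_{i=1}^{\#CP} \left( \text{Predicted CP}_i - \text{Actual CP}_i \right)}{\#CP}$$
$$\text{Root mean squared error (RMSE)} = \sqrt{\frac{\sum_{i=1}^{\#CP} \left( \text{Predicted CP}_i - \text{Actual CP}_i \right)^2}{\#CP}}$$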
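A minimal sketch of these four measures, assuming each predicted change point has already been paired with the actual change point it is meant to detect (the pairing step itself is not covered here). The function name and the example values are hypothetical.

```python
import math


def location_errors(predicted, actual):
    """MAE, MSE, MSD and RMSE between paired predicted and actual change points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    n = len(diffs)
    mse = sum(d * d for d in diffs) / n
    return {
        "MAE": sum(abs(d) for d in diffs) / n,
        "MSE": mse,
        "MSD": sum(diffs) / n,   # negative: detections are early on average
        "RMSE": math.sqrt(mse),
    }
```

For predicted change points 10, 52 and 97 against actual change points 12, 50 and 100, the differences are -2, 2 and -3, giving MAE = 7/3, MSE = 17/3 and MSD = -1.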
Conclusion
Change point detection is a very important statistical technique nowadays.
It has wide application in statistical quality control, time series analysis and many other fields.
In this presentation, I presented various change point detection methods and analysed their
advantages and disadvantages.
Finding change points as soon as possible is crucial, yet the detection delay of many existing
approaches remains a problem.
Evaluating the significance of the detected change point is another important open issue for
unsupervised methods.
Although CPD algorithms have progressed significantly in the last decade, there are still many
open challenges.
References
Samaneh Aminikhanghahi and Diane J Cook. “A survey of methods for time series change point detection”. In: Knowledge and Information Systems 51.2 (2017), pp. 339–367.
PK Bhattacharya. “Some aspects of change-point analysis”. In: Lecture Notes-Monograph Series (1994), pp. 28–56.
Damien Garreau. “Change-point detection and kernel methods”. PhD thesis. PSL Research University, 2017.
Rebecca Killick et al. “Efficient detection of multiple changepoints within an oceanographic time series”. In: Proceedings of the 58th world science congress of ISI. 2011.
Jie Chen and Arjun K Gupta. “On change point detection and estimation”. In: Communications in Statistics - Simulation and Computation 30.3 (2001), pp. 665–697.
Thank You!