0% found this document useful (0 votes)
228 views5 pages

Deep Learning & Survival Analysis

The document discusses using deep learning techniques like recurrent neural networks (RNNs) and long short-term memory (LSTM) networks for time series forecasting and survival analysis. RNNs can struggle with long sequences while LSTMs address this issue using gating mechanisms and memory cells. Survival analysis focuses on estimating the time until an event and allows for censored data. Common approaches include the Kaplan-Meier estimator and Cox proportional hazards regression model.

Uploaded by

Juan
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
228 views5 pages

Deep Learning & Survival Analysis

The document discusses using deep learning techniques like recurrent neural networks (RNNs) and long short-term memory (LSTM) networks for time series forecasting and survival analysis. RNNs can struggle with long sequences while LSTMs address this issue using gating mechanisms and memory cells. Survival analysis focuses on estimating the time until an event and allows for censored data. Common approaches include the Kaplan-Meier estimator and Cox proportional hazards regression model.

Uploaded by

Juan
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

www.coursera.

org
/learn/time-series-survival-analysis/supplement/T37OF/summary-review

Summary/Review | Coursera
6-7 minutes

Deep Learning for Time Series Forecasting


Neural networks offer several benefits over traditional time series forecasting models, including:

Automatically learn how to incorporate series characteristics like trend, seasonality, and
autocorrelation into predictions.

Able to capture very complex patterns.

Can simultaneously model many related series instead of treating each separately.

Some disadvantages of using Deep Learning for Time Series Forecasting are:

Models can be complex and computationally expensive to build (GPUs can help).

Deep Learning models often overfit.

It is challenging to explain / interpret predictions made by the model (“black box”).

Tend to perform best with large training datasets.

Recurrent neural networks (RNNs) map a sequence of inputs to predicted output(s).

Most common format is “many-to-one”, that maps an input sequence to one output value.

Input at each time step sequentially updates the RNN cell’s “hidden state” (“memory”).

After processing the input sequence, the hidden state information is used to predict the
output.

RNNs often struggle to process long input sequences. It is mathematically difficult for RNNs to
capture long-term dependencies over many time steps, which is a problem for Time Series, as
sequences are often hundreds of steps. Another type of Neural Networks, Long short-term
memory networks (LSTMs) can mitigate these issues with a better memory system

Long short-term memory networks share RNNs’ conceptual structure.


LSTM cells have the same role as RNN cells in sequential processing of the input sequence.

LSTM cells are internally more complex, with gating mechanisms and two states: a hidden
state and a cell state.

Long short-term memory networks regulate information flow and memory storage.

LSTM cells share forget, input, and output gates that control how memory states are updated
and information is passed forward.

At each time step, the input and current states determine the gate computations.

LSTMs vs RNNs

LSTMs are better suited for handling long-term dependencies than RNNs. However, they are
much more complex, requiring many more trainable weights. As a result, LSTMs tend to take
longer to train (slower backpropagation) and can be more prone to overfitting.

These are some guidelines on how to choose LSTMs or RNNs in a Forecasting task:

Always consider the problem at hand:

If sequences are many time steps long, an RNN may perform poorly.

If training time is an issue, using a LSTM may be too cumbersome.

Graphics processing units (GPUs) speed up all neural network training, but are especially
recommended when training LSTMs on large datasets.

Survival Analysis
Survival Analysis focuses on estimating the length of time until an event occurs. It is called
‘survival analysis’ because it was largely developed by medical researchers interested in
estimating the expected lifetime of different cohorts. Today, these methods are applied to many
types of events in the business domain.

Examples:

How long will a customer remains on books before churning

How long until equipment needs repairs

Survival Analysis is useful when we want to measure the risk of events occurring and our data
are Censored.
This can be referred to as failure time, event time, or survival time.

If our data are complete and unbiased, standard regression methods may work.

Survival Analysis allows us to consider cases with incomplete or censored data.

The Survival Function is defined as S(t) = P (T > t) . Itmeasures the probability that a
subject will survive past time t.

This function:

Is decreasing (non-increasing) over time.

Starts at 1 for all observations when t=0

Ends at 0 for a high-enough t

The Hazard Rate is defined as:

f (t)
h(t) =
S(t)

It represents the instantaneous rate at which events occur, given that it has not occurred already.

The cumulative hazard rate (sum of h(t) from t = 0 to t = t) represents accumulated risk over
time.

The Kaplan-Meier estimator is a non-parametric estimator. It allows us to use observed data to


estimate the survival distribution. The Kaplan-Meier Curve plots the cumulative probability of
survival beyond each given time period.
Using the Kaplan-Meier Curve allows us to visually inspect differences in survival rates by
category. We can use Kaplan-Meier Curves to examine whether there appear to be differences
based on this feature.

To see whether survival rates differ based on number of services, we estimate Kaplan-Meier
curves for different groups.

Survival Analysis Approaches


The Kaplan-Meier approach provides sample averages. However, we may want to make use of
individual-level data to predict survival rates.

Some well-known Survival models for estimating Hazard Rates include these Survival
Regression approaches. These methods:

Allow us to generate estimates of total risk as a function of time

Make use of censored and uncensored observations to predict hazard rates

Allow us to estimate feature effects

Although these methods use time, these methods are not generally predicting a time to an event,
rather predicting survival risk (or hazard risk) as a function of time.

–       The Cox Proportional Hazard (CPH) model

This is one of the most common survival models. It assumes features have a constant
proportional impact on the hazard rate.

For a single non-time-varying feature X, the hazard rate h(t) is modeled as:

β1X
h(t) = β0(t)e

β0(t) is the time-varying baseline hazard, and e


β1X
is the (constant) proportional adjustment
to the baseline hazard due to X.

Using the CPH model, we can plot estimated survival curves for various categories.

–       Accelerated Failure Time (AFT) models (several variants including


the Weibull AFT model)

These models differ with respect to assumptions they make about the hazard rate function, and
the impact of features.

You might also like