Unit - 2
Introduction
The generalized linear model (GLM) unifies linear and non-linear regression models and
also allows non-normal response distributions. In a GLM, the response variable's
distribution need only be a member of the exponential family, which includes the
normal, Poisson, binomial, exponential and gamma distributions. Furthermore, the
normal-error linear model is just a special case of the GLM, so in many ways the GLM can
be thought of as a unifying approach to many aspects of empirical modelling and data
analysis.
The general linear model (the normal-error special case of the GLM) is a flexible
framework used in statistics to model the relationship between a dependent variable and
one or more independent variables. Its components typically include:
1. Dependent Variable (Y): This is the variable that you are trying to predict or explain.
In statistical notation, it is often denoted as Y.
2. Independent Variables (X): These are the explanatory variables that are
hypothesized to have an effect on the dependent variable. In statistical notation, these
are often denoted as $X_1, X_2, \dots, X_p$, where p is the number of independent variables.
3. Linear Combination: The relationship between the dependent variable and the
independent variables is expressed as a linear combination, often written as:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \epsilon$$
where $\beta_0, \beta_1, \dots, \beta_p$ are the coefficients (parameters) that represent the
effect of each independent variable, and $\epsilon$ is the error term representing the
variability in Y that is not explained by the model.
4. Error Term (ε): This accounts for the variability in the dependent variable that is not
explained by the independent variables included in the model. It is assumed to be
normally distributed with mean 0 and constant variance.
5. Assumptions: The general linear model typically assumes that the errors (ε) are
independent and normally distributed with constant variance (homoscedasticity), and
that the model is correctly specified (for example, no omitted-variable bias).
6. Estimation Method: The parameters (coefficients) are most often estimated by
ordinary least squares, which coincides with maximum likelihood estimation when the
errors are normal; other techniques are used depending on the specific context and
assumptions.
7. Hypothesis Testing: Inferences about the parameters (such as testing hypotheses
about their values) are often conducted using methods like t-tests, F-tests, or
likelihood ratio tests.
These components make up the basic structure of the General Linear Model, which is a
foundation for many statistical techniques, including linear regression, analysis of variance
(ANOVA), analysis of covariance (ANCOVA), and others.
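The components listed above can be illustrated with a minimal sketch for the one-predictor case (p = 1). It estimates the coefficients by least squares, estimates the error variance from the residuals, and forms the t-statistic for the slope. The function name `fit_simple_ols` and the five data points are made up for illustration and are not part of the course material.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of Y = b0 + b1*X + error for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Estimation: closed-form least-squares estimates.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Error term: residuals and the unbiased estimate of the error variance
    # (n - 2 degrees of freedom because two coefficients were estimated).
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    rss = sum(r ** 2 for r in residuals)
    sigma2 = rss / (n - 2)

    # Hypothesis testing: t-statistic for H0: b1 = 0.
    se_b1 = math.sqrt(sigma2 / sxx)
    t_b1 = b1 / se_b1

    return {"b0": b0, "b1": b1, "sigma2": sigma2, "se_b1": se_b1, "t_b1": t_b1}


# Hypothetical data, chosen only to make the example concrete.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = fit_simple_ols(x, y)
print(fit)
```

The t-statistic would be compared with a t distribution on n - 2 degrees of freedom; that comparison is only valid under the normality and constant-variance assumptions in item 5.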
Applications:
The GLM is widely applicable across different fields and is foundational to various statistical
techniques, including:
The binomial logit model is a specific type of generalized linear model (GLM) for
binary outcomes, where the response variable Y takes the value 0 or 1, such as
success/failure, yes/no, or presence/absence. It relates the success probability
$p = P(Y = 1)$ to the predictors through the logit link $g(p) = \log\frac{p}{1 - p}$.
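As a rough illustration of how such a model can be fitted, the sketch below runs Newton-Raphson iterations on the binomial log-likelihood for one predictor with the logit link. The function names (`logit`, `inv_logit`, `fit_logit`), the fixed iteration count and the eight data points are all assumptions made for this example; real analyses would normally rely on a statistics package.

```python
import math


def logit(p):
    """Logit link: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse link (logistic function): maps a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logit(x, y, n_iter=25):
    """Maximum-likelihood fit of logit(P(Y=1)) = b0 + b1*x by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [inv_logit(b0 + b1 * xi) for xi in x]

        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))

        # Information matrix (negative Hessian), weights p*(1-p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01

        # Newton step: beta <- beta + H^{-1} g, using the 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical binary data (not perfectly separable, so the MLE exists).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logit(x, y)
fitted = [inv_logit(b0 + b1 * xi) for xi in x]
print("intercept:", b0, "slope:", b1)
print("fitted probabilities:", [round(pi, 3) for pi in fitted])
```

At convergence the fitted probabilities sum to the number of observed successes, which is a convenient sanity check on the fit.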
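The probit model instead uses the inverse of the standard normal CDF $\Phi$ as its
link function:
$$g(p) = \Phi^{-1}(p)$$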
The probit model is used because a normal CDF with a constant scaling of its input
(which can be absorbed through equivalent scaling of all of the parameters) yields a
function that is practically identical to the logistic function, the inverse of the
logit link; in some situations, however, probit models are more tractable than logit
models.
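The closeness of the two curves can be checked numerically. The sketch below compares the logistic function with a standard normal CDF whose input is divided by 1.702; that scaling constant is a commonly quoted value and is an assumption of this example, not something stated in these notes. The helper names are likewise illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1
SCALE = 1.702              # assumed scaling constant for the comparison


def probit_link(p):
    """Probit link g(p) = Phi^{-1}(p)."""
    return STD_NORMAL.inv_cdf(p)


def logistic(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def scaled_normal_cdf(x):
    """Normal CDF with its input rescaled by the constant SCALE."""
    return STD_NORMAL.cdf(x / SCALE)


# Largest absolute gap between the two curves on a grid over [-6, 6].
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(logistic(x) - scaled_normal_cdf(x)) for x in grid)
print("largest gap on the grid:", round(max_gap, 4))
```

On this grid the gap stays on the order of 0.01, which is why logit and probit fits usually give very similar predicted probabilities while their coefficients differ by roughly the scaling factor.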