Binary Data

The document discusses the analysis of discrimination in mortgage applications using a linear regression model, focusing on how race may affect the likelihood of application denial. It highlights the importance of comparing denial rates between minority and white applicants while controlling for other factors, specifically using a binary dependent variable. The findings suggest that African-American applicants have a higher probability of denial compared to white applicants, but caution is advised as other influencing factors may exist.

ANALYSING DATA WHEN THE OUTCOME OF INTEREST IS BINARY (i)

Anna Conte – Applied Economics

Learning outcomes:
Discrimination in mortgage application; linear regression model

Preamble

• Two people, identical but for their race, walk into a bank and apply
for a mortgage, a large loan so that each can buy an identical house.
• Does the bank treat them the same way?
• Are they both equally likely to have their mortgage application
accepted?
• By law, they must receive identical treatment.
• But whether or not they do is a matter of great concern among
bank regulators.

Do banks discriminate?

• Loans are made and denied for many legitimate reasons.

• For example, if the proposed loan payments take up most or all of
the applicant’s monthly income, then a loan officer might justifiably
deny the loan.
• Also, even loan officers are human and they can make honest
mistakes, so the denial of a single minority applicant does not prove
anything about discrimination.
• Many studies of discrimination thus look for statistical evidence of
discrimination, that is, evidence contained in large data sets showing
that whites and minorities are treated differently.

How can we test for discrimination?

• How, precisely, should one check for statistical evidence of
discrimination in the mortgage market?
• A start is to compare the fraction of minority and white applicants
who were denied a mortgage. How?
• But this comparison does not really answer the question of interest,
because the black and white applicants are not necessarily “identical
but for their race”.
• Instead, we need a method for comparing rates of denial, holding
other applicant characteristics constant.

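
The raw comparison of denial fractions mentioned in the second bullet can be sketched as follows. This is only an illustration in Python, not the course's Stata code: the ten records are invented for the example and `denial_rate` is a hypothetical helper.

```python
# Hypothetical toy records: (deny, black). Not taken from the HMDA data.
applications = [
    (1, 1), (0, 1), (1, 1), (0, 1),
    (0, 0), (1, 0), (0, 0), (0, 0), (0, 0), (0, 0),
]


def denial_rate(records, black):
    """Fraction of applications denied within one group (black = 1 or 0)."""
    outcomes = [deny for deny, group in records if group == black]
    return sum(outcomes) / len(outcomes)


rate_black = denial_rate(applications, black=1)
rate_white = denial_rate(applications, black=0)
raw_gap = rate_black - rate_white  # difference in denial fractions

print(f"black: {rate_black:.3f}, white: {rate_white:.3f}, gap: {raw_gap:.3f}")
```

A gap computed this way does not hold other applicant characteristics constant, which is why the last two bullets call for a regression-based comparison.
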
How do we deal with a binary dependent variable?

• This sounds like a job for multiple regression analysis—and it is, but
with a twist.
• The twist is that the dependent variable—whether or not the
applicant is denied—is binary.
• Using binary variables as regressors does not cause particular problems.
• But when the dependent variable is binary, things are more difficult:
what does it mean to fit a line to a dependent variable that can
take on only two values, zero and one?
• The answer to this question is to interpret the regression function as
a predicted probability.
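
The last bullet rests on a standard identity for a variable that takes only the values zero and one. It is written out here in generic notation that does not appear in the slides ($Y$ for the binary outcome, $X$ for the regressor, $\beta_0$ and $\beta_1$ for the coefficients):

```latex
\begin{align*}
E(Y \mid X) &= 0 \times \Pr(Y = 0 \mid X) + 1 \times \Pr(Y = 1 \mid X)
             = \Pr(Y = 1 \mid X), \\
\Pr(Y = 1 \mid X) &= \beta_0 + \beta_1 X
  \quad \text{(when the regression function is linear)}.
\end{align*}
```

Under this reading, $\beta_1$ is the change in the probability that $Y = 1$ associated with a unit change in $X$.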

Binary Dependent Variables and the Linear Regression Model

• The application examined in this chapter is whether race is a factor
in denying a mortgage application.
• The binary dependent variable is whether or not a mortgage
application is denied.
• The data set is simulated to mimic the “Boston HMDA data”, a data
set compiled by researchers at the Federal Reserve Bank of Boston
under the Home Mortgage Disclosure Act (HMDA), which relates to
mortgage applications filed in the Boston, Massachusetts, area.

Data description

Summary statistics and description of Boston HMDA data

Variable      Description                                                   Mean
deny          1 if mortgage application denied, 0 otherwise                 0.222
PIratio       Ratio of total monthly debt payments to total monthly income  0.324
HIratio       Ratio of monthly housing expenses to total monthly income     0.257
LVratio       Ratio of size of loan to assessed value of property           0.735
SelfEmployed  1 if self-employed, 0 otherwise                               0.113
single        1 if applicant reported being single, 0 otherwise             0.390
black         1 if applicant is black, 0 if white                           0.142
HSdiploma     1 if applicant graduated from high school, 0 otherwise        0.947

Number of observations: 2,380

Relevant information

• Before granting a loan, a loan officer must forecast whether or not the
applicant can afford his or her loan payments.
• One important piece of information is, for example, the size of the
required loan payments relative to the applicant’s income (it is much
easier to make payments that are 10% of your income than 50%!)
• We therefore begin by looking at the relationship between two
variables:
• the binary dependent variable ‘deny’, which equals one if the
mortgage application was denied and equals zero if it was
accepted;
• the continuous variable ‘PIratio’, which is the ratio of the
applicant’s anticipated total monthly loan payments to his or
her monthly income.

Scatterplot of Mortgage Application Denial and the Payment-to-Income Ratio

[Figure: scatterplot of deny against PIratio.] The superimposed line
represents the prediction of the linear regression model of deny against
PIratio.

The linear regression model

• The scatterplot on its own does not show a clear pattern, because
deny takes only the values zero and one. The superimposed line,
however, does.
• The line plots the predicted value of deny as a function of the
regressor, the payment-to-income ratio, using a linear regression
model.
• For example, when PIratio = 0.3, the predicted value of deny is
about 0.2. But what, precisely, does it mean for the predicted value
of the binary variable deny to be 0.2?
• The key to answering this question is to interpret the regression as
modelling the probability that the dependent variable equals one.
• Thus, the predicted value of 0.2 is interpreted as meaning that,
when PIratio is 0.3, the probability of denial is estimated to be
about 20%. Said differently, if there were many applications with
PIratio = 0.3, then about 20% of them would be denied.

Application to the Boston HMDA data

• The OLS regression of the binary dependent variable, deny, against
the payment-to-income ratio, PIratio, estimated using all 2,380
observations in our data set is

\[
\widehat{\mathit{deny}} = \underset{(0.027)}{-0.211} + \underset{(0.078)}{1.335} \times \mathit{PIratio} \tag{1}
\]

• The estimated coefficient on PIratio is positive and statistically
significantly different from zero at the 1% level (t-statistic = 17.02).
Thus, applicants with higher debt payments as a fraction of income
are more likely to have their application denied.
• This coefficient can be used to compute the predicted change in the
probability of denial, given a change in the regressor. For example,
according to Equation (1), if the PIratio increases by 0.1, then the
probability of denial increases by 1.335 × 0.1 = 0.1335, that is, by
about 13.4 percentage points.

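
How a regression line of this kind is obtained can be sketched in a few lines of Python. This is an illustration only, not the course's Stata do-file: the six observations are invented for the example, and `ols_simple` is a hypothetical helper that applies the textbook least-squares formulas for a single regressor.

```python
def ols_simple(x, y):
    """Intercept and slope of the least-squares line of y on one regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy sample (not the HMDA data): deny is 0/1, pi_ratio is continuous.
pi_ratio = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
deny = [0, 0, 0, 1, 0, 1]

intercept, slope = ols_simple(pi_ratio, deny)

# Fitted values are read as predicted denial probabilities.
fitted = [intercept + slope * p for p in pi_ratio]

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

On this toy sample the fitted value at the lowest `pi_ratio` is negative, which already hints at the difficulty of fitting a straight line to a 0/1 outcome.
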
Predicting denial probabilities via the linear regression model

• The estimated linear regression model can be used to compute
predicted denial probabilities as a function of the PIratio.
• For example, if projected debt payments are 30% of an applicant’s
income, then the PIratio is 0.3 and the predicted value from
Equation (1) is −0.211 + 1.335 × 0.3 = 0.1895. That is, according to
this linear probability model, an applicant whose projected debt
payments are 30% of income has a probability of about 19% that their
application will be denied.
• What is the effect of race on the probability of denial, holding
constant the PIratio?
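
The prediction arithmetic above can be reproduced with a short Python sketch. The two coefficients are copied from Equation (1); the function name `predicted_denial_probability` is a hypothetical choice for this illustration.

```python
# Coefficients copied from Equation (1).
INTERCEPT = -0.211
SLOPE_PIRATIO = 1.335


def predicted_denial_probability(pi_ratio):
    """Predicted probability of denial from the linear model in Equation (1)."""
    return INTERCEPT + SLOPE_PIRATIO * pi_ratio


p_030 = predicted_denial_probability(0.3)
p_040 = predicted_denial_probability(0.4)
change = p_040 - p_030  # effect of raising PIratio by 0.1

print(f"P(deny | PIratio = 0.3) = {p_030:.4f}")
print(f"change for +0.1 in PIratio = {change:.4f}")
```

Because the model is linear, the change for a 0.1 increase in PIratio is the same whatever the starting value.
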

The effect of race on the probability of denial

• To keep things simple, we focus on differences between black and
white applicants. To estimate the effect of race, holding constant
the PIratio, we augment Equation (1) with a binary regressor that
equals one if the applicant is black and zero if white.
• The estimated linear regression model is

\[
\widehat{\mathit{deny}} = \underset{(0.027)}{-0.221} + \underset{(0.078)}{1.335} \times \mathit{PIratio} + \underset{(0.023)}{0.073} \times \mathit{black} \tag{2}
\]

• The coefficient on black indicates that an African-American
applicant has a probability of having a mortgage application denied
that is 7.3 percentage points higher than that of a white applicant,
holding constant their PIratio. This coefficient is significant at
the 1% level (t-statistic = 3.17).
• Taken literally, this estimate suggests that there might be racial bias
in mortgage decisions, but such a conclusion would be premature,
because there are many other factors that may influence the
decision.

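
As a rough check on this reading of Equation (2), the sketch below compares the predicted denial probabilities of a black and a white applicant with the same PIratio. The coefficients and the standard error are copied from Equation (2); the function name and the choice of PIratio = 0.3 are assumptions made for the illustration.

```python
# Coefficients and standard error copied from Equation (2).
INTERCEPT = -0.221
COEF_PIRATIO = 1.335
COEF_BLACK = 0.073
SE_BLACK = 0.023


def predicted_denial_probability(pi_ratio, black):
    """Predicted probability of denial from Equation (2); black is 1 or 0."""
    return INTERCEPT + COEF_PIRATIO * pi_ratio + COEF_BLACK * black


p_black = predicted_denial_probability(0.3, black=1)
p_white = predicted_denial_probability(0.3, black=0)
gap = p_black - p_white          # equals COEF_BLACK at any PIratio
t_ratio = COEF_BLACK / SE_BLACK  # coefficient divided by its standard error

print(f"black: {p_black:.4f}, white: {p_white:.4f}, gap: {gap:.3f}")
print(f"t ratio on black: {t_ratio:.2f}")
```

The gap does not depend on the PIratio chosen, since the binary regressor shifts the line up by a constant amount.
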
Shortcomings of the linear regression model

• The linearity that makes the linear regression model easy to use is
also its major flaw.
• Looking again at the figure, we see that the estimated line
representing the predicted probabilities drops below zero for very
low values of the PIratio and exceeds one for high values!
• But this is nonsense: a probability cannot be less than zero or
greater than one.
• This nonsensical feature is an inevitable consequence of the linear
regression.
• To address this problem, we use nonlinear models specifically
designed for binary dependent variables, the probit and logit
regression models.
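
Where the fitted line leaves the unit interval can be worked out directly from Equation (1). The sketch below is an illustration in Python using the two coefficients from that equation; the three PIratio values in the loop are arbitrary choices for the example.

```python
# Coefficients copied from Equation (1).
INTERCEPT = -0.211
SLOPE_PIRATIO = 1.335

# Solve INTERCEPT + SLOPE_PIRATIO * x = 0 and = 1 for x.
zero_crossing = (0 - INTERCEPT) / SLOPE_PIRATIO
one_crossing = (1 - INTERCEPT) / SLOPE_PIRATIO

print(f"prediction < 0 when PIratio < {zero_crossing:.3f}")
print(f"prediction > 1 when PIratio > {one_crossing:.3f}")

for pi_ratio in (0.05, 0.5, 1.0):
    p = INTERCEPT + SLOPE_PIRATIO * pi_ratio
    inside = 0.0 <= p <= 1.0
    print(f"PIratio = {pi_ratio:.2f}: predicted = {p:.3f}, valid probability: {inside}")
```

With these coefficients the predicted value is negative for a PIratio below roughly 0.16 and exceeds one for a PIratio above roughly 0.91.
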

Associated files

• Data sets:
• “HDMA.dta”
• Do files:
• “STATA200303.do”
