
Business Statistics for Decision Making

This document discusses using business statistics for decision making by identifying target customers, collecting and analyzing data, and drawing conclusions. It then covers key statistical concepts like types of variables, frequency distributions, measures of central tendency and variation, probability, and methods for analyzing relationships between variables. Categorical variables can be analyzed using contingency tables and chi-square tests while quantitative variables use scatter plots and correlation. Lurking variables and Simpson's Paradox are noted as potential issues.

Uploaded by

Martin Lo
Copyright
© All Rights Reserved


Using business statistics for decision making:

1.) Identify the target customer
2.) Collect data
3.) Analyze the data
4.) Draw a conclusion

Census: the process of collecting measurements from every unit in the population


Sampling method:
Sample with replacement: the unit is placed back before the next draw
Sample without replacement: the unit is not placed back
Random sample:
Every unit has the same chance of being selected
Finite population: fixed size
Infinite population: unlimited size
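
The two sampling schemes above can be sketched in Python. This is only an illustration: the function names and the `seed` parameter are assumptions made for the example, not part of the notes.

```python
import random

def sample_with_replacement(units, n, seed=None):
    """Draw n units; each drawn unit is placed back, so it can be drawn again."""
    rng = random.Random(seed)
    return [rng.choice(units) for _ in range(n)]

def sample_without_replacement(units, n, seed=None):
    """Draw n distinct units; a drawn unit is not placed back."""
    rng = random.Random(seed)
    return rng.sample(units, n)

customers = ["A", "B", "C", "D", "E"]  # hypothetical population
with_repl = sample_with_replacement(customers, 8, seed=1)      # may repeat units
without_repl = sample_without_replacement(customers, 3, seed=1)  # never repeats
```

With replacement, the sample can be larger than the population; without replacement, it cannot.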

1A
Types of Variable
1.) Categorical/ Qualitative
--cannot be expressed as quantities --charts have gaps between categories
--Bar Chart, Pie Chart
E.g. what kind of dessert you like? Ice cream, mango pudding
a.) Nominal
--not ordered and no rank --Identifier or name
--Example: gender, car type
b.) Ordinal
--the order matters but not the difference between values
--Rank-order categories --Ranks are relative to each other
--Example: Low (1), moderate (2) or high (3) risk
2.) Numeric/Quantitative
--Can be expressed with numbers (quantity) --charts have NO gaps (values form a continuous scale)
--Histogram, Polygon, Scatter Plot
E.g. how many instant noodles the supermarket sold yesterday? 100, 200, 300
a.) Interval
--difference between two values is meaningful
--On a numerical scale with an arbitrary zero point / origin --Can add or subtract
--Example: Temperature (80 degrees F is not twice as warm as 40 degrees F)
--Multiplication and division are not meaningful: 180 - 90 = 90 is a valid difference, but 180 degrees is not "twice" 90 degrees
b.) Ratio
--Measurements are on a numerical scale with a meaningful zero point
--Zero means “none” or “nothing”
--Values can be compared in terms of their interval and ratio
--Example: $30 is $20 more than $10, and also three times as much
--In business and finance, most quantitative variables are ratio variables
--Examples: Earnings, profit, loss, age, distance, height, weight
Frequency Distribution:
--A grouping of data into different categories
--Steps:
1.) decide on the number of classes
2.) Determine the class interval or width
3.) Set the individual class boundaries
4.) Count the number in each class
Relative Frequency:
--the proportion of items in each class
--Divide the frequency of that class by the total number of observations

Percent Frequency:
--Multiply relative frequency by 100
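
As a minimal sketch of the three kinds of frequency, the snippet below counts a small list of answers. The dessert data and the name `frequency_table` are made up for illustration.

```python
from collections import Counter

def frequency_table(observations):
    """Return {category: (frequency, relative frequency, percent frequency)}."""
    counts = Counter(observations)
    total = len(observations)
    table = {}
    for category, freq in counts.items():
        rel = freq / total            # proportion of items in the class
        table[category] = (freq, rel, rel * 100)
    return table

# Hypothetical survey answers
desserts = ["ice cream", "mango pudding", "ice cream", "ice cream"]
table = frequency_table(desserts)
# table["ice cream"] is (3, 0.75, 75.0)
```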
Bar Chart:
--a vertical or horizontal rectangle represents the frequency for each category
--can be frequency, relative frequency or percent frequency
Pie Chart:
--a circle where the size of each slice represents the relative or percent frequency
Frequency distribution table:
1.) Arrange raw data in ascending order
2.) Find the range
3.) Select the number of classes
4.) Compute the class interval (range / number of classes)
5.) Determine the class boundaries
6.) Compute class midpoint
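
The six steps above can be sketched as follows. This is an illustration under stated assumptions: classes have equal width, the range is greater than zero, each class includes its lower boundary, and the last class also includes the maximum. The name `class_table` is hypothetical.

```python
def class_table(data, num_classes):
    """Return a list of (lower, upper, midpoint, count) rows, one per class."""
    values = sorted(data)                      # step 1: ascending order
    low, high = values[0], values[-1]
    width = (high - low) / num_classes         # steps 2-4: range / number of classes
    rows = []
    for i in range(num_classes):
        lower = low + i * width                # step 5: class boundaries
        upper = low + (i + 1) * width
        if i == num_classes - 1:
            # last class is closed on the right so the maximum is counted
            count = sum(1 for v in values if v >= lower)
        else:
            count = sum(1 for v in values if lower <= v < upper)
        rows.append((lower, upper, (lower + upper) / 2, count))  # step 6: midpoint
    return rows

rows = class_table([1, 2, 3, 4, 5, 6, 7, 8, 9, 11], num_classes=2)
# rows is [(1.0, 6.0, 3.5, 5), (6.0, 11.0, 8.5, 5)]
```

The class counts are also the bar heights of a frequency histogram over the same classes.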
Histogram:
--It can show:
--the highest and the lowest
--the distribution of the graph
--the concentration of the graph (the highest 2 bars)
--the sample size
--No gaps
--The larger the number of bins, the less obvious the shape of the graph (the counts are spread more evenly across the bins)
Draw a Histogram:
--Choose the classes
--Choose the origin
--Choose the bin width (which determines the number of bins)
--Count the number of observations in each bin
--the height of each rectangle = frequency, relative frequency or percent frequency
Frequency Polygon:
--Plot a point above each class midpoint at a height equal to the frequency, then connect the points
--Useful to compare two or more distributions
Scatter Plots:
--Study relationship between two variables (one on x-axis, one on y-axis)
--plot graph(1a34)
The Normal Curve: (1a35)
--bell-shaped --Symmetric
--Skewness:
--not symmetrical about the centre
--left/right skewed
1B
Population parameter: a number calculated from population measurements that describes some aspect of the population
Sample statistic: a number calculated from sample measurements that describes some aspect of the sample

Measures of Central Tendency:


-mean: will be affected by extreme values
-median: the middle value (50th percentile) of the ordered measurements, not affected by extreme values
-mode:
--the value that occurs most frequently
--not affected by extreme values
--Two modes = bimodal --More than 2 modes = multimodal --{0,0,1,1,2,2} and {0,1,2,3,4} = no mode
Sample mean is a point estimate of population mean
Two distributions with the same mean, median and mode are not necessarily the same;
their shapes (for example, their spread) may still differ
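
A small example of how an extreme value moves the mean but not the median or mode, using Python's standard `statistics` module. The salary figures are invented for illustration. Note that `statistics.multimode` returns every value tied for most frequent, so for data such as {0,1,2,3,4} it returns all values, which these notes treat as "no mode".

```python
import statistics

# Hypothetical salaries (in thousands); 400 is an extreme value
salaries = [30, 32, 35, 35, 38, 40, 400]

mean = statistics.mean(salaries)         # pulled upward by 400
median = statistics.median(salaries)     # 35, the middle ordered value
modes = statistics.multimode(salaries)   # [35], the most frequent value

# Dropping the extreme value changes the mean a lot, the median not at all
trimmed = salaries[:-1]
mean_trimmed = statistics.mean(trimmed)      # 35
median_trimmed = statistics.median(trimmed)  # 35
```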

Measures of Variation:
-Range: largest - smallest; sensitive to extreme values
-Interquartile range: 3rd quartile - 1st quartile; not affected by outliers
-SD and Variance:
-Values far from the mean are given more weight (deviations are squared)
-Less sensitive to extreme values than the range, more sensitive than the interquartile range
-Variance:
-in squared units of the data (the SD, its square root, is in the same units as the data)
-Formulas
-Average of the squared deviations of individual measurements from the mean
--Population Variance
--Sample Variance
-Standard Deviation:
-The larger the SD, the more spread-out the data set
-Used to measure the risk of holding a security
-Coefficient of Variation (CV)
-measures the size of the SD relative to the size of the mean
- (SD/Mean) x100%
-compares the relative variability of values about the mean
-can compare two or more sets of data measured in different units
-Measures risk
-The smaller the CV, the less risk per unit of return, and the better the investment
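
The measures of variation above can be computed with Python's `statistics` module. This sketch assumes the usual textbook definitions: population variance divides the sum of squared deviations by N, and sample variance divides by n - 1. The return figures are invented. Quartile conventions differ between textbooks and software, so the interquartile range from `statistics.quantiles` may not match a hand calculation exactly.

```python
import statistics

# Hypothetical yearly returns (%) of a security
returns = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

data_range = max(returns) - min(returns)       # 7.0
mean = statistics.mean(returns)                # 5.0
pop_var = statistics.pvariance(returns)        # divides by N      -> 4.0
sample_var = statistics.variance(returns)      # divides by n - 1  -> 32/7
pop_sd = statistics.pstdev(returns)            # square root of pop_var -> 2.0
cv = pop_sd / mean * 100                       # coefficient of variation, in %

q1, q2, q3 = statistics.quantiles(returns, n=4)  # quartile cut points
iqr = q3 - q1
```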

Measure of relative location:


Z-scores (Standard Score)
--indicate the relative location of a value within a population
--z = (value - mean) / SD; the standardized values have mean = 0 and SD = 1
--Can be used to detect outliers (z-score above 3 or below -3)
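
A minimal sketch of z-scores and the outlier rule above. It assumes the population SD is used; the function names and the data are hypothetical.

```python
import statistics

def z_scores(values):
    """Standardize each value: (value - mean) / SD, using the population SD."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def flag_outliers(values, threshold=3.0):
    """Return the values whose z-score is above threshold or below -threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]

scores = z_scores([2, 4, 4, 4, 5, 5, 7, 9])   # mean 5, SD 2
# scores[0] is -1.5 and scores[-1] is 2.0

outliers = flag_outliers([0] * 19 + [100])    # only 100 is flagged
```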
Boxplot:
--shows the median, quartiles, outliers and skewness of the distribution
--when judging skewness, ignore the outliers (the dots plotted separately from the box and whiskers)
When comparing groups:
--look for patterns, difference and trends
--a histogram can be used,
but a boxplot is better because it makes side-by-side comparison easier
---Compare the median, the interquartile range (the variability) and the symmetry (shape)

1C
Association between quantitative variables:
Scatter Plot:
--Describe association
--1.) Direction: the trend
--2.) Curvature: linear or curved
--3.) Variation: Points tightly clustered?
--4.) Outliers

Measure of Association:
Covariance:
-a measure that quantifies the linear association
-formula (the total area)
-If >0: points lie mostly in quadrants I and III (positive linear association)
-If <0: points lie mostly in quadrants II and IV (negative linear association)

Correlation (r):
-standardized measure of the strength of linear relationship
-formula
-no unit
-between -1 and 1
- 1 = perfect positive correlation
- 0 = no linear correlation
- -1 = perfect negative correlation
- If r>0, positive trend
If r<0, negative trend
- If |r| > 0.75, strong relationship
If |r| < 0.25, weak relationship
If 0.25 < |r| < 0.75, moderate relationship
-corr(x,y)=corr(y,x)
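
The notes leave the covariance and correlation formulas blank. The sketch below assumes the standard sample definitions: covariance is the sum of products of deviations divided by n - 1, and r is the covariance divided by the product of the two sample SDs. The `strength` helper applies the 0.25 / 0.75 cut-offs from the notes and treats the boundary values as moderate, which is an assumption.

```python
import math

def covariance(x, y):
    """Sample covariance: sum of (x - mean_x)(y - mean_y) over n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Correlation r: covariance scaled by both SDs, so it has no unit."""
    sd_x = math.sqrt(covariance(x, x))
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

def strength(r):
    """Label |r| using the cut-offs in the notes."""
    size = abs(r)
    if size > 0.75:
        return "strong"
    if size < 0.25:
        return "weak"
    return "moderate"

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]          # hypothetical data lying on a straight line
r = correlation(x, y)         # approximately 1
```

Swapping the arguments gives the same result, matching corr(x,y)=corr(y,x).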

How to find association:


-Plot a scatter plot
-Find the relationship
-Find the correlation to find the strength of relationship

Lurking variables:
-a variable that is not included in the analysis but affects the apparent relationship between two
other variables

Correlation matrix:
-a table showing all correlations among a set of numeric variables
Association between categorical variables:

Contingency Table:
-marginal distribution: the row and column totals (subtotals) for each variable
-conditional distribution: the counts within a single row or column
--If associated,
--the column percentages will vary from column to column
--the row percentages will vary from row to row

Then construct the artificial table (the counts expected if there were no association)

Chi-square Statistic:
--formula

Strength of association:
Cramer’s V:
-formula
-Ranges in value from 0 to 1
-no unit
-If V>0.75, strong association
-If V<0.25, weak association
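
The formulas for the chi-square statistic and Cramér's V are not written out in the notes. The sketch below assumes the standard definitions: each expected count is (row total x column total) / n, chi-square is the sum of (observed - expected)^2 / expected, and V = sqrt(chi-square / (n x min(rows - 1, columns - 1))). The tables are invented for illustration.

```python
import math

def expected_counts(table):
    """Artificial table with no association: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def cramers_v(table):
    """Cramer's V: chi-square rescaled to lie between 0 and 1."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square(table) / (n * k))

no_association = [[10, 20], [20, 40]]   # rows are proportional
perfect = [[50, 0], [0, 50]]            # each row maps to one column

v_none = cramers_v(no_association)      # 0.0
v_full = cramers_v(perfect)             # 1.0
```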

Simpson’s Paradox:
-

2A
Concept of Probability:
Experiment: the observation of some activity
Outcome: Result of experiment
Sample space: the set of all possible experimental outcomes
Event: the collection of one or more outcomes of an experiment

Venn diagrams: graph showing the relationship among events


Union: A or B
Intersection: A and B
Mutually Exclusive Events:
-Do not overlap
-P(A and B)=0
-P(A or B)=P(A)+P(B)

Independent Events:
-P(A and B)=P(A)P(B)
Use this rule to check whether two events are independent
Probability Tree (Tree Diagram)

Collectively exhaustive events:

--A set of events of which at least one must occur
P(A or B or C or D)=1

Joint probability: the probability that two events both occur, e.g. P(Yes and MSN)

Marginal probability: the probability of a single event, read from the margin (subtotal), e.g. P(MSN)

Conditional Probability:
-the probability of A, given that B has occurred
-P(A|B)=P(A and B)/P(B)
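
Joint, marginal and conditional probabilities can all be read from one contingency table. The counts below are hypothetical, as is the "Other" column; only the labels "Yes" and "MSN" come from the notes. `Fraction` is used so the results are exact.

```python
from fractions import Fraction

# Hypothetical counts: first label = answer, second label = messenger used
counts = {
    ("Yes", "MSN"): 30, ("Yes", "Other"): 20,
    ("No", "MSN"): 10, ("No", "Other"): 40,
}
total = sum(counts.values())

def joint(answer, app):
    """P(answer and app): one cell divided by the grand total."""
    return Fraction(counts[(answer, app)], total)

def marginal_app(app):
    """P(app): a column subtotal divided by the grand total."""
    return sum(joint(a, app) for a in ("Yes", "No"))

def marginal_answer(answer):
    """P(answer): a row subtotal divided by the grand total."""
    return sum(joint(answer, b) for b in ("MSN", "Other"))

def conditional(answer, app):
    """P(answer | app) = P(answer and app) / P(app)."""
    return joint(answer, app) / marginal_app(app)

p_joint = joint("Yes", "MSN")                  # 3/10
p_msn = marginal_app("MSN")                    # 2/5
p_yes_given_msn = conditional("Yes", "MSN")    # 3/4

# Independent only if P(A and B) = P(A)P(B); here 3/10 != 1/2 * 2/5
independent = p_joint == marginal_answer("Yes") * p_msn
```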

Complement Rule:
-let A^c be "not A" (the complement of A)
-A and A^c are mutually exclusive and collectively exhaustive
-P(A)=1-P(A^c)

Addition rule:
P(AUB)=P(A)+P(B)-P(A and B) if A and B can occur together (general rule)
P(AUB)=P(A)+P(B) if A and B are mutually exclusive
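
The complement and addition rules can be checked on a small sample space. The example assumes one roll of a fair six-sided die, with events written as Python sets; the event definitions are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed)

def prob(event):
    """P(event) when every outcome is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}        # even number
B = {4, 5, 6}        # more than 3
C = {1}              # mutually exclusive with A

# General addition rule: union = A | B, intersection = A & B
p_union = prob(A | B)
rule_general = prob(A) + prob(B) - prob(A & B)

# Mutually exclusive events: the intersection is empty, so P(A and C) = 0
rule_exclusive = prob(A) + prob(C)

# Complement rule: P(A) = 1 - P(not A)
not_A = sample_space - A
```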

If several probabilities are given, for example:

P(A | B)
P(A | B')
P(B)
then the reverse conditional probability P(B | A) can be found by:
Method 1: contingency table
Method 2: Bayes' Rule (formula)
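
The notes do not write out the Bayes' Rule formula. The sketch below assumes the standard two-event form, P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B')P(B')], and takes the three given probabilities as inputs. The function name and the numbers are hypothetical.

```python
from fractions import Fraction

def bayes(p_a_given_b, p_a_given_not_b, p_b):
    """Return P(B | A) from P(A | B), P(A | B') and P(B)."""
    p_not_b = 1 - p_b                                        # complement rule
    p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b      # total P(A)
    return p_a_given_b * p_b / p_a

# Hypothetical inputs: P(A|B) = 9/10, P(A|B') = 1/10, P(B) = 1/10
posterior = bayes(Fraction(9, 10), Fraction(1, 10), Fraction(1, 10))
# posterior is 1/2
```

A contingency table built from the same three inputs (Method 1) gives the same answer.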
