
PAPER

STATISTIC ECONOMICS II
“STATISTICAL DATA”
Under Guidance:
Dr. Ir. Syahrial Shaddiq, M.Eng., M.M.

Compiled By:
Akhmad Faisal Rifki (2210312310019)
Muhammad Albani Andika Putra (2210312310004)

INTERNATIONAL CLASS
BACHELOR’S DEGREE PROGRAM IN MANAGEMENT
FACULTY OF ECONOMICS AND BUSINESS
LAMBUNG MANGKURAT UNIVERSITY
BANJARMASIN
2024
INTRODUCTION
a. Background
Statistical data is an essential component in a wide variety of fields and practical
applications, ranging from the formulation of public policy to decision-making in
the commercial world. Because enormous amounts of data are generated every day,
the capacity to collect, process, and analyze that data is an extremely valuable
skill in today's information age. Statistical data provides a firm foundation for
accurate analysis and dependable predictions, which in turn enables individuals and
organizations to make decisions that are more precise and evidence-based.
In the context of business, statistical data helps organizations understand
consumer behavior, optimize operations, and improve marketing tactics. Using
statistical analysis, companies can spot trends in the market, monitor the
performance of their products, and evaluate the efficacy of their promotional
activities. In addition, statistical data is an essential component of risk
management and the development of new products.
The public sector makes use of statistical data in order to develop
policies that are more efficient and responsive to the needs of the public. For
instance, information pertaining to public health can be of assistance to the
government in the process of establishing healthcare intervention programs that
are specifically targeted, while demographic and economic statistics are utilized
for the construction of infrastructure and the planning of public services. A
significant contribution that statistical data makes to the field of education is that
it makes it easier to evaluate educational programs and to enhance the quality of
instruction.
In addition, a comprehensive understanding of statistical data is not exclusive to
experts working in the field of statistics. In academia, research in a variety of
subjects, such as engineering, the natural sciences, and the social sciences, relies
significantly on statistical data analysis to draw conclusions that are both useful
and accurate. Knowledge of statistical methods is also an essential qualification in
many other fields, particularly those that rely on statistically derived information
to improve the efficiency and efficacy of their decision-making processes.
As a result, a strong command of statistical data and the capacity to use it
effectively become essential components in generating success and innovation across
a variety of sectors. Proper education and training in statistics are therefore
essential to prepare individuals and businesses for the increasingly complex and
data-driven challenges and opportunities of the future.

b. Problem
The problem discussed in this paper relates to the background of
statistical data, as follows:
1. Which basic statistical concepts are necessary to comprehend when doing
data analysis?
2. Which techniques work best for gathering data for statistical analysis?
3. How should statistical data be presented to allow for the easiest possible
comprehension and interpretation?
4. How can data centering in a dataset be assessed using measures such as the mean,
median, and mode?
5. Which techniques are suitable for examining data distributions and finding
patterns within them?
6. How can the data distribution be measured to obtain precise information about
data variability?
7. How can the relationship between variables in a dataset be understood using
correlation and regression analysis techniques?
8. How is the procedure for testing hypotheses carried out to guarantee the
reliability of statistical results?

c. Objectives
The study objectives are derived from the problems formulated above, as follows:
1. The objective is to comprehend and elucidate the essential elements of
statistics that are necessary for data analysis.
2. The objective is to identify and assess the most efficient techniques for
gathering data for statistical analysis.
3. To ascertain the optimal method of presenting statistical data in a manner
that facilitates comprehension and interpretation.
4. To assess the centering of data in a dataset using measures such as the mean,
median, and mode, in order to obtain a precise representation of the data.
5. To examine the dispersion of data and detect prevailing distribution trends.
6. The purpose is to assess the data's distribution and acquire precise insights
into its variability.
7. The objective is to utilize correlation and regression analysis techniques to
gain insight into the relationship between variables in a given dataset.
8. To conduct hypothesis testing and assess the reliability of statistical results.
RESULT AND DISCUSSION
Statistical data is information used in a wide variety of fields, from research to
practical applications, and it serves as a foundation for analysis that is both
comprehensive and fact-based. The fundamental aspects of statistics include an
understanding of key concepts such as population, sample, variable, and parameter, all
of which are essential for building a foundation for high-quality data analysis.
Data can be gathered through surveys, experiments, or even unobtrusive observation,
and each method has its own advantages and disadvantages that must be weighed in the
context of the specific research being conducted. The presentation of data, whether in
the form of a table, a graph, or a diagram, serves to facilitate interpretation and
makes it easier to identify patterns and trends.
A. Fundamental Aspects of Statistics
Statistics is a branch of mathematics that involves the organized procedures of
collecting, analyzing, interpreting, and presenting data (Moore et al., 2017). It offers
an extensive framework of tools and concepts specifically designed to reveal and
comprehend patterns and trends that arise from data. Data can be classified into two
primary categories: qualitative, which refers to descriptive information, and
quantitative, which involves numerical data. These categories can be further divided
based on measurement scales such as nominal, ordinal, interval, or ratio (Field,
2013).
The importance of these categories lies in their impact on the kinds of analyses
that can be conducted. For example, nominal data, which categorizes data into
various categories without any inherent order, is valuable for identifying and
differentiating between groups or types. Conversely, ordinal data establishes a
hierarchy across categories, allowing for the identification of relative positions
without indicating the exact variations in magnitude between them. Interval and
ratio data, which support complex mathematical operations like addition,
subtraction, multiplication, and division, facilitate advanced statistical analysis that
can uncover profound insights into data trends and correlations.
Central tendency metrics, such as the mean, median, and mode, are essential
tools for describing data distributions (Salkind, 2017). The mean, which is obtained
by calculating the arithmetic average of a set of values, serves as a measure of the
central tendency of a distribution. However, it can be influenced by outliers. The
median, the middle value in a ranked data set, provides a robust measure of central
tendency that is less influenced by extreme values. The mode,
which represents the value that occurs most frequently, provides information about
the most prevalent category or value in a dataset.
The selection of the suitable measure of central tendency relies on the distinct
attributes of the data distribution and the goals of the research. For instance, the
mean is commonly employed for data that conforms to a normal distribution, where
values are evenly dispersed about the central point in a symmetrical manner.
Conversely, the median is the recommended measure for distributions that are
skewed, meaning the data may have a disproportionately lengthy tail on one side.
The mode is especially valuable for categorical data as it offers a clear indication of
the category that appears most frequently.
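
To make these three measures concrete, the short Python sketch below computes the
mean, median, and mode of a small, invented set of exam scores using the standard
library's statistics module; the data are purely illustrative.

import statistics

# Invented exam scores used only to illustrate the three measures
scores = [70, 72, 75, 75, 78, 80, 82, 85, 88, 95]

mean_score = statistics.mean(scores)      # arithmetic average of all values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequently occurring value

print(f"mean = {mean_score}, median = {median_score}, mode = {mode_score}")

For this sample the mean is 80, the median 79, and the mode 75, showing how the
three measures can differ even for the same data.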

B. Methods of Collecting Information


Common data collection methods encompass surveys, experiments, and
observations (Fowler, 2013). Each of these methods possesses unique attributes and
uses that render it appropriate for various types of research inquiries and goals.
Surveys are a flexible and extensively employed technique for gathering data
from a subset of the population. They can be administered in several ways, including
interviews, questionnaires, or direct observation. Surveys are designed to collect
data on the attitudes, behaviors, or attributes of respondents. They can be organized
using closed-ended questions for quantitative analysis or open-ended questions for
qualitative insights. Online surveys and in-person interviews are two often used
methods of survey administration (Couper et al., 2015). Online surveys have
benefits in terms of rapidity, cost-efficiency, and the capacity to reach a wide-
ranging audience. Nevertheless, online surveys may suffer from lower response rates
and potential biases resulting from participants' self-selection. On the other
hand, face-to-face interviews, albeit requiring more resources, can provide more
comprehensive and detailed information due to the human interaction and the
opportunity to delve further into responses.
Experiments entail the deliberate alteration of independent factors in order to
observe and analyze their impact on dependent variables. This strategy is especially
effective for identifying causal links between variables. Researchers can deduce
cause-and-effect dynamics by controlling extraneous variables and deliberately
altering the independent variable in a methodical manner. Experiments can be
carried out either in controlled laboratory settings or in natural field environments,
each of which provides unique benefits. Laboratory experiments offer a high level
of control over variables and situations, which improves internal validity. On the
other hand, field experiments promote external validity by studying phenomena in
natural settings.
Observation, as a method of data collection, involves directly monitoring people's
behavior without any intervention in order to gather information. This
approach is highly beneficial for recording authentic behaviors and settings, offering
insights that may be overlooked by more invasive methods. Observational studies
can be categorized as either structured or unstructured. Structured observations
adhere to a predetermined protocol, whereas unstructured observations are more
adaptable and open-ended. The primary advantage of observational methods is their
capacity to offer an authentic representation of participants' behavior in real-life
scenarios.
It is crucial to guarantee the accuracy and consistency of data when collecting it
(Bryman, 2016). Validity pertains to the degree to which a measurement device
precisely measures the specified target. It includes several types, such as content
validity, construct validity, and criterion-related validity. Content validity ensures
that the instrument comprehensively encompasses all pertinent components of the
topic being assessed. Construct validity evaluates the extent to which an instrument
accurately measures the intended theoretical construct. Criterion-related validity
assesses the degree of association between the instrument being evaluated and an
external criterion.
Reliability refers to the enduring and consistent nature of the measurement
equipment over a period of time. A dependable device produces consistent outcomes
when used in stable conditions. Reliability can be evaluated using techniques such
as test-retest reliability, inter-rater reliability, and internal consistency. Test-retest
reliability assesses the consistency of the instrument over time by administering the
identical test to the same subjects at different time intervals. Inter-rater reliability
assesses the degree of consistency in measurements made by various observers.
Internal consistency evaluates the degree to which the items in a test measure the
identical construct.
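
One common way to quantify internal consistency is Cronbach's alpha. The sketch
below is a minimal illustration, assuming NumPy is installed; the cronbach_alpha
helper and the matrix of Likert-scale responses are invented for this example only.

import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-point Likert responses: 6 respondents x 4 questionnaire items
responses = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 4, 3, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.3f}")

Values closer to 1 indicate that the items measure the same underlying construct
more consistently.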

C. The Presentation of Data


Tables, bar charts, histograms, and pie charts are commonly acknowledged and
employed techniques for presenting data (Everitt & Hothorn, 2011). Each of these
visualization techniques has distinct functions and provides distinct advantages in
efficiently delivering information to the audience.
Tables are very adaptable instruments for arranging and displaying data in a
methodical and organized fashion. They are especially valuable when there is a
requirement to provide accurate numerical values or when comparisons between
different categories or groups are necessary. Tables offer a thorough and inclusive
representation of data, enabling users to effortlessly find specific information and
recognize patterns or trends.
Bar charts and pie charts are frequently used to represent categorical data, which
involves organizing data points into separate categories or groupings (Kabacoff,
2015). Bar charts visually display data by utilizing rectangular bars of different
lengths. Each bar represents a specific category, and its height is proportional to
the frequency or magnitude of the corresponding data point. Bar charts are a useful tool
for comparing the relative magnitudes or occurrences of several categories, making
them well-suited for representing categorical variables with distinct values. In
contrast, pie charts visually display data in a circular pattern that is divided into
slices, where each slice corresponds to a specific proportion or percentage of the
entire dataset. Pie charts are highly effective for visually representing the allocation
of data across various categories and emphasizing the comparative ratios of each
category within the overall dataset. They offer a distinct visual depiction of how
each individual component contributes to the overall structure of the dataset.
Histograms are specifically designed to represent numerical data and display the
frequency distribution of a continuous variable, distinguishing them from bar and
pie charts (Kabacoff, 2015). Histograms are visual representations of data, where
bars are used to represent different ranges or intervals of values. The height of each
bar corresponds to the frequency or count of data points that occur within that
particular range. Histograms are particularly useful for elucidating the distribution,
central tendency, and dispersion of a dataset, making them indispensable tools for
exploratory data analysis and hypothesis testing.
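
As a simple illustration of these presentation formats, the sketch below assumes
Matplotlib is installed and draws a bar chart for invented categorical sales data
alongside a histogram of simulated customer ages; all numbers are illustrative.

import matplotlib.pyplot as plt
import random

# Hypothetical categorical data: product sales by region (invented numbers)
regions = ["North", "South", "East", "West"]
sales = [120, 95, 140, 80]

# Hypothetical numerical data: 200 simulated customer ages
random.seed(42)
ages = [random.gauss(35, 10) for _ in range(200)]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.bar(regions, sales)        # bar chart for categorical data
ax1.set_title("Sales by region (bar chart)")
ax1.set_ylabel("Units sold")

ax2.hist(ages, bins=15)        # histogram for a continuous variable
ax2.set_title("Customer ages (histogram)")
ax2.set_xlabel("Age")
ax2.set_ylabel("Frequency")

plt.tight_layout()
plt.show()

The bar chart compares discrete categories, while the histogram reveals the shape
and spread of a continuous variable.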

D. Evaluation of the Data Centering


Rumsey (2017) states that central tendency measures, such as the mean, median,
and mode, are essential tools for summarizing and comprehending the distribution
of data. The mean, also known as the arithmetic average, is determined by adding up
all the values in the dataset and then dividing by the total number of observations. It
offers a measure of the central tendency that is affected by each value in the dataset.
Nevertheless, the mean is highly susceptible to extreme values or outliers, which can
greatly distort its value and impact its representativeness of the entire sample.
Conversely, the median is the central value of a dataset when it is organized in
ascending or descending order. In contrast to the mean, the median is less influenced
by outliers, making it a resilient measure of central tendency, particularly in datasets
with skewed distributions or extreme values. The median, by determining the
middle value, provides a more distinct portrayal of the typical value in the sample,
unaltered by exceptional observations. The mode, however, reflects the value that
appears most frequently in the dataset. It is especially beneficial for categorical data
or datasets where determining the most prevalent value is of significance. The mode
offers a deeper understanding of the most common category or value in the
collection, emphasizing noteworthy patterns or trends.
The mean, median, and mode provide information about the central tendency of
a dataset. However, their appropriateness depends on the distribution of the data and
the existence of outliers. The mean is useful for datasets that follow a normal
distribution and have few outliers, but it should be used with caution when outliers
are present. The median is the preferable measure of central tendency in
distributions that are skewed or when outliers are a problem, as it provides a more
robust estimation. Meanwhile, the mode is advantageous for categorical data or
when the main goal is to determine the most commonly occurring value.
Researchers can make informed selections regarding the best suitable central
tendency measure for their investigation by comprehending the strengths and limits
of each measure (Navidi, 2015).
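
A tiny numerical illustration, using invented income figures, makes this contrast
concrete: a single extreme value drags the mean far above the bulk of the data,
while the median and mode remain close to the typical observation.

import statistics

# Invented monthly incomes (in millions of rupiah) with one extreme outlier
incomes = [4, 5, 5, 5, 6, 7, 7, 8, 60]

print("mean  :", statistics.mean(incomes))    # pulled upward by the single outlier
print("median:", statistics.median(incomes))  # barely affected by the outlier
print("mode  :", statistics.mode(incomes))    # most frequent value

Here the mean is roughly 11.9, while the median (6) and mode (5) better describe a
typical income in this sample.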

E. Evaluation of Data Distribution Analysis


As stated by McClave et al. (2019), the range is a fundamental measure of
variability that quantifies the difference between the highest and lowest values in a
dataset. It provides a clear indicator of the spread of values within the dataset.
Although the range is easy to calculate, it provides only a restricted comprehension
of the dataset's spread and does not consider the distribution of values inside the
range. However, it functions as a preliminary measure in evaluating the dataset's
variability, providing a comprehensive picture of its scope.
Going beyond the range, statistical metrics such as the variance and standard
deviation offer a more detailed understanding of how data are distributed around the
average value. The variance, as defined by Triola (2018), measures the average squared
deviation of each data point from the mean. By squaring the deviations, the variance
emphasizes larger departures from the mean and prevents positive and negative
deviations from cancelling out.
Nevertheless, variance is measured in units that have been squared, which makes it
more difficult to immediately evaluate in relation to the original data.
According to Gravetter and Wallnau (2013), the standard deviation, calculated by
taking the square root of the variance, provides a more understandable measure of
variability. Because it is the square root of the variance, the standard deviation is
expressed in the same units as the original data, making it more convenient to compare
and interpret. The metric provides an indication of the average distance
between data points and the mean, giving a brief assessment of how spread out the
dataset is while still being easy to understand.
Furthermore, the usefulness of the standard deviation goes beyond its ease of
interpretation. According to Gravetter and Wallnau (2013), its widespread use in
statistical analysis stems from its robust and dependable ability to represent the
spread of the data. The standard deviation is a useful measure of variability because
it takes into account the distance of every data point from the mean, and it is
widely used when comparing the variability of different datasets.
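
A brief sketch with invented daily sales figures shows how these three measures of
spread relate to one another; only the Python standard library is used.

import statistics

# Invented daily sales counts used only for illustration
daily_sales = [23, 27, 25, 30, 22, 28, 26, 31, 24, 29]

data_range = max(daily_sales) - min(daily_sales)   # difference between the extremes
variance = statistics.variance(daily_sales)        # sample variance, in squared units
std_dev = statistics.stdev(daily_sales)            # sample standard deviation, in original units

print(f"range = {data_range}, variance = {variance:.2f}, standard deviation = {std_dev:.2f}")

For this sample the range is 9, the variance about 9.17 (in squared units), and the
standard deviation about 3.03, which is directly comparable to the original figures.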

F. Measurement of Data Distribution


Gelman et al. (2014) state that the normal distribution, commonly referred to as
the Gaussian distribution, is a widely present and well researched statistical
distribution. The predominance of this phenomenon arises from its exceptional
characteristics, which include a symmetrical bell-shaped distribution and the fact
that the mean, median, and mode are all identical. The symmetrical nature of this
form makes it an effective model for illustrating the distribution of continuous
variables in many natural phenomena, including human height, IQ scores, and
measurement errors. Moreover, the Central Limit Theorem asserts that as the
sample size increases, the distribution of sample means from any population,
irrespective of its underlying distribution, tends to converge towards a normal
distribution. This emphasizes the significance of the normal distribution in statistical
theory and practice.
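
A rough way to see the Central Limit Theorem at work is to simulate it. The toy
sketch below (an invented scenario, standard library only) repeatedly samples from a
clearly non-normal exponential population: the sample means stay centered near the
population mean of 1 while their spread shrinks as the sample size grows, and a
histogram of those means would look increasingly bell-shaped.

import random
import statistics

random.seed(0)

def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1."""
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

# Examine the distribution of 1,000 sample means for increasing sample sizes
for n in (2, 10, 50):
    means = [sample_mean(n) for _ in range(1000)]
    print(f"sample size {n:2d}: average of sample means = {statistics.mean(means):.3f}, "
          f"spread of sample means = {statistics.stdev(means):.3f}")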
Agresti (2007) states that the binomial distribution is a crucial probability
distribution commonly used in statistics, especially when dealing with binary or
categorical data that has just two possible outcomes, generally referred to as
"success" and "failure." The two parameters that define it are the probability of
success (denoted p) and the number of trials (denoted n). The binomial
distribution is frequently utilized in many domains such as quality control, clinical
trials, and opinion polls, where outcomes may be categorized into two distinct and
non-overlapping groups. In experiments, the binomial distribution is used to
represent the number of successful outcomes, such as right responses, out of a
specific number of independent attempts, such as test questions.
Law et al. (2019) state that the Poisson distribution is useful for representing the
occurrence of infrequent events or incidents within a given time or space period.
The Poisson distribution, named after the French mathematician Siméon Denis
Poisson, is defined by a single parameter (λ), which reflects the average rate at
which the event of interest occurs. The Poisson distribution is especially valuable in
situations where events occur rarely but can be quantified within a specific
timeframe, such as the count of customer arrivals at a service center, the presence of
defects in a manufacturing process, or the frequency of traffic accidents at a
particular intersection. The Poisson distribution facilitates risk assessment, resource
allocation, and decision-making in diverse domains such as insurance, public health,
and urban planning by offering a probabilistic framework to forecast the likelihood
of rare events.
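
To connect these two distributions to concrete calculations, the sketch below
implements their probability mass functions directly from the standard formulas
using only the Python standard library; the scenarios and numbers are invented for
illustration.

from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with average rate lam."""
    return exp(-lam) * lam**k / factorial(k)

# Invented examples: 7 correct answers out of 10 questions with a 0.6 chance each,
# and 2 customer arrivals in an hour when the average rate is 3 per hour
print(f"Binomial P(X = 7 | n = 10, p = 0.6) = {binomial_pmf(7, 10, 0.6):.4f}")
print(f"Poisson  P(X = 2 | lambda = 3)      = {poisson_pmf(2, 3.0):.4f}")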

G. Analysis of Correlation and Regression


Cohen et al. (2013) state that correlation analysis and regression modeling are
essential statistical approaches employed to investigate correlations between
variables in research and data analysis. Correlation measures the magnitude and
direction of the relationship between two variables, while regression constructs a
mathematical model to forecast the value of one variable based on the value of
another.
Researchers frequently utilize the Pearson correlation coefficient to evaluate the
strength of the linear association between numerical variables, as mentioned by
Gravetter & Forzano (2018). The coefficient varies between -1 and 1. A value of 1 signifies a complete
positive correlation, meaning that as one variable increases, the other variable also
increases. A value of -1 indicates a complete negative correlation, meaning that as
one variable increases, the other variable decreases. A value of 0 suggests that there
is no linear relationship between the variables. The Pearson correlation coefficient
quantifies the degree and direction of the association between variables, making it
easier to analyze and compare relationships across different datasets.
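
As a minimal illustration, the sketch below computes the Pearson coefficient for an
invented set of paired observations using the standard library's
statistics.correlation function (available from Python 3.10); the data are made up
for demonstration only.

import statistics

# Invented paired observations: advertising spend vs. units sold
ad_spend = [10, 12, 15, 17, 20, 22, 25, 28]
units_sold = [40, 44, 50, 55, 62, 64, 70, 75]

# Pearson correlation coefficient (requires Python 3.10+)
r = statistics.correlation(ad_spend, units_sold)
print(f"Pearson r = {r:.3f}")

For these made-up numbers the coefficient comes out very close to 1, indicating a
strong positive linear association.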
Linear regression analysis, as explained by Field et al. (2012), is a robust
technique for modeling and predicting the value of a dependent variable based on an
independent variable when a linear relationship exists between the variables. Linear
regression is a statistical method that generates a straight line that closely matches
the observed data points. This enables researchers to make predictions about the
value of the dependent variable based on a given value of the independent variable.
The ability to foresee future trends, identify relevant elements, and make informed
judgments based on empirical facts is highly valuable.

Furthermore, linear regression analysis offers valuable insights into the
characteristics and intensity of the relationship between variables by calculating the
regression coefficients, intercept, and coefficient of determination (R-squared).
These parameters provide useful insights into the gradient of the regression line, the
proportion of variation accounted for by the model, and the importance of the link
between variables.
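
Continuing the same invented advertising example, the sketch below fits a simple
least-squares line with the standard library's statistics.linear_regression function
(Python 3.10+) and reports the slope, intercept, and coefficient of determination;
for a single predictor, R-squared is simply the squared Pearson correlation.

import statistics

# The same invented advertising data as in the correlation example
ad_spend = [10, 12, 15, 17, 20, 22, 25, 28]
units_sold = [40, 44, 50, 55, 62, 64, 70, 75]

# Simple linear regression by ordinary least squares (requires Python 3.10+)
slope, intercept = statistics.linear_regression(ad_spend, units_sold)
r_squared = statistics.correlation(ad_spend, units_sold) ** 2  # R^2 for one predictor

print(f"units_sold ~ {intercept:.2f} + {slope:.2f} * ad_spend (R^2 = {r_squared:.3f})")
print(f"predicted units sold when ad_spend = 30: {intercept + slope * 30:.1f}")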

H. Hypothesis Examination
Researchers view both the null hypothesis and the alternative hypothesis as
essential elements in the hypothesis testing process, as highlighted by Field (2018).
The alternative hypothesis proposes the presence of an effect or difference, in
contrast to the null hypothesis, which suggests the absence of such effects or
differences. These hypotheses serve as the foundation for statistical inference,
directing researchers in assessing the data against the null hypothesis to make
significant inferences about the underlying population.
Researchers follow a complex process in hypothesis testing, which includes
calculating test statistics and interpreting p-values, as described by Sullivan (2011).
Test statistics are numerical summaries of the observed data that make it easier to
compare with theoretical distributions when testing the null hypothesis. Meanwhile,
p-values measure the degree of evidence against the null hypothesis, showing the
likelihood of observing the data or more extreme results assuming that the null
hypothesis is correct.
In addition, the determination of statistical significance relies on predetermined
significance criteria, as mentioned by Rosenthal et al. (2000). These thresholds,
sometimes referred to as alpha (α), indicate the highest permissible likelihood of
making a Type I error, which is the wrong rejection of the null hypothesis when it is
actually true. Typically used significance levels are 0.05, 0.01, or 0.1, representing
different levels of strictness in hypothesis testing. Researchers compare the
calculated p-value with the preset significance level in order to determine whether to
reject or fail to reject the null hypothesis. When the p-value is lower than the significance
threshold, usually represented by α, researchers reject the null hypothesis and accept
the alternative hypothesis. This means that they conclude that the observed data
strongly support the existence of an effect or difference. On the other hand, if the p-
value is higher than the significance threshold, researchers do not reject the null
hypothesis, which means there is not enough evidence to establish the existence of
an effect.
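
As one concrete instance of this procedure, the sketch below runs a two-sided
one-sample t-test on an invented sample of delivery times and compares the resulting
p-value with a significance level of 0.05; SciPy is assumed to be installed, and the
data and hypotheses are purely illustrative.

from scipy import stats

# Invented sample of delivery times in days, used only to illustrate the procedure
delivery_times = [3.1, 2.8, 3.5, 3.0, 3.3, 2.9, 3.6, 3.2, 3.4, 3.1]

# H0: the mean delivery time equals 3.0 days; H1: it differs from 3.0 days (two-sided)
t_stat, p_value = stats.ttest_1samp(delivery_times, popmean=3.0)

alpha = 0.05  # predetermined significance level
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
if p_value < alpha:
    print("Reject the null hypothesis: the mean appears to differ from 3.0 days.")
else:
    print("Fail to reject the null hypothesis: the evidence is not strong enough.")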
CONCLUSION
a. CONCLUSION
Diez et al. (2019) emphasize the crucial importance of statistical
analysis as a fundamental element in evidence-driven decision-making. In
today's world, where there is an overwhelming amount of data, it is essential to
have the skills to effectively analyze and extract valuable insights from this vast
amount of information in order to make informed decisions. Rice
(2007) argues that having a strong understanding of basic statistical ideas and
data analysis methods is not just beneficial, but necessary across a wide range of
professional fields.
In the present era of global interconnectivity and information-centricity,
several industries, including banking, marketing, healthcare, and education,
largely depend on data-driven insights to effectively manage intricate challenges
and propel strategic objectives. The knowledge obtained from statistical studies
acts as a guiding light, revealing paths towards innovation, efficiency, and
sustainability.
In addition, the findings presented by Freedman et al. (2007) highlight
the significant capacity for change that exists when statistical approaches are
skillfully applied. Through thorough analysis of datasets, organizations can
reveal concealed patterns, clarify connections, and discover trends that might
otherwise remain hidden. These insights, in turn, enable decision-makers to
make well-informed decisions, develop efficient strategies, and predict future
patterns with enhanced precision. Moreover, the benefits of skilled statistical
analysis extend far beyond the boundaries of organizational decision-making.
Statistical techniques are essential instruments in scientific research for
confirming ideas, testing theories, and expanding knowledge. Researchers are
given a structured framework to analyze empirical data, form conclusions, and
contribute to the overall body of scientific knowledge.
b. CRITICISM AND SUGGESTION
One of the main criticisms of using statistical data is that it might be
overly dependent on individual data points without taking into account the larger
context. The limited scope of attention can unintentionally promote the
development of incorrect or excessively simplified conclusions, which fail to
fully encompass the intricacy and subtleties present in the subject matter.
Therefore, it is essential to take a comprehensive approach by carefully
analyzing not only the individual data points but also the broader context from
which they are obtained.
Upon considering this argument, it becomes evident that a
comprehensive approach is necessary for effectively traversing the complex
domain of statistical data interpretation. This approach requires developing a
strong set of analytical and critical thinking abilities, including a thorough grasp
of statistical methods, assessments of data validity, and the ability to differentiate
between correlations and causations. Furthermore, promoting a culture of
curiosity and doubt, where relevant inquiries are supported and diligently
explored, is essential for improving the accuracy and dependability of statistical
studies.
Moreover, it is essential for both practitioners and stakeholders to give
priority to the promotion of transparency and clarity when sharing statistical
findings. To enhance trust and confidence in the accuracy of statistical data, it is
important to openly share the complex details of how data is collected,
acknowledge the uncertainties involved, and clearly define the limitations of
statistical inferences. This transparent approach not only promotes well-
informed decision-making but also encourages a stronger and more open
conversation between those who produce data and those who use it. This
ultimately strengthens the reliability and usefulness of statistical information that
is available to the public.
REFERENCES
Agresti, Alan. Foundations of Statistical Inference. John Wiley & Sons, 2018.
Everitt, Brian S., and Torsten Hothorn. An Introduction to Applied Multivariate
Statistical Analysis. Springer, 2018.
Gelman, Andrew, et al. Bayesian Data Analysis. Chapman and Hall/CRC, 2013.
Hastie, Trevor, et al. The Elements of Statistical Learning: Data Mining, Inference, and
Prediction. Springer, 2009.
Hoel, Paul G., et al. Introduction to Mathematical Statistics. Wiley, 1971.
Montgomery, Douglas C., et al. Introduction to Linear Regression Analysis. Wiley,
2018.
Rao, C. R. Linear Statistical Inference and Its Applications. Wiley, 2014.
Wilks, Samuel S. Mathematical Statistics. John Wiley & Sons, 2014.
Rencher, Alvin C., and G. Bruce Schaalje. Linear Models in Statistics. John Wiley &
Sons, 2008.
Shumway, Robert H., and David S. Stoffer. Time Series Analysis and Its Applications:
With R Examples. Springer, 2017.
Casella, George, and Roger L. Berger. Statistical Inference. Cengage Learning, 2008.
Hogg, Robert V., et al. Introduction to Mathematical Statistics. Pearson, 2013.
Johnson, Richard Arnold, and Dean W. Wichern. Applied Multivariate Statistical
Analysis. Pearson, 2007.
Kutner, Michael H., et al. Applied Linear Statistical Models. McGraw-Hill Education,
2004.
Lehmann, E. L., and George Casella. Theory of Point Estimation. Springer Science &
Business Media, 2006.
Rice, John A. Mathematical Statistics and Data Analysis. Cengage Learning, 2006.
Wasserman, Larry. All of Statistics: A Concise Course in Statistical Inference. Springer
Science & Business Media, 2013.
Box, George EP, Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis:
Forecasting and Control. John Wiley & Sons, 2015.
DeGroot, Morris H., and Mark J. Schervish. Probability and Statistics. Pearson
Education, 2011.
Hastie, Trevor, et al. Statistical Learning with Sparsity: The Lasso and Generalizations.
CRC Press, 2015.
