Submitted to: Dr. Sonia Singhal
Submitted by: Garveet Rajput
Course: MBA
Session: 2024 - 2026
INDEX
SERIAL NO. TOPIC
01 Introduction to ANOVA
02 Objectives of the Study
03 Research Methodology
04 Hypothesis Formulation
05 Data Collection
06 Application of ANOVA
07 Graphical Representation
08 Interpretation of Results
09 Conclusion
Acknowledgement
I would like to express my heartfelt gratitude to all those who supported and guided
me throughout the completion of this project titled "Application of ANOVA."
First and foremost, I extend my sincere thanks to Dr. Sonia Singhal, whose valuable
insights, encouragement, and academic support played a vital role in shaping this
project. Their guidance helped me understand the core concepts of statistical analysis
and motivated me to explore the topic in greater depth.
A special thanks to my classmates and friends who supported me during the data
collection and analysis process, and for their constant encouragement during times of
difficulty.
Finally, I would like to express my deepest appreciation to my family, whose
unwavering support, patience, and motivation inspired me to stay focused and
complete this project with diligence.
This project has been a great learning experience and has significantly contributed to
my academic growth.
Application of ANOVA in Analyzing Academic Performance Across Streams
Introduction to ANOVA
Analysis of Variance (ANOVA) is a statistical technique used to determine whether there are any
statistically significant differences between the means of three or more independent (unrelated)
groups. It is an extension of the t-test and is widely used in various fields such as education, biology,
economics, and psychology. ANOVA helps researchers understand whether variations in the data are
due to the treatment applied or random chance.
Developed by Ronald Fisher in the early 20th century, ANOVA is a fundamental tool in experimental
design. It decomposes the total variation in a dataset into variation between groups and within groups,
providing a basis for hypothesis testing. ANOVA assumes that the observations are independent and
that the populations from which the samples are drawn are normally distributed and have equal variances.
There are several types of ANOVA, including One-Way ANOVA, Two-Way ANOVA, and Repeated
Measures ANOVA. Each type is used depending on the structure of the data and the research question.
For example, One-Way ANOVA is used when there is one independent variable with more than two
levels, while Two-Way ANOVA involves two independent variables.
This project focuses on the application of One-Way ANOVA in analyzing academic performance
across different streams of study. By applying ANOVA, we aim to identify whether the differences
in average exam scores among students from Science, Commerce, and Arts streams are statistically
significant.
In the context of business research, ANOVA is frequently applied in marketing, human resource
management, finance, and operational decision-making. For instance, a company might use ANOVA
to test whether different customer service strategies result in differing customer satisfaction levels.
The core idea of ANOVA is to analyze the variation within each group and the variation between
groups to make inferences about the population means.
EXAMPLE
The analysis of variance can be used to describe otherwise complex relations among variables. A dog
show provides an example. A dog show is not a random sampling of the breed: it is typically limited
to dogs that are adult, pure-bred, and exemplary. A histogram of dog weights from a show is likely
to be rather complicated, like the yellow-orange distribution shown in the illustrations. Suppose we
wanted to predict the weight of a dog based on a certain set of characteristics of each dog. One way
to do that is to explain the distribution of weights by dividing the dog population into groups based
on those characteristics. A successful grouping will split dogs such that (a) each group has a low
variance of dog weights (meaning the group is relatively homogeneous) and (b) the mean of each
group is distinct (if two groups have the same mean, then it isn't reasonable to conclude that the groups
are, in fact, separate in any meaningful way).
No fit: Young vs old, and short-haired vs long-haired
In the illustrations, groups are identified as X1, X2, etc. In the first illustration, the dogs are
divided according to the product (interaction) of two binary groupings: young vs old, and short-haired
vs long-haired (e.g., group 1 is young, short-haired dogs, group 2 is young, long-haired dogs, etc.).
Since the distributions of dog weight within each of the groups (shown in blue) have a relatively large
variance, and since the means are very similar across groups, grouping dogs by these characteristics
does not produce an effective way to explain the variation in dog weights: knowing which group a
dog is in doesn't allow us to predict its weight much better than simply knowing the dog is in a dog
show. Thus, this grouping fails to explain the variation in the overall distribution (yellow-orange).
Fair fit: Pet vs Working breed and less athletic vs more athletic
An attempt to explain the weight distribution by grouping dogs as pet vs working breed and less
athletic vs more athletic would probably be somewhat more successful (fair fit). The heaviest show
dogs are likely to be big, strong, working breeds, while breeds kept as pets tend to be smaller and
thus lighter. As shown by the second illustration, the distributions have variances that are
considerably smaller than in the first case, and the means are more distinguishable. However, the
distributions still overlap substantially, which means, for example, that we cannot distinguish X1
and X2 reliably. Grouping dogs according to a coin flip might produce distributions that look similar.
An attempt to explain weight by breed is likely to produce a very good fit. All Chihuahuas are light
and all St Bernards are heavy. The difference in weights between Setters and Pointers does not justify
separate breeds. The analysis of variance provides the formal tools to justify these intuitive
judgments. A common use of the method is the analysis of experimental data or the development of
models. The method has some advantages over correlation: not all of the data must be numeric and
one result of the method is a judgment of the confidence in an explanatory relationship.
Very good fit: Weight by breed
Classes of models
There are three classes of models used in the analysis of variance, and these are outlined here.
1. Fixed-effects models
The fixed-effects model (class I) of analysis of variance applies to situations in which the
experimenter applies one or more treatments to the subjects of the experiment to see whether
the response variable values change. This allows the experimenter to estimate the ranges of response
variable values that the treatment would generate in the population as a whole.
2. Random-effects models
The random-effects model (class II) is used when the treatments are not fixed. This occurs when the
various factor levels are sampled from a larger population. Because the levels themselves are random
variables, some assumptions and the method of contrasting the treatments (a multi-variable
generalization of simple differences) differ from the fixed-effects model.
3. Mixed-effects models
A mixed-effects model (class III) contains experimental factors of both fixed and random-effects
types, with appropriately different interpretations and analysis for the two types.
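For a single factor, the difference between the first two classes can be written compactly. The notation below is a common textbook convention and an assumption of this sketch, not notation taken from this report: y_ij is observation j in group i, μ is the overall mean, τ_i is the effect of group i, and ε_ij is random error. A mixed-effects model needs at least two factors, so it is not shown here.

```latex
% One-way layout: observation j in group (treatment) i
\[
  y_{ij} = \mu + \tau_i + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2)
\]

% Fixed effects (class I): the \tau_i are unknown constants,
% usually constrained so that \sum_i \tau_i = 0.
% Random effects (class II): the \tau_i are themselves random,
\[
  \tau_i \sim N(0, \sigma_\tau^2)
\]
% because the levels are sampled from a larger population of levels.
```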
Example
Teaching experiments could be performed by a college or university department to find a good
introductory textbook, with each text considered a treatment. The fixed-effects model would
compare a list of candidate texts. The random-effects model would determine whether important
differences exist among a list of randomly selected texts. The mixed-effects model would compare
the (fixed) incumbent texts to randomly selected alternatives.
Defining fixed and random effects has proven elusive, with multiple competing definitions.
Objectives of the Study
This research project aims to:
• Introduce the concept and statistical foundation of ANOVA.
• Demonstrate the practical application of One-Way ANOVA to academic performance data.
• Examine whether students from different academic streams (Science, Commerce, and Arts) have
significantly different performance levels.
• Use statistical software tools (Excel/SPSS) to perform the ANOVA test and interpret its results.
• Visualize the data using appropriate graphical tools.
• Draw meaningful conclusions based on empirical data.
Research Methodology
This study adopts a quantitative research methodology aimed at analyzing student performance data
using One-Way ANOVA. A stratified random sample of 90 students was selected, 30 from each stream
(Science, Commerce, and Arts). Their average exam scores out of 100 were recorded and analysed.
The tools used include:
• Microsoft Excel / SPSS for ANOVA calculation
• Graphs for visualization
The design and procedure are as follows:
• The research is designed to test the hypothesis that students from different academic streams
(Science, Commerce, and Arts) have significantly different mean academic performances.
• The methodology involves collecting average exam scores of students from each stream and
applying statistical analysis to determine whether differences in performance are due to the
stream or random variation. The study uses structured data, and the analysis is conducted
using Microsoft Excel and SPSS.
• The target population for this research includes senior secondary school students aged
between 17 and 19. The sample consists of 90 students, divided equally among the three
streams. Each student’s final average exam score was recorded.
• A stratified random sampling method was used to ensure that each stream was equally
represented. This method helps maintain the validity and reliability of the research
findings by minimizing sampling bias.
• The primary tool used for statistical analysis is One-Way ANOVA. This tool compares the
means of the three groups to see if there is any statistically significant difference
between them. The data are first tested for normality and equal variance to ensure that
the assumptions of ANOVA are met.
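The report does not say which assumption tests were run. Formal tests such as Levene's test or the Shapiro-Wilk test, both available in SPSS, are the more usual choice. As a rough, informal illustration of the equal-variance check only, the sketch below compares the largest and smallest group standard deviations against a commonly quoted rule of thumb (a ratio below 2). The scores, the `sd_ratio` helper, and the threshold are assumptions made for illustration; they are not the study's data or procedure.

```python
from statistics import stdev

# Hypothetical scores for illustration only; not the study's data.
scores = {
    "Science": [88, 85, 91, 79, 94],
    "Commerce": [75, 80, 72, 84, 69],
    "Arts": [70, 68, 81, 59, 74],
}


def sd_ratio(groups):
    """Return (largest SD / smallest SD, per-group sample SDs)."""
    sds = {name: stdev(values) for name, values in groups.items()}
    return max(sds.values()) / min(sds.values()), sds


ratio, sds = sd_ratio(scores)

# Rule of thumb (an assumption of this sketch): a ratio below 2 suggests
# the equal-variance assumption is not badly violated.
variances_look_similar = ratio < 2

for name, sd in sds.items():
    print(f"{name:<9} sd={sd:.2f}")
print(f"SD ratio = {ratio:.2f}, similar variances: {variances_look_similar}")
```

This check says nothing about normality, which would need a histogram, a Q-Q plot, or a formal test.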
Hypothesis Formulation
A hypothesis is a statement that can be tested statistically. For this study, the hypotheses are framed
to test whether the academic stream affects student performance.
The hypotheses are as follows:
• Null Hypothesis (H₀): There is no significant difference in the mean academic scores of students
across the three streams.
• Alternative Hypothesis (H₁): At least one stream has a significantly different mean academic score.
These hypotheses will be tested using a significance level (α) of 0.05. If the p-value obtained from
the ANOVA test is less than 0.05, the null hypothesis will be rejected, indicating a significant
difference between the groups.
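In symbols, the two hypotheses and the decision rule can be stated as follows. The μ subscripts are notation introduced here for the three stream means and are not taken from the report.

```latex
\begin{align*}
  H_0 &: \mu_{\text{Science}} = \mu_{\text{Commerce}} = \mu_{\text{Arts}} \\
  H_1 &: \mu_i \neq \mu_j \quad \text{for at least one pair of streams } i \neq j \\
  \text{Decision} &: \text{reject } H_0 \text{ if } p < \alpha = 0.05
\end{align*}
```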
Understanding the hypothesis is crucial for drawing valid conclusions. It provides the basis for
statistical inference and helps in determining the effect of independent variables—in this case, the
academic stream—on the dependent variable, which is the academic performance.
Data Collection
The data for this project was collected from a senior secondary school. The sample consists of 90
students, with 30 students from each academic stream: Science, Commerce, and Arts. Each student's
average exam score from the final examination was recorded.
The data was compiled using school academic records. Care was taken to ensure that the sample was
balanced and representative of the population. Equal numbers of students were chosen from each
stream to avoid sampling bias and to ensure fairness in the comparison.
The average exam scores serve as the quantitative variable for this study. The scores range from 50
to 100 and are treated as continuous data. The stream (Science, Commerce, Arts) acts as the
categorical independent variable. The dataset was entered into Microsoft Excel and prepared for
statistical analysis. Descriptive statistics, such as the mean and standard deviation for each stream,
were calculated.
A snippet of the data is shown below for illustration purposes:
STUDENT ID   STREAM     SCORE
1            SCIENCE    88
2            SCIENCE    85
3            SCIENCE    91
4            COMMERCE   75
5            COMMERCE   80
6            ARTS       70
7            ARTS       68
(...data continues for all 90 students...)
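The descriptive statistics mentioned above (mean and standard deviation per stream) could be computed along the following lines. This is a minimal sketch that uses only the seven illustrative rows from the snippet, so its output describes the snippet and not the full 90-student dataset. The variable names are chosen here for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# The seven illustrative rows from the snippet: (student_id, stream, score).
rows = [
    (1, "SCIENCE", 88), (2, "SCIENCE", 85), (3, "SCIENCE", 91),
    (4, "COMMERCE", 75), (5, "COMMERCE", 80),
    (6, "ARTS", 70), (7, "ARTS", 68),
]

# Group the scores by stream.
by_stream = defaultdict(list)
for _student_id, stream, score in rows:
    by_stream[stream].append(score)

# Mean and sample standard deviation (n - 1 denominator) per stream.
summary = {
    stream: (mean(scores), stdev(scores))
    for stream, scores in by_stream.items()
}

for stream, (avg, sd) in summary.items():
    print(f"{stream:<9} n={len(by_stream[stream])} mean={avg:.2f} sd={sd:.2f}")
```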
Application of ANOVA
One-Way ANOVA was applied to determine whether there are statistically significant differences
among the average exam scores of students from Science, Commerce, and Arts streams. ANOVA
partitions the total variation into variation between groups and within groups.
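This partition can be written out explicitly. In the notation assumed here (not taken from the report), x_ij is the score of student j in stream i, x̄_i is the mean of stream i, x̄ is the overall mean, k is the number of streams, n_i is the size of stream i, and N is the total sample size. For this study k = 3, each n_i = 30, and N = 90.

```latex
\begin{align*}
  SST &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( x_{ij} - \bar{x} \right)^2
      && \text{total, } df = N - 1 = 89 \\
  SSB &= \sum_{i=1}^{k} n_i \left( \bar{x}_i - \bar{x} \right)^2
      && \text{between groups, } df = k - 1 = 2 \\
  SSW &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( x_{ij} - \bar{x}_i \right)^2
      && \text{within groups, } df = N - k = 87 \\
  SST &= SSB + SSW
\end{align*}
```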
The calculations involve the following steps:
1. Calculate the overall mean of all student scores.
2. Compute the mean score for each stream.
3. Determine the Sum of Squares Between (SSB) and Sum of Squares Within (SSW).
4. Calculate the Mean Square Between (MSB = SSB/df_between) and Mean Square Within
(MSW = SSW/df_within).
5. Compute the F-statistic as F = MSB/MSW.
6. Compare the F-value with the critical value from the F-distribution or check the p-value.
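Steps 1 to 5 can be sketched in a few lines of Python using only the standard library. The `one_way_anova` function is a hypothetical helper written for this illustration, and it is run on the seven snippet rows from the Data Collection section because the full dataset is not reproduced in this report. Its output therefore does not match the study's results. Step 6 is left to an F-table or to software; SciPy's `scipy.stats.f_oneway`, for example, returns both the F-statistic and the p-value.

```python
from statistics import mean


def one_way_anova(groups):
    """Steps 1-5 of one-way ANOVA for a dict of group name -> scores."""
    all_scores = [x for values in groups.values() for x in values]

    # Step 1: overall (grand) mean. Step 2: mean of each group.
    grand_mean = mean(all_scores)
    group_means = {name: mean(values) for name, values in groups.items()}

    # Step 3: sums of squares between and within groups.
    ssb = sum(len(values) * (group_means[name] - grand_mean) ** 2
              for name, values in groups.items())
    ssw = sum((x - group_means[name]) ** 2
              for name, values in groups.items() for x in values)

    # Step 4: degrees of freedom and mean squares.
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    msb = ssb / df_between
    msw = ssw / df_within

    # Step 5: F-statistic.
    return {"SSB": ssb, "SSW": ssw, "df_between": df_between,
            "df_within": df_within, "MSB": msb, "MSW": msw, "F": msb / msw}


# Illustrative data: the seven rows shown in the data snippet only.
snippet = {"Science": [88, 85, 91], "Commerce": [75, 80], "Arts": [70, 68]}
result = one_way_anova(snippet)

for key, value in result.items():
    print(f"{key:<10} {value:.3f}")
```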
The following table shows the ANOVA summary:
Source    SS       df   MS       F      P-Value
Between   1,250    2    625      5.67   0.005
Within    9,580    87   110.11
Total     10,830   89
Based on this analysis, the F-value is 5.67 and the p-value is 0.005. Since the p-value is less than
0.05, we reject the null hypothesis and conclude that there is a significant difference in academic
performance across the streams.
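The entries in the summary table can be cross-checked from the sums of squares and degrees of freedom alone. The sketch below does this in plain Python. The p-value line uses a closed form for the upper tail of the F-distribution that is valid only when the between-groups degrees of freedom equal 2, as they do for three groups; for other designs a statistics package would be needed. The variable names are chosen here for illustration.

```python
# Values copied from the ANOVA summary table.
ss_between, df_between = 1250, 2
ss_within, df_within = 9580, 87

# Mean squares and F-statistic.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# Upper-tail probability P(F > f_stat). This closed form is a special case
# that holds only for df_between == 2; it is not a general formula.
assert df_between == 2
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(f"MSB = {ms_between:.2f}, MSW = {ms_within:.2f}")
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

The computed F is about 5.676, which agrees with the tabulated 5.67 up to rounding in the last digit, and the p-value rounds to 0.005.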
Graphical Representation
Graphical representations are powerful tools in statistics as they help in visualizing complex data,
identifying trends, and enhancing the understanding of statistical outcomes. In this project, several
visual aids were employed to showcase the variations in academic performance across three
streams: Science, Commerce, and Arts.
1. Bar Chart
A bar chart was used to compare the mean exam scores of students from each stream. It visually
highlighted that:
• Science students had the highest average score.
• Commerce students followed.
• Arts students had the lowest average.
Bar Chart Insight: The difference in bar heights immediately suggests a potential difference
in mean scores.
2. Box Plot
A box plot (or box-and-whisker diagram) was used to show:
• Median scores
• Interquartile ranges
• Outliers in the data
Box Plot Insight:
• The Science group had the highest median and a relatively small spread, suggesting
consistent performance.
• Arts students showed a wider spread, indicating more variability in performance.
• Some outliers were noted in all groups, which could represent students performing
exceptionally well or poorly.
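The quantities a box plot displays can be computed directly. The sketch below finds the median, the quartiles, the interquartile range, and any outliers using the common 1.5 × IQR rule. The scores are hypothetical and the `box_plot_summary` helper is written for this illustration. Quartile conventions differ between tools, so Excel or SPSS may report slightly different quartiles for the same data.

```python
from statistics import quantiles


def box_plot_summary(scores):
    """Median, quartiles, IQR and 1.5*IQR outliers for one group."""
    q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = sorted(x for x in scores if x < lower_fence or x > upper_fence)
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical scores for one stream; not the study's data.
hypothetical_scores = [52, 68, 70, 71, 73, 74, 75, 76, 78, 95]
box = box_plot_summary(hypothetical_scores)
print(box)
```

A small interquartile range corresponds to the "consistent performance" reading above, and the listed outliers are the points a box plot would draw beyond the whiskers.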
3. Histogram
Histograms were plotted for each stream to understand the distribution of scores.
Histogram Insight:
• Science and Commerce scores were slightly left-skewed (more students scored high).
• Arts scores were approximately normally distributed, centred around a slightly lower mean.
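Skewness read off a histogram can also be checked numerically. The sketch below computes the moment-based sample skewness coefficient (third central moment divided by the 1.5 power of the second), where a negative value indicates a left skew and a value near zero indicates symmetry. The two score lists are hypothetical and the `sample_skewness` helper is written for this illustration.

```python
from statistics import mean


def sample_skewness(scores):
    """Moment-based skewness g1 = m3 / m2**1.5 (moments divided by n)."""
    n = len(scores)
    avg = mean(scores)
    m2 = sum((x - avg) ** 2 for x in scores) / n
    m3 = sum((x - avg) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5


# Hypothetical scores; not the study's data.
left_skewed = [60, 78, 82, 85, 88, 90, 91, 92]   # most scores high, one low tail
symmetric = [70, 75, 80, 85, 90]

print(f"left-skewed example: {sample_skewness(left_skewed):.2f}")
print(f"symmetric example:   {sample_skewness(symmetric):.2f}")
```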
4. Line Graph
A line graph, though unconventional for categorical comparisons, was used to connect mean scores
of the three streams.
Line Graph Insight: The descending slope from Science to Arts reaffirms the trend observed
in bar charts and numerical summaries.
Interpretation of Results
The results of the One-Way ANOVA test provided a statistical framework to assess whether there
were significant differences among the groups.
ANOVA Test Summary:
• F-value: 5.67
• P-value: 0.005
• Significance Level (α): 0.05
Since p < 0.05, the null hypothesis is rejected.
What this means:
• There is a statistically significant difference in the average scores of students across the
three streams.
• The variation between group means is unlikely to be due to chance alone.
• Academic stream is associated with differences in student performance.
Stream-wise Observations:
• Science students had the highest average score, suggesting stronger academic performance.
• Commerce students had moderate performance.
• Arts students had relatively lower average scores and greater variability.
These results were supported by visual representations and match the expectations based on
common academic challenges and curriculum rigor of the streams.
Conclusion
This project successfully applied One-Way ANOVA to determine whether students from different
academic streams exhibit statistically different academic performances.
Key Findings:
• The null hypothesis was rejected.
• Academic stream is associated with performance, with Science students scoring higher on
average than Commerce and Arts students.
• Visual tools such as bar charts, box plots, and histograms reinforced the ANOVA findings.
Implications:
• Educational institutions may use these insights to offer targeted academic support to
students in lower-performing streams.
• Teachers and administrators could evaluate if curriculum design, teaching quality, or
resource availability differs across streams.
• The study highlights the importance of using statistical tools in educational analysis for
evidence-based decisions.
Limitations and Recommendations:
• Sample size was limited to 90 students from a single institution.
• Future studies should include larger, multi-school samples and consider factors like gender,
family background, and study habits.
• Post-hoc tests (e.g., Tukey’s HSD) can pinpoint which groups differ significantly.