Module 5 Stem and Leaf Plot

The document discusses the star plot, a method for displaying multivariate data, where each star represents a single observation with spokes for each variable. It outlines the definition, applications, weaknesses, and implementation of star plots, as well as providing examples of data analysis using stem-and-leaf plots. Additionally, it includes procedures for creating these plots and interpreting their results.

Uploaded by

sudeep shah

EDA Graphical Presentations

Problems on Star Plot


Purpose: Display Multivariate Data
• Star plot is a method of displaying multivariate data.
• Each star represents a single observation.
• Typically, star plots are generated in a multi-plot format with many
stars on each page and each star representing one observation.
• Star plots are used to examine the relative values for a single data
point and to locate similar points or dissimilar points.
Definition:
• The star plot consists of a sequence of equi-angular spokes,
called radii, with each spoke representing one of the variables.
• The data length of a spoke is proportional to the magnitude of the
variable for the data point relative to the maximum magnitude of
the variable across all data points.
• A line is drawn connecting the data values for each spoke.
• This gives the plot a star-like appearance and the origin of the
name of this plot.
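
The definition above can be sketched in Python. This is a minimal illustration, assuming each spoke length is the value divided by the column maximum, as the definition states. The function name star_vertices and the sample rows are hypothetical and do not come from the slides, and a plotting routine such as R's stars() may apply a different scaling.

```python
import math

def star_vertices(observation, column_maxima):
    """Return the (x, y) vertices of one star, one vertex per variable."""
    n = len(observation)
    vertices = []
    for i, (value, maximum) in enumerate(zip(observation, column_maxima)):
        length = value / maximum      # spoke length relative to the column maximum
        angle = 2 * math.pi * i / n   # equi-angular spokes
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices

# Hypothetical data: three observations measured on four variables.
rows = [
    [20, 10, 5, 8],
    [10, 40, 5, 4],
    [5, 20, 10, 2],
]
maxima = [max(column) for column in zip(*rows)]
stars = [star_vertices(row, maxima) for row in rows]
```

Connecting the vertices of each entry in stars, in order, gives the outline of one star.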
Why do we use star plots?
• Star plots are used to examine the relative values for a single
data point (e.g., point 3 is large for variables 2 and 4 and small for
variables 1, 3, 5, and 6) and to locate similar or dissimilar
points. We can look at the stars individually, or we can use them
to identify clusters of observations with similar features (for
example, cars in the mtcars data set).
What are star plots?
• Star plots (also known as spider charts, polar charts, web charts, or
radar charts) are a way to visualize multivariate data. They
are used to plot one or more groups of values over multiple
common variables.
Weakness in Technique:
• Star plots are helpful for small-to-moderate-sized multivariate
data sets. Their primary weakness is that their effectiveness is
limited to data sets with fewer than a few hundred points. Beyond that,
they tend to be overwhelming.
Applications of Star Plot:
• One application of radar charts is in quality-improvement
control, where they display the performance metrics of an
ongoing program.
• They are also used in sports to chart players' strengths and
weaknesses, where they are usually called radar charts.
Implementation of star plot:
> require(grDevices)
> # half stars (full = FALSE), with a key showing which spoke is which variable
> stars(mtcars[, 1:7], key.loc = c(14, 2), main = "Motor Trend Cars : stars(*, full = F)", full = FALSE)
> # full stars, without flipped labels
> stars(mtcars[, 1:7], key.loc = c(14, 1.5), main = "Motor Trend Cars : full stars()", flip.labels = FALSE)
> # all observations drawn at one location, as dashed outlines without spokes
> stars(mtcars[, 1:7], locations = c(0, 0), radius = FALSE, key.loc = c(0, 0), main = "Motor Trend Cars", lty = 2)
Year College A College B Difference
2007
2008
2009
2010
2011
2012

1. In which year was the difference between the number of students in College A and the number of students in College B the
highest, and what is the difference? (Student numbers are in thousands.)

2. If 25% of the students in College A in 2010 were female, what was the number of male students in College A in the same
year?
Year    College A   College B   Difference
2007    20          10          10
2008    25          15          10
2009    35          25          10
2010    15          20          5
2011    30          25          5
2012    20          35          15
Total   145         130

1. In which year was the difference between the number of students in College A and the number of students in College B the
highest, and what is the difference? (Student numbers are in thousands.)

Ans: 2012; the difference is 15 thousand = 15,000 students.

2. If 25% of the students in College A in 2010 were female, what was the number of male students in College A in the same
year?
Ans: College A had 15 thousand = 15,000 students in 2010. If 25% of them were female, then 75% were male.
75% of 15,000 = 11,250 male students.
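
The two answers above can be checked with a short Python sketch. The values are those in the table (in thousands); the variable names are illustrative only.

```python
# year: (College A, College B), student numbers in thousands
students = {
    2007: (20, 10),
    2008: (25, 15),
    2009: (35, 25),
    2010: (15, 20),
    2011: (30, 25),
    2012: (20, 35),
}

# Question 1: year with the largest absolute difference
differences = {year: abs(a - b) for year, (a, b) in students.items()}
peak_year = max(differences, key=differences.get)
peak_difference = differences[peak_year] * 1000   # convert thousands to students

# Question 2: 25% female in College A in 2010, so 75% male
college_a_2010 = students[2010][0] * 1000
males_2010 = college_a_2010 * (100 - 25) // 100

print(peak_year, peak_difference, males_2010)
```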
The number of passengers (in lakhs) who travelled on four different trains in different years is shown below in a star plot.

Observe the graph and answer the following questions:


a) On which train did the maximum number of passengers travel during the eight years?
b) If the fare of Shatabdi Exp is Rs. 400 for all classes of passenger and the fare of Sampark Kranti Exp is 20% more than that of
Shatabdi Exp, what is the ratio of the income of Shatabdi Exp in 2012 to that of Sampark Kranti Exp in 2013?
Year    Shatabdi Exp   Sampark Kranti Exp   Sapt Kranti Exp   Rajdhani Exp
2006    6              7
2007    7
2008    4
2009    5
2010    5
2011    7
2012    6
2013    3
Total   43             36
Year    Shatabdi Exp   Sampark Kranti Exp   Sapt Kranti Exp   Rajdhani Exp
2006    6              7                    1                 2
2007    7              1                    4                 6
2008    4              6                    5                 1
2009    5              7                    4                 3
2010    5              1                    2                 4
2011    7              1                    3                 5
2012    6              7                    4                 2
2013    3              6                    5                 4
Total   43             36                   28                27

a) In which train is the number of passengers travelled the maximum during the eight years?
Ans: Shatabdi Exp (43 lakh passengers in total).

b) The fare of Shatabdi Exp is Rs. 400 for all classes.
The fare of Sampark Kranti Exp is 20% more than that of Shatabdi Exp,
i.e. Rs. 400 + 20% of Rs. 400 = 400 + 80 = Rs. 480.
Income of Shatabdi Exp in 2012: 6 x 400
Income of Sampark Kranti Exp in 2013: 6 x 480
The ratio is (6 x 400) : (6 x 480) = 5 : 6
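
The totals and the income ratio can be verified with a small Python sketch. The passenger figures (in lakhs) are taken from the completed table; the variable names are illustrative only.

```python
from fractions import Fraction

years = list(range(2006, 2014))
passengers = {  # passengers in lakhs, one value per year from 2006 to 2013
    "Shatabdi Exp":       [6, 7, 4, 5, 5, 7, 6, 3],
    "Sampark Kranti Exp": [7, 1, 6, 7, 1, 1, 7, 6],
    "Sapt Kranti Exp":    [1, 4, 5, 4, 2, 3, 4, 5],
    "Rajdhani Exp":       [2, 6, 1, 3, 4, 5, 2, 4],
}

# a) train with the largest total over the eight years
totals = {train: sum(values) for train, values in passengers.items()}
busiest = max(totals, key=totals.get)

# b) income = passengers x fare; Sampark Kranti fare is 20% above Rs. 400
shatabdi_fare = 400
sampark_fare = shatabdi_fare * 120 // 100
income_shatabdi_2012 = passengers["Shatabdi Exp"][years.index(2012)] * shatabdi_fare
income_sampark_2013 = passengers["Sampark Kranti Exp"][years.index(2013)] * sampark_fare
ratio = Fraction(income_shatabdi_2012, income_sampark_2013)

print(busiest, totals[busiest], ratio)
```

Fraction reduces the ratio to lowest terms, so the common factor of 6 lakh passengers cancels automatically.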
Stem and leaf plot
• A stem-and-leaf plot is an arrangement of digits that is used to display and order
numerical data.
• Equivalently, it is a tabular presentation in which each data value is split into a
“stem” (the first digit or digits) and a “leaf” (usually the last digit).
• It provides a visual summary of the data and is mainly suitable for smaller data sets.
• Example: "32" is split into "3" (stem) and "2" (leaf).
Procedure to make a stem-and-leaf plot:
• Sort the data from low to high.
• Separate each observation into a stem, which consists of all digits except the
rightmost, and a leaf, which is the rightmost digit.
• A leaf must have only one digit, while a stem can have as many digits as needed.
• Write the stems in a vertical column with the smallest at the top, then draw a vertical
line to the right of this column.
• Write each leaf in the row to the right of its stem, just after the
vertical line, in ascending order out from the stem.
Draw the stem-and-leaf plot for the following data:
13, 22, 44, 53, 20, 42, 16, 52, 41, 24

Arrange the data from low to high:
13, 16, 20, 22, 24, 41, 42, 44, 52, 53

Group the values by stem:
13, 16
20, 22, 24
41, 42, 44
52, 53

Frequency count (cumulative)   Stem   Leaf
 2                             1      3 6
 5                             2      0 2 4
 5                             3
 8                             4      1 2 4
10                             5      2 3
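
The procedure can also be sketched in plain Python, without any plotting library. This is a minimal illustration that assumes non-negative integers with a one-digit leaf; the function name stem_and_leaf is hypothetical.

```python
def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (the last digit)."""
    ordered = sorted(values)
    # include empty stems between the smallest and largest stem
    plot = {stem: [] for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1)}
    for value in ordered:
        plot[value // 10].append(value % 10)
    return plot

data = [13, 22, 44, 53, 20, 42, 16, 52, 41, 24]
plot = stem_and_leaf(data)

# Build one text row per stem: cumulative count, stem, then the leaves
running_total = 0
lines = []
for stem, leaves in plot.items():
    running_total += len(leaves)
    leaf_text = "".join(str(leaf) for leaf in leaves)
    lines.append(f"{running_total:>2}  {stem} | {leaf_text}")

print("\n".join(lines))
```

Because the values are sorted before they are split, the leaves in each row are already in ascending order.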
Stemgraphic module

import sys
# install the package (notebook syntax)
!pip install stemgraphic

# importing the module
import stemgraphic
data = [13, 22, 44, 53, 20, 42, 16, 52, 41, 24]
# calling stem_graphic with the data, the scale, and descending order (asc=False)
stemgraphic.stem_graphic(data, scale=10, asc=False)

Example with decimal data: [2.3, 2.5, 2.5, 2.7, 2.8, 3.2, 3.6, 3.6, 4.5, 5.0]
# importing the module
import stemgraphic
data = [2.3, 2.5, 2.5, 2.7, 2.8, 3.2, 3.6, 3.6, 4.5, 5.0]
# calling stem_graphic with the data, the scale, and descending order (asc=False)
stemgraphic.stem_graphic(data, scale=1, asc=False)
# import matplotlib.pyplot library
import matplotlib.pyplot as plt

data = [16, 25, 47, 56, 23, 45, 19, 55, 44, 27]

# separating the stem parts: the stem of each value is its tens digit,
# computed in the same order as data so that each stem matches its value
stems = [value // 10 for value in data]

plt.ylabel('Data')     # label for the y-axis
plt.xlabel('Stems')    # label for the x-axis
plt.xlim(0, 10)        # limits of the x-axis
plt.stem(stems, data)  # one vertical line per value, positioned at its stem
plt.show()


Construct and interpret the key results of stem and leaf plot for the given data.
Test Scores= [66,70,75,77,78,81,81,83,84,84,86,88,88,89,90,92,99,100]

a) Find the range of the data in the given stem and leaf plot.
b) How many students scored at least 90 marks?
c) How are the data distributed?
d) Calculate the mode and median of the given plot.
e) How many students received grade B and grade C on the test? For
grade calculation consider the data shown below.
• Find the range of the data in the given stem and leaf plot.
Range = max. value - min. value = 100 - 66 = 34
• How many students scored at least 90 marks?
4 students scored at least 90 marks (90, 92, 99, and 100).

• Calculate the mode and median of the given plot.


To find the median: First, list the values from smallest to largest.
66, 70, 75, 77, 78, 81, 81, 83, 84, 84, 86, 88, 88, 89, 90, 92, 99, 100
There are 18 values, so the median is the mean of the two middle numbers (the 9th and 10th): (84 + 84)/2 = 84
Mode: 81, 84, and 88 (each occurs twice)
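
These summary values can be recomputed with Python's standard statistics module. This is a small sketch, assuming Python 3.8 or later (needed for statistics.multimode); the variable names are illustrative only.

```python
import statistics

scores = [66, 70, 75, 77, 78, 81, 81, 83, 84, 84, 86, 88, 88, 89, 90, 92, 99, 100]

score_range = max(scores) - min(scores)                    # largest minus smallest value
at_least_90 = sum(1 for score in scores if score >= 90)    # students with 90 marks or more
median = statistics.median(scores)                         # mean of the two middle values
modes = statistics.multimode(scores)                       # every value tied for most frequent

print(score_range, at_least_90, median, modes)
```

multimode is used instead of mode because this data set has more than one most frequent value.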
• Write Python code to implement a stem and leaf plot. (3M)
import pandas as pd
import stemgraphic as st
data = [66, 70, 75, 77, 78, 81, 81, 83, 84, 84, 86, 88, 88, 89, 90, 92, 99, 100]
data1 = pd.Series(data)
fig, ax = st.stem_graphic(data1)
Problems

Draw the stem and leaf plot for the following data:

1. [78,84,60,62,72,79,64,81,72,51,88,84,93,98,57,72,79,74,81,60,59]
2. [15,27,8,17,13,22,24,25,13,36,32,32,32,28,43,7]
