12 IP 2022 Data Visualization
12 IP 2022 Data Visualization
SOLVED QUESTIONS
1. What is matplotlib?
Ans. matplotlib is a plotting library used for creating 2D graphics in Python programming language. It can be
used in Python scripts, shell, web application servers and other graphical user interface toolkits.
2. How can we install matplotlib?
Ans. It is easyto install Python matplotlib library with pip statement— pip install matplotlib.
3. What are the various types of plots offered by matplotlib?
Ans. Matplotlib offers several types of plots:
• Line Graph • Bar Graph
• Histogram • Scatter Plot
• Area Plot • Pie Chart
4. What is data visualization? What is its significance?
Ans. Data visualization is a general term that describes any effort to help people understand the significance of
data by placing it in a visual context. In simple words, Data visualization is the process of displaying data/
information in graphical charts, figures and bars.
5. Name the functions you will use to create a (i) line chart, (ii) bar chart, (iii) scatter chart.
Ans. (i) matplotlib.pyplot.plot()
(ii) matplotlib.pyplot.bar()
(iii) matplotlib.pyplot.plot() and matplotlib.pyplot.scatter()
6. Mr. Sanjay wants to plot a bar graph for the given set of values of subject on x-axis and number of students
who opted for that subject on y-axis. [CBSE Sample Paper 2020]
Complete the code to perform the following:
(i) To plot the bar graph in statement 1
(ii) To display the graph in statement 2
import matplotlib.pyplot as plt
x= [ 'Hindi ' , 'English', ' Science' , ' SST' ]
y=[10,20,30,40]
Statement 1
Statement 2
Ans. (i) plt.bar(x, y) (ii) plt.show()
7. Mr. Harry wants to draw a line chart using a list of elements named LIST. Complete the code to perform
the following operations:
(i) To plot a line chart using the given LIST,
(ii) To give a y-axis label to the line chart named "Sample Numbers".
import matplotlib.pyplot as PLINE
LIST=[10,20,30,40,50,60]
Statement 1
Statement 2
PLINE . show ( )
Ans. (i) PLINE.plot (LIST)
(ii) PLINE.ylabel ("Sample Numbers")
8. Write a code to plot the speed of a passenger train as shown in the figure given below:
Assume the values for x from 1 to 5 and the other axis is taken as 1.5x, 3.0x and x/3.0.
12
— Normal
10 — Fast
— Slow
play
daily_work
Ans. import matplotlib.pyplot as plt
The fight arrival delays are in minutes and negative values mean the flights came early. There are over
300,000 flights with a minimum delay of -60 minutes and a maximum delay of 120 minutes. The other
column is the name of the airline which we can use for comparisons.
Ans. For the plot calls, we specify the binwidth by the number of bins. For this plot, we will use bins that are 5
minutes in length, which means that the number of bins will be the range of the data (from -60 to 120
minutes) divided by the binwidth, i.e., 5 minutes (bins = int(180/5)).
# Import the libraries
import matplotlib.pyplot as pit
flights_arr_delay= [-25.0, -18.0, -14.0, Histogram of Arrival Delays
-8.0,8.0,11.0,12.0,1 35000 -
9.0,20.0,33.0] 30000j
# matplotlib histogram
25000
plt.hist (flights_arr_delay, color =
'blue', edgecolor = 'black', 20000
Men: [113,85,90,150,149,88,93,115,135,80,77,82,129]
5
Women: [67,98,89,120,133,150,84,69,89,79,120,112,100]
Ans. import matplotlib.pyplot as plt I4
blood_sugar_men = [113,85,90,150,149,88,93,
203
115,135,80,77,82,129] 7.3
22
blood_sugar_women = [67,98,89,120,133,150,84,
69,89,79,120,112,100]
plt.xlabel('sugar range') 0 80 90 100 110 120 130 140
plt.ylabel('Total no. of patients') sugar range
150
plt.xlabel('sugar range')
plt.ylabel ( 'Total no. of patients')
plt.title ( 'Blood sugar analysis') auo
®
I
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
I I I I I I I I I I I I I I I I
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
I I I I I I I I I I I I I I I I
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
I
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
16,16,16,20,21,21,23,25,26,26,28,28
Min = 16
21+23
Median (Q2) = = 22
2
Max = 28
16,16,16,20,21,21,23,25,26,26,28,28
16 + 20
Q1 - 2 _18
26 + 26
Q3 = = 26
2
29. Which data set could be represented by the Box plot shown below?
I
0 2 4 6 8 10 12 14 16 18 20
0 3,4,8,9,9,10,12,13,13,16,18
0 3,4,7,9,9,10,12,13,13,16,18
0 3,4,8,9,9,12,12,13,13,16,18
0 2,4,7,9,9,10,12,13,13,16,18
Ans.
Q1 median Q3
min max
•
0 2 4 6 8 10 12 14 16 18 20
Q1 median Q3
min max
• •
0 2 4 6 8 10 12 14 16 18 20
30. Suppose that the box-and-whisker plots below represent quiz scores out of 25 points for Quiz 1 and
Quiz 2 for the same class. What do these box-and-whisker plots show about how the class did on
test #2 compared to test #1?
Test 2 I
Test 1 I
I I I I I I I I I I I I I I I I I I I I I I I I I
0 2 4 6 8 10 12 14 16 18 20 22 24
Ans. These box-and-whisker plots show that the lowest score, the highest score and Q3 are all the same for both
the exams. So performance on the two exams was quite similar. However, the movement Q1 up from a
score of 6 to a score of 9, indicates that there was an overall improvement. On the first test, approximately
75% of the students scored at or above a score of 6. On the second test, the same number of students
(75%) scored at or above a score of 9.
31. Which module needs to be imported for showing data in a chart form?
Ans. pyplot module of matplotlib library
32. Plot a line chart to depict a comparison of population between India and Pakistan.
Ans. #Population comparison between India and Pakistan
from matplotlib import pyplot as plt
year = [1960, 1970, 1985., 1990, 2000, 2010]
popul_pakistan = [44.91, 58.09, 78.07, 107.7, 138.5, 170.6]
popul_india = [449.48, 553.57, 696.783, 870.133, 1000.4, 1309.1]
plt.plot (year, popul_pakistan, color='green')
plt.plot (year, popul_india, color=iorangel)
plt.xlabel ( 'Year' )
plt.ylabel ( 'Population in million')
plt.title('India v/s Pakistan Population till 2010')
plt.show()
1200 -
1000 -
? wo
8
600
a
400 -
200 -
0
1960 1970 1980 1990 2000 2010
Year
6-
E
4.
2-
1 33 4
34. Write a program to plot list elements between X-axis and Y-axis.
Ans. #Program to plot a simple Line chart holding list elements
#against x-axis and y-axis respectively
import matplotlib.pyplot as plt
plt.plot([1,2,3,4],[1,4,9,16])
plt.xlabel("label for X axis")
plt.ylabel("label for Y axis")
plt.show()
*1+41. +1ctIrt1
35. Plot a bar chart to depict the popularity of various programming languages.
Ans. #Program to plot a Bar chart on the basis of popularity of Programming Languages
import numpy as np
import matplotlib.pyplot as plt
objects = ('DotNet', 'C++', 'Java', 'Python', 'C', 'CGI/PERL' )
y_pos = np. arange (len (objects) )
performance = [8,10,9,20,4,1]
plt.bar(y_pos, performance, align=' center' , color='blue')
plt.xticks(y_pos, objects) #set location and label
plt.ylabel ( 'Usage' )
plt.title ( 'Programming language usage')
plt.show()
17.5
15.0
12.5
g10.0
7.5
5.0
2.5
0.0
DotNet C++ Ova PYth00 C CGI/F£RL
4.1+41 +Az-
36. Write a Python program to draw a line with a suitable label in the X-axis and Y-axis, and a title. The code
snippet gives the output shown in the following screenshot:
_
Draw a line.
140
120
100
x 80-
60 -
40 -
20 -
o.
0 10 20 30 50
X-axis
*I++ 42A17-d
Ans. import matplotlib.pyplot as plt
X = range (l, 50)
Y = [value * 3 for value in X]
print("Values of X:")
print(range(1,50))
print("Values of Y (thrice of X):")
print(Y)
# Plot lines and/or markers to the Axes.
plt.plot(X, Y)
# Set the x axis label of the current axis.
plt.xlabel('x - axis')
# Set the y axis label of the current axis.
plt.ylabel('y - axis')
# Set a title
plt.title('Draw a line.')
# Display the figure.
plt.show()
>>>
RESTART: C:Msers/preeti/AppData/Local/Programs/Python/Python37-32/prog_s
olvequeslpyplot.py
Values of X:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
Values of Y (thrice of X):
[3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57,
60, 63, 66, 69, 72, 75, 78, 81, 84, 87, 90, 93, 96, 99, 102, 105, 108, 111
r, 114, 117, 120, 123, 126, 129, 132, 135, 138, 141, 144, 147)
>»
V
Ltv 9 Cok 4
37. Write a Python program to draw a line using given axis values with a suitable label in the X-axis and Y-axis,
and a title. The code snippet gives the output as shown in the following screenshot:
Two or more lines with different widths and colors with suitable legends
40 - linel-width-3
- line2-width-5
35 -
30 -
25 -
20 -
15 -
10 -
35 -
30-
•
N •
g 25 - •
.4e
20 - •
• -
15 - •
`'•
10 -
Scores by person
35
25
20
15
10
0
0 4 6
Person
411<-1+) +14P-F-i
Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Inte
1)) on win32
Type "copyright", "credits" or "license()" for more information.
RESTART: C:/Users/preeti/AppData/Local/Programs/Python/Python37-32/prog_pyplot3
.PY
Enter size of the square: 10
re
0-
—1
1
• -•
•
4
•
1©
42. Depict the relationship between Unemployment Rate and Stock Index Price through a scatter plot.
Unemployment_Rate Stock_Index_Price
6.1 1500
5.8 1520
5.7 1525
5.7 1523
5.8 1515
5.6 1540
5.5 1545
5.3 1560
5.2 1555
5.2 1565
1540 •
1530
1520 •
•
1510
1500 •
UNSOLVED QUESTIONS
1. Plot a line chart for depicting the population for the last 5 years as per the specifications given below:
• plt.title("My Title") will add a title "My Title" to your plot.
• plt.xlabel("Year") will add a label "Year" to your X-axis.
• plt.ylabel("Population") will add a label "Population" to your Y-axis.
• plt.xticks([1, 2, 3, 4, 5]) set the numbers on the X-axis to be 1, 2, 3, 4, 5. Pass it and label as a second
argument. For example, if we use this code plt.xticks([1, 2, 3, 4, 5], r1M", "2M", "3M", "4M", "5M1),
it will set the labels 1M, 2M, 3M, 4M, 5M on the X-axis.
• plt.yticks() — works the same as plt.xticks(), but for the Y-axis.
2. What is matplotlib?
3. What do you mean by pyplot?
4. How many types of graphs are plotted using pyplot?
5. Explain the utility of explode().
6. How is a pie chart different from a bar graph?
7. Which function is used to show the graph?
8. Write a Python program to draw a line with a suitable label in the X-axis and Y-axis, and a title.
9. Write a Python program to plot two or more lines with legends, different widths and colours.
10. Write a Python program to plot two or more lines and set the line markers.
11. Write a Python program to create a pie chart with the title of the Stream and percentage of Student's
Strength. Make multiple wedges of the pie.
Sample data:
Stream : Science, Commerce, Humanities, Vocational, FMM
Strengths: 29%, 30%, 21%, 13%, 7%
12. Write a Python program to display a bar chart of the number of students in a class. Use different colours
for each bar.
Sample data:
Class :1,11,111,1V,V,VI,VII,V111,1X,X
Strengths: 40,43,45,47,49,38,50,37,43,39
13. Write a Python program to display a horizontal bar chart of the number of students in a class.
Sample data:
Class :1,11,111,1V,V,VI,VII,V111,1X,X
Strengths: 40,43,45,47,49,38,50,37,43,39
14. Plot a pie chart of a class test of 40 students based on random sets of marks obtained by the students
(MM=100).
15. Plot a line graph for: y2=4*x
16. Write a Python program to plot the function y = x2 using the matplotlib library.
17. Name the various methods used with pyplot object.
18. Write the specific purpose of the following functions used in plotting:
(i) shape() (ii) legend()
19. Plot a histogram of a class test of 40 students based on random sets of marks obtained by the students
(MM=100).
20. A list, namely temp contains average temperature for seven days of last week. You want to see how the
temperature changes in the last seven days. Which chart type will you plot for the same and why?
21. Write the code to practically produce a chart of the previous question (i) using line chart, (ii) using scatter chart.
22. Collect data about colleges in Delhi University or any other university of your choice and number of courses
they run for Science, Commerce and Humanities, store it in a CSV file and present it using a bar plot.
23. Collect and store data related to the screen time of students in your class separately for boys and girls and
present it using a boxplot.
24. What is a histogram? How do you create histograms in Python?
25. What are the various types of histograms that can be created through hist() function?
26. When should you create histograms and when should you create bar charts to present data visually?
27. What is a frequency polygon? How do you create it?
28. What is Box plot? How do you create it in Pyplot?
29. Given the following set of data:
Weight measurements for 14 values of muffins (in grams)
78, 72, 69, 81, 63, 67, 65
79, 74, 71, 83, 71, 79, 80
(a) Create a simple histogram from the above data.
(b) Create a horizontal histogram from the above data.
(c) Create a step type of histogram from the above data.
(d) Create a cumulative histogram from the above data.
30. Create/draw frequency polygon from the data used in the above question.
31. From the following ordered set of data:
63, 65, 67, 69, 71, 71, 72, 74, 75, 78, 79, 79, 80, 81, 83
(a) Create a horizontal box plot
(b) Create a vertical box plot
32. Kritika was asked to write the names of a few libraries in Python used for data analysis and one method of
each. Help her write at least 3 libraries and their methods.
33. The data below represents the number of workouts each member of ABC Gym attended last month.
4, 5, 7, 7, 7, 8, 10, 11, 11, 13, 13, 14
Which Box plot correctly summarizes the data?
O
•
0 2 4 6 8 10 12 14 16
34. The following amounts (in Z) were the hourly collection from BIG MARKET, a local store, in one day in
December:
19, 26, 25, 37, 32, 28, 22, 23, 29, 34, 39 and 31.
Construct the box-and-whisker plot for the amount collected.
35. Visit data.gov.in, search for the following in "catalogs" option of the website:
• Final population totals, India and states
• State-wise literacy rate
Download them and create a CSV file containing population data and literacy rate of the respective state.
Also, add a column Region to the CSV file that should contain the values East, West, North and South. Plot
a scatter plot for each region where X-axis should be population and Y-axis should be literacy rate. Change
the marker to a diamond and size as the square root of the literacy rate. Group the data on the column
region and display a bar chart depicting average literacy rate for each region.
36. Differentiate between figure and axes.
37. What is the use of subplot() function? Write its parameters.
1. Hindustan Departmental Stores sell items of daily use such as shampoo, soap and much more.
They record the entire sale and purchase of goods month-wise so as to get a proper analysis of profit or
loss in their business transactions.
Following is the csv file containing the "Company Sales Data".
-:-1 0 I it - A A. IF 11 11 .-41 j •%
ig Eg *.- 761 .3
i ont ;.i,gn„„ne numt,,
jM
A15
B C 0 E F G H I J
-1 month_numbrfacecrearr facewash toothpast bathmgsoap shampoo motstunzt totalun t totalfirof it
1
2 1 2500 1500 5200 9200 1200 1500 21100 211000
3 2 2610 1200 5100 6100 2100 1200 18330 153300
4 3 2140 1340 4550 9550 3550 22470 224700
5 4 3400 1130 5870 8870 1870 1130 22270 222700
6 5 3600 1740 4560 7760 1560 1740 20960 209600
7 6 2760 1555 4890 7490 1890 1555 20140 201400
8 7 2980 1120 4780 89SO 1780 1340
1120 29550 295500
9 8 3704 1400 5860 9960 2860 1400 36140 361400
10 9 3540 1780 6100 8100 2100 1780 23400 234000
11 10 1990 1890 8300 10300 2300 1890 26670 266700
12 11 2340 2100 7300 13300 2400 2100 41280 412800
13 12 2900 1760 7400 14400 1800 1760 30020 300200
14
N I ► N company sales data °.0
Ready 10" k-4 i+r
Read the total profit of all months and show it using a line plot. Total profit data has been provided for each
month. Generated line plot must include the following properties:
• X label name = Month Number
• Y label name = Total profit
Ans.
prog_company_sales.py -C/UsersiprketilAFpoata E a al/Programs/Pyihon/Pytlwn3a-32/progeompany_sales.Py(3.7 —
df = pd.read_csv("D:/company_sales_data.csv")
I profitList = df [ total_profiti] .tolist 0 #converting dataframe into a list
monthList = df 'month_number' .tolist 0
plt.plot(monthList, profitList, label = 'Month-wise Profit data of last year')
;p1t.xlabel ('Month number')
Iplt.ylabel('Profit in dollar')
Iplt.xticks(monthList)
plt.title('Company profit per month')
Iplt.yticks([100000, 200000, 300000, 400000, 500000])
Iplt.show()
fro 17 Cot: 0
400000
p 300000
200000 -
100000
1 2 3 4 5 6 7 ' 8 ' 9' 10 11 12
Month number
+icki I cnI
2. NXP Labs is a leading supplier of embedded controllers with a strong legacy in both the industrial and
consumer market. It manufactures microcontrollers on a large scale and exports largely to European
nations. The format of their operations is described as:
The cost breakdown for a manufactured item, like a microcontroller, can be divided into four cost categories:
engineering (including design), manufacturing (including raw materials), sales (including marketing) and
profit.
These cost categories applied to a $9.00 microcontroller:
• Engineering $1.35
• Manufacturing $3.60
• Sales $2.25
• Profit $1.80
Develop a pie chart using matplotlib that describes the cost breakdown of building a microcontroller.
Ans.
plt.pie (sizes,
labels = labels)
plt.title('Cost breakdown of a $9 microcontroller')
plt.axis('equal')
plt.show()
Figure 1
Manufacturing
Engineering
Profit
Sales
41+1+11itk-t I ILI)