SQL Project
ON
IPL AUCTION
Creating tables
CREATE TABLE ipl_ball(
    id bigint,
    inning int,
    over int,
    ball int,
    batsman varchar,
    non_striker varchar,
    bowler varchar,
    batsman_runs int,
    extra_runs int,
    total_runs int,
    is_wicket int,
    dismissal_kind varchar,
    player_dismissed varchar,
    fielder varchar,
    extras_type varchar,
    batting_team varchar,
    bowling_team varchar);
Note: Batting strike rate is the total runs scored by a batsman divided by the number of balls faced, multiplied by 100.
Criteria: A delivery whose extras_type is 'wides' is counted neither as a ball faced nor towards the batsman's runs. Only players who have faced more than 500 balls are taken.
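The query for this section does not appear in the text. A minimal sketch that is consistent with the note and criteria could look like the following. It reuses the wide-ball filter from the ALL ROUNDERS query and counts every remaining row as a ball faced; the use of IS DISTINCT FROM (so that a NULL extras_type is kept) is an assumption about how non-extras are stored.

```sql
-- Sketch only: batting strike rate = runs scored * 100 / balls faced.
SELECT batsman,
       SUM(batsman_runs) AS total_runs_scored,
       COUNT(*) AS balls_faced,
       SUM(batsman_runs) * 100.0 / COUNT(*) AS strike_rate
FROM ipl_ball
WHERE extras_type IS DISTINCT FROM 'wides'   -- wides are not balls faced
GROUP BY batsman
HAVING COUNT(*) > 500                        -- more than 500 balls faced
ORDER BY strike_rate DESC
LIMIT 10;
```

As a check against the first row of the results table, 1517 runs from 832 balls gives 1517 * 100 / 832 = 182.33.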
TABLE SHOWING PLAYERS WITH HIGHEST BATTING STRIKE RATES
batsman         total_runs_scored  balls_faced  strike_rate
AD Russell      1517               832          182.3317
SP Narine       892                543          164.2726
HH Pandya       1349               847          159.268
V Sehwag        2728               1755         155.4416
GJ Maxwell      1505               973          154.6763
RR Pant         2079               1368         151.9737
AB de Villiers  4849               3192         151.911
CH Gayle        4772               3179         150.1101
KA Pollard      3023               2017         149.8761
JC Buttler      1714               1146         149.5637
CHART SHOWING NUMBER OF RUNS SCORED AND BALLS FACED BY EACH BATSMAN
CHART SHOWING STRIKE RATE OF EACH BATSMAN
PLAYERS WITH GOOD BATTING AVERAGE
SELECT *
FROM (
    SELECT a.batsman,
           SUM(a.batsman_runs) AS total_runs_scored,
           SUM(a.is_wicket) AS dismissed,
           -- NULLIF avoids division by zero for batsmen who were never dismissed
           SUM(a.batsman_runs) * 1.0 / NULLIF(SUM(a.is_wicket), 0) AS average,
           COUNT(DISTINCT EXTRACT(YEAR FROM b.date)) AS seasons
    FROM ipl_ball AS a
    LEFT JOIN ipl_matches AS b ON a.id = b.id  -- assumes ipl_matches has columns id and date
    GROUP BY a.batsman
) AS c
WHERE c.seasons > 2
  AND c.dismissed > 0
ORDER BY c.average DESC
LIMIT 10;
Note: Average is the total runs scored divided by the number of times the batsman has been dismissed. Players who have never been dismissed are excluded.
Criteria: Only players who have played more than two seasons are taken.
TABLE SHOWING PLAYERS WITH GOOD BATTING AVERAGE
batsman         total_runs_scored  dismissed  average   seasons
Iqbal Abdulla   88                 1          88        8
KL Rahul        2647               62         42.69355  7
AB de Villiers  4849               114        42.53509  13
DA Warner       5254               126        41.69841  11
JP Duminy       2029               49         41.40816  8
CH Gayle        4772               116        41.13793  12
ML Hayden       1107               27         41        3
LMP Simmons     1079               27         39.96296  4
KS Williamson   1619               41         39.4878   6
OA Shah         506                13         38.92308  4
CHART SHOWING AVERAGE OF EACH BATSMAN
(Bar chart; x-axis: Batsman, y-axis: Average)
HARD HITTING PLAYERS
SELECT *
FROM (
    SELECT a.batsman,
           SUM(a.batsman_runs) AS total_runs_scored,
           SUM(CASE WHEN a.batsman_runs = 4 THEN 1 ELSE 0 END) AS fours,
           SUM(CASE WHEN a.batsman_runs = 6 THEN 1 ELSE 0 END) AS sixes,
           SUM(CASE WHEN a.batsman_runs IN (4, 6) THEN 1 ELSE 0 END) AS boundaries,
           SUM(CASE WHEN a.batsman_runs IN (4, 6) THEN a.batsman_runs ELSE 0 END) AS runs_by_boundaries,
           -- NULLIF avoids division by zero for batsmen who scored no runs
           SUM(CASE WHEN a.batsman_runs IN (4, 6) THEN a.batsman_runs ELSE 0 END) * 100.0
               / NULLIF(SUM(a.batsman_runs), 0) AS boundary_percentage,
           COUNT(DISTINCT EXTRACT(YEAR FROM b.date)) AS seasons
    FROM ipl_ball AS a
    LEFT JOIN ipl_matches AS b ON a.id = b.id  -- assumes ipl_matches has columns id and date
    GROUP BY a.batsman
) AS c
WHERE c.seasons > 2
ORDER BY c.boundary_percentage DESC NULLS LAST
LIMIT 10;
Note: Hard hitting is measured by boundary percentage, which is the runs scored in boundaries divided by the total runs scored, multiplied by 100.
Criteria: Players with the highest boundary percentage who have played more than 2 seasons are taken.
TABLE SHOWING HARD HITTING PLAYERS WITH MAXIMUM BOUNDARIES
(Bar chart; x-axis: Batsman, y-axis: Boundaries, series: fours, sixes)
CHART SHOWING BOUNDARY PERCENTAGE OF EACH BATSMAN
(Bar chart; x-axis: Batsman, y-axis: Boundary_percentage. Data labels in chart order:)
batsman           boundary_percentage
SP Narine         81.1659192825112
AD Russell        78.7079762689518
CH Gayle          76.0687342833193
CR Brathwaite     75.1381215469613
ST Jayasuriya     74.21875
BCJ Cutting       73.1092436974789
MJ McClenaghan    72.9411764705882
AC Gilchrist      72.8854519091348
Mujeeb Ur Rahman  72.7272727272727
MS Gony           72.7272727272727
BOWLERS WITH GOOD ECONOMY
SELECT bowler,
SUM(total_runs) AS runs_conceded,
ROUND(COUNT(ball)/6 + MOD(CAST(COUNT(ball) as decimal),6)/10,1) AS overs_bowled,
(SUM(total_runs)*1.0)/(COUNT(ball)/6.0) AS economy
FROM
ipl_ball
GROUP BY
bowler
HAVING
COUNT(ball)>500
ORDER BY
economy
LIMIT 10;
Note: Economy is the total runs conceded divided by the total overs bowled.
Criteria: Bowlers who have bowled more than 500 balls are taken.
TABLE SHOWING BOWLERS WITH GOOD ECONOMY
(Bar chart; x-axis: Bowler, y-axis: Economy. Data labels in chart order:)
bowler             economy
Rashid Khan        6.33422818791946
A Kumble           6.646998982706
M Muralitharan     6.67723525681674
DW Steyn           6.76977152899824
R Ashwin           6.7736699729486
SP Narine          6.81586402266288
DL Vettori         6.83312101910828
Washington Sundar  6.89090909090909
J Botha            6.92242595204513
R Tewatia          6.99148211243611
BOWLERS WITH GOOD BOWLING STRIKE RATE
SELECT bowler,
COUNT(ball) AS balls_bowled,
SUM(is_wicket) AS total_wicket_taken,
CAST((COUNT(ball)::decimal / NULLIF(SUM(is_wicket), 0)) AS decimal(10,2)) AS strike_rate
FROM ipl_ball
-- dismissals that are not credited to the bowler are excluded
WHERE dismissal_kind NOT IN ('run out', 'retired hurt', 'obstructing the field')
GROUP BY bowler
HAVING COUNT(ball) >= 500
-- DESC lists the highest strike rates (most balls per wicket) first
ORDER BY strike_rate DESC
LIMIT 10;
• Note: The strike rate of a bowler is the number of balls bowled divided by the total wickets taken.
• Criteria: Bowlers who have bowled at least 500 balls are taken.
TABLE SHOWING BOWLERS WITH GOOD STRIKE RATES
bowler    balls_bowled  total_wicket_taken  strike_rate
M Kartik  1174          31                  37.87
SK Raina  925           25                  37
(Bar chart; x-axis: Bowler, y-axis: Strike Rate. Bowlers shown: M Kartik, SK Raina, B Lee, NA Saini, TG Southee, CH Gayle, JP Duminy, S Nadeem, M Prasidh Krishna, AD Mathews)
ALL ROUNDERS
WITH BattingStats AS (
SELECT batsman,
COUNT(ball) AS balls_faced,
SUM(batsman_runs) AS total_runs
FROM ipl_ball
WHERE extras_type!='wides'
GROUP BY batsman
),
BowlingStats AS (
SELECT bowler,
COUNT(ball) AS total_balls_bowled,
SUM(is_wicket) AS wickets_taken
FROM ipl_ball
WHERE dismissal_kind not in('run out','retired hurt','obstructing the field')
GROUP BY bowler
),
AllRounders AS (
SELECT
b.batsman AS allrounder,
b.total_runs AS batting_runs,
b.balls_faced,
bl.wickets_taken AS total_wickets_taken,
bl.total_balls_bowled AS balls_bowled
FROM BattingStats b
JOIN BowlingStats bl ON b.batsman = bl.bowler)
SELECT allrounder,
(batting_runs * 100.0 / balls_faced) AS batting_strike_rate,
CAST((balls_bowled::decimal / NULLIF(total_wickets_taken, 0)) AS decimal(10,2)) AS bowling_strike_rate,
balls_faced,
balls_bowled
FROM AllRounders
WHERE balls_faced >= 500 AND balls_bowled >= 300
ORDER BY
batting_strike_rate DESC, bowling_strike_rate ASC
LIMIT 10;
Criteria: All-rounders are players with the best batting and bowling strike rates who have faced at least 500 balls and bowled at least 300 balls in the IPL so far.
TABLE SHOWING ALL ROUNDERS WITH GOOD BATTING AND BOWLING STRIKE RATE
allrounder  batting_strike_rate  bowling_strike_rate  balls_faced  balls_bowled
AD Russell  182.3317308          19.34                832          1180
(Bar chart; x-axis: all-rounder, series: balls_faced, balls_bowled. Players shown: AD Russell, SP Narine, HH Pandya, GJ Maxwell, CH Gayle, KA Pollard, YK Pathan, KH Pandya, JA Morkel, Harbhajan Singh)
CHART SHOWING BATTING AND BOWLING STRIKE RATES OF ALL ROUNDERS
(Bar chart; x-axis: all-rounder, series: batting_strike_rate, bowling_strike_rate. batting_strike_rate data labels in chart order:)
allrounder       batting_strike_rate
AD Russell       182.33173076923
SP Narine        164.27255985267
HH Pandya        159.26800472255
GJ Maxwell       154.676258992805
CH Gayle         150.110097514941
KA Pollard       149.876053544868
YK Pathan        142.9718875502
KH Pandya        142.450142450142
JA Morkel        141.982507288629
Harbhajan Singh  138.166666666666
WICKETKEEPERS
Given a list of wicketkeepers from which the two best must be chosen, the criteria are a good batting
strike rate and a high number of boundaries, because wicketkeepers contribute more runs to the team in
the middle overs. The player must have faced at least 400 balls in the IPL so far and must have played
more than two IPL seasons.
The player must also not have a poor record of missed catches.
From the list, two wicketkeepers can then be selected by considering both their strike rate and their
number of boundaries.
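The selection criteria above can be sketched as a query. The wicketkeepers table and its player_name column are hypothetical stand-ins for the supplied list, and the ipl_matches columns id and date are assumed. Missed catches are not recorded in ipl_ball, so that criterion would have to be checked separately.

```sql
-- Hypothetical input: one row per candidate wicketkeeper.
CREATE TABLE wicketkeepers (player_name varchar);

-- Sketch only: rank the listed wicketkeepers by strike rate, then boundaries.
SELECT a.batsman,
       SUM(a.batsman_runs) * 100.0 / COUNT(*) AS strike_rate,
       SUM(CASE WHEN a.batsman_runs IN (4, 6) THEN 1 ELSE 0 END) AS boundaries
FROM ipl_ball AS a
JOIN wicketkeepers AS w ON w.player_name = a.batsman
LEFT JOIN ipl_matches AS b ON a.id = b.id        -- join key assumed
WHERE a.extras_type IS DISTINCT FROM 'wides'     -- wides are not balls faced
GROUP BY a.batsman
HAVING COUNT(*) >= 400                           -- at least 400 balls faced
   AND COUNT(DISTINCT EXTRACT(YEAR FROM b.date)) > 2
ORDER BY strike_rate DESC, boundaries DESC
LIMIT 2;
```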
ADDITIONAL QUESTIONS
Q1. Get the count of cities that have hosted an IPL match.
City_count
33
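The query behind this count is not shown. One plausible form, assuming the host city is stored in a city column of ipl_matches (the column name is an assumption), is:

```sql
-- Sketch only: count each host city once.
SELECT COUNT(DISTINCT city) AS city_count
FROM ipl_matches;
```

COUNT(DISTINCT ...) ignores NULLs, so matches with no recorded city would not add to the count.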
Q2. Create table deliveries_v02 with all the columns of the table ‘deliveries’ and an
additional column ball_result containing values boundary, dot or other depending
on the total_run.
(boundary for >= 4, dot for 0 and other for any other number)
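The query for this question is not shown. A minimal sketch follows, assuming the source table is named deliveries as in the question and that its run total is stored in total_runs, as in ipl_ball:

```sql
-- Sketch only: copy every column and label each delivery by its run total.
CREATE TABLE deliveries_v02 AS
SELECT d.*,
       CASE
           WHEN d.total_runs >= 4 THEN 'boundary'
           WHEN d.total_runs = 0 THEN 'dot'
           ELSE 'other'
       END AS ball_result
FROM deliveries AS d;
```

The CASE branches are evaluated in order, so any total from 1 to 3 falls through to 'other'.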
number_of_boundaries  number_of_dots
30870                 77637
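Counts of this kind could be obtained from the ball_result column of deliveries_v02. The query below is a sketch under that assumption, not the original query, which is not shown:

```sql
-- Sketch only: count boundary and dot deliveries in a single pass.
SELECT SUM(CASE WHEN ball_result = 'boundary' THEN 1 ELSE 0 END) AS number_of_boundaries,
       SUM(CASE WHEN ball_result = 'dot' THEN 1 ELSE 0 END) AS number_of_dots
FROM deliveries_v02;
```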
Q4. Write a query to fetch the total number of boundaries scored by each team from the deliveries_v02 table and
order it in descending order of the number of boundaries scored.
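The query for Q4 is not shown in the text. A minimal sketch, assuming deliveries_v02 keeps the batting_team column of ipl_ball and carries the ball_result column described in Q2:

```sql
-- Sketch only: boundaries scored by each batting team, highest first.
SELECT batting_team,
       COUNT(*) AS number_of_boundaries
FROM deliveries_v02
WHERE ball_result = 'boundary'
GROUP BY batting_team
ORDER BY number_of_boundaries DESC;
```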
-- Total dismissals by dismissal kind, excluding deliveries without a dismissal ('NA')
SELECT dismissal_kind,
SUM(is_wicket) AS total_dismissals
FROM deliveries_v02
WHERE
dismissal_kind != 'NA'
GROUP BY
dismissal_kind;
dismissal_kind  total_dismissals
bowled          1700
caught          5743
hit wicket      12
lbw             571
retired hurt    11
stumped         294
CHART SHOWING THE TOTAL NUMBER OF DISMISSALS BY DISMISSAL KINDS
(Bar chart; categories: dismissal_kind, values: total_dismissals)
SELECT bowler,
SUM(extra_runs) AS extra_runs
FROM ipl_ball
GROUP BY bowler
ORDER BY SUM(extra_runs) DESC
LIMIT 5;
bowler      extra_runs
SL Malinga  293
P Kumar     236
UT Yadav    226
DJ Bravo    210
B Kumar     201
CHART SHOWING EXTRA RUNS CONCEDED BY BOWLERS
(Bar chart; x-axis: Bowler, y-axis: Extra Runs)
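Q8. Write a query to create a table named deliveries_v03 with all the columns of the deliveries_v02 table
and two additional columns (named venue and match_date) holding venue and date from the table matches.
CREATE TABLE deliveries_v03 AS
(SELECT a.*, b.venue, b.date AS match_date
FROM deliveries_v02 AS a
LEFT JOIN matches AS b ON a.id = b.id);  -- join key assumed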
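SELECT venue,
SUM(total_runs) AS total_runs_scored
FROM deliveries_v03
GROUP BY venue;
SELECT EXTRACT(YEAR FROM match_date) AS year,
SUM(total_runs) AS year_wise_total_runs
FROM deliveries_v03
WHERE venue = 'Eden Gardens'
GROUP BY 1
ORDER BY year_wise_total_runs DESC;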
year  year_wise_total_runs
2018  2885
2019  2651
2015  2386
2013  2304
2017  2194
2010  2167
2016  2073
2012  2012
2011  1854
2008  1843
2014  1289
CHART SHOWING YEAR WISE RUNS AT EDEN GARDENS
(Bar chart; x-axis: Year, y-axis: Total Runs)