365 Data Science - SQL BASICS - SM
1. SQL THEORY
TABLE OF CONTENTS
• Data Definition Language (DDL)
• SQL Keywords
• Data Manipulation Language (DML)
• Data Control Language (DCL)
• Transaction Control Language (TCL)
• SQL Syntax
LET’S BREAK THINGS DOWN
Data Definition Language
[Diagram: table "sales" with the column purchase_number]
the table name can coincide with the name assigned to the database
the ALTER statement
used when altering existing objects
- ADD
- REMOVE
- RENAME
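A minimal sketch of ALTER in action. It assumes an in-memory SQLite database driven from Python's sqlite3 module, which is not necessarily the server used in the course; the new table name sales_archive is hypothetical. Only ADD and RENAME are shown, because support for removing columns varies between database systems.

```python
import sqlite3

# Assumption: in-memory SQLite database; the table mirrors the "sales" example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (purchase_number INTEGER)")

# ADD: give the existing table a second column.
conn.execute("ALTER TABLE sales ADD COLUMN date_of_purchase TEXT")

# RENAME: change the name of the existing table (hypothetical new name).
conn.execute("ALTER TABLE sales RENAME TO sales_archive")

# List the column names of the altered table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(sales_archive)")]
print(columns)
```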
[Diagram: table "sales" with the column purchase_number, then with the columns purchase_number and date_of_purchase]
the DROP statement
used for deleting a database object
[Diagram: tables "customers" and "customer_data", each with the columns customer_id and first_name]
the TRUNCATE statement
instead of deleting an entire table through DROP, we can also remove its data and continue to have the table as an object in the database
[Diagram: table "customers" with the columns customer_id and first_name]
Keywords
[Diagram: a table named "alter" with the column purchase_number; the keywords ADD and ALTER; the table "sales" with the columns purchase_number and date_of_purchase]
Data Manipulation Language
Why are we going to need just a piece of the table?
sales
purchase_number  date_of_purchase
1                2017-10-11
2                2017-10-27
the UPDATE statement
allows you to modify existing data in your tables
UPDATE sales
SET date_of_purchase = '2017-12-12'
WHERE purchase_number = 1;
sales (after the update)
purchase_number  date_of_purchase
1                2017-12-12
2                2017-10-27
the DELETE statement
- functions similarly to the TRUNCATE statement
TRUNCATE vs. DELETE
- TRUNCATE removes all the data of a table, while with DELETE you can specify precisely what you would like to be removed
Data Manipulation Language (DML)
- SELECT… FROM…
- INSERT INTO… VALUES…
- UPDATE… SET… WHERE…
- DELETE FROM… WHERE…
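The four DML statements listed above can be tried end to end with a short sketch. It assumes an in-memory SQLite database accessed through Python's sqlite3 module (an assumption made for runnability, not the course setup) and reuses the "sales" table from the earlier examples.

```python
import sqlite3

# Assumption: in-memory SQLite database with the "sales" table from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (purchase_number INTEGER, date_of_purchase TEXT)")

# INSERT INTO… VALUES…
conn.execute("INSERT INTO sales VALUES (1, '2017-10-11')")
conn.execute("INSERT INTO sales VALUES (2, '2017-10-27')")

# UPDATE… SET… WHERE…
conn.execute("UPDATE sales SET date_of_purchase = '2017-12-12' WHERE purchase_number = 1")

# DELETE FROM… WHERE…
conn.execute("DELETE FROM sales WHERE purchase_number = 2")

# SELECT… FROM…
rows = conn.execute("SELECT purchase_number, date_of_purchase FROM sales").fetchall()
print(rows)
```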
DATA CONTROL LANGUAGE (DCL)
The GRANT statement
gives (or grants) certain permissions to users
one can grant a specific type of permission, like complete or partial access
big companies and corporations don't use this type of server; their databases reside on external, more powerful servers
Transaction Control Language
DB administrator
UPDATE customers
SET last_name = 'Johnson'
WHERE customer_id = 4;
Problem: users
UPDATE customers
SET last_name = 'Johnson'
WHERE customer_id = 4;
COMMIT;
the COMMIT statement
- saves the transaction in the database
- changes cannot be undone
- committed states can accrue
UPDATE customers
SET last_name = 'Johnson'
WHERE customer_id = 4;
COMMIT;
ROLLBACK;
the WHERE clause
- allows us to narrow the output we would like to extract from our data
AND
AND binds SQL to meet both conditions listed in the WHERE clause simultaneously
OR
used with conditions set on the same column
OPERATOR PRECEDENCE
logical operator precedence
an SQL rule stating that in the execution of the query, the operator AND is applied first, while the operator OR is applied second
AND > OR
regardless of the order in which you use these operators, SQL will always start by reading the conditions around the AND operator
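The precedence rule is easiest to see by running the same conditions with and without parentheses. The sketch below assumes an in-memory SQLite database used through Python's sqlite3 module; the employees columns and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: hypothetical employees table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Denis", "M"), ("Denis", "F"), ("Smith", "M"), ("Smith", "F")],
)

# AND is applied first, so this reads as:
# (last_name = 'Denis' AND gender = 'M') OR gender = 'F'
no_parentheses = conn.execute(
    "SELECT last_name, gender FROM employees "
    "WHERE last_name = 'Denis' AND gender = 'M' OR gender = 'F' "
    "ORDER BY last_name, gender"
).fetchall()

# Parentheses force the OR to be evaluated before the AND.
with_parentheses = conn.execute(
    "SELECT last_name, gender FROM employees "
    "WHERE last_name = 'Denis' AND (gender = 'M' OR gender = 'F') "
    "ORDER BY last_name, gender"
).fetchall()

print(no_parentheses)
print(with_parentheses)
```

Without parentheses, every female employee is returned regardless of last name; with parentheses, only the rows for Denis remain.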
WILDCARD CHARACTERS
wildcard characters
% _ *
you would need a wildcard character whenever you wished to put "anything" in its place
% - a substitute for a sequence of characters
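A small sketch of wildcard matching with LIKE, assuming an in-memory SQLite database accessed through Python's sqlite3 module; the first names are made-up sample data. It also uses _, which stands for exactly one character.

```python
import sqlite3

# Assumption: hypothetical employees table with made-up first names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [("Mark",), ("Martin",), ("Maria",), ("Omar",)],
)

def names_like(pattern):
    """Return the first names matching the LIKE pattern, sorted alphabetically."""
    query = "SELECT first_name FROM employees WHERE first_name LIKE ? ORDER BY first_name"
    return [row[0] for row in conn.execute(query, (pattern,))]

starts_with_mar = names_like("Mar%")   # 'Mar' followed by any sequence of characters
mar_plus_one = names_like("Mar_")      # 'Mar' followed by exactly one character
contains_ar = names_like("%ar%")       # 'ar' anywhere in the name

print(starts_with_mar, mar_plus_one, contains_ar)
```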
BETWEEN… AND…
SELECT
    *
FROM
    employees
WHERE
    hire_date BETWEEN '1990-01-01' AND '2000-01-01';
SELECT
    *
FROM
    employees
WHERE
    hire_date NOT BETWEEN '1990-01-01' AND '2000-01-01';
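The two queries above can be checked against a handful of rows. This sketch assumes an in-memory SQLite database used through Python's sqlite3 module, with dates stored as ISO-formatted text; the hire dates are made-up sample data chosen to sit on and around the interval's end points.

```python
import sqlite3

# Assumption: hypothetical employees table; dates stored as ISO text strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_no INTEGER, hire_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        (1, "1989-12-31"),
        (2, "1990-01-01"),
        (3, "1995-06-15"),
        (4, "2000-01-01"),
        (5, "2000-01-02"),
    ],
)

# BETWEEN… AND… includes both end points of the interval.
inside = [row[0] for row in conn.execute(
    "SELECT emp_no FROM employees "
    "WHERE hire_date BETWEEN '1990-01-01' AND '2000-01-01' ORDER BY emp_no"
)]

# NOT BETWEEN… AND… returns everything outside the interval.
outside = [row[0] for row in conn.execute(
    "SELECT emp_no FROM employees "
    "WHERE hire_date NOT BETWEEN '1990-01-01' AND '2000-01-01' ORDER BY emp_no"
)]

print(inside, outside)
```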
SQL
= equal to
<>, != different from
SELECT DISTINCT
the SELECT statement
can retrieve rows from a designated column, given some criteria
SELECT DISTINCT
selects all distinct, different data values
aggregate functions
SUM()
sums all the non-null values in a column
MIN()
returns the minimum value from the entire list
MAX()
returns the maximum value from the entire list
AVG()
calculates the average of all non-null values belonging to a certain column of a table
Introduction to Aggregate Functions
COUNT()
counts the number of non-null records in a field
SELECT COUNT(column_name)
FROM table_name;
the parentheses after COUNT() must start right after the keyword, not after a whitespace
COUNT(DISTINCT column_name)
counts the number of different non-null values in a field
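The way the aggregate functions treat null values can be verified with a short sketch. It assumes an in-memory SQLite database used through Python's sqlite3 module; the salaries table and its values are hypothetical, with one salary left as NULL on purpose.

```python
import sqlite3

# Assumption: hypothetical salaries table; one salary is NULL on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (emp_no INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [(1, 1000), (2, 2000), (3, None), (4, 2000)],
)

count_rows, count_salary, count_distinct, total, lowest, highest, average = conn.execute(
    "SELECT COUNT(*), COUNT(salary), COUNT(DISTINCT salary), "
    "SUM(salary), MIN(salary), MAX(salary), AVG(salary) "
    "FROM salaries"
).fetchone()

print(count_rows, count_salary, count_distinct, total, lowest, highest, average)
```

COUNT(*) counts all four rows, while COUNT(salary) skips the NULL; AVG() divides the sum by the three non-null values only.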
GROUP BY
- GROUP BY must be placed immediately after the WHERE conditions, if any, and just before the ORDER BY clause
SELECT column_name(s)
FROM table_name
WHERE conditions
GROUP BY column_name(s)
ORDER BY column_name(s);
in most cases, when you need an aggregate function, you must add a GROUP BY clause in your query, too
Always include the field you have grouped your results by in the SELECT statement!
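A minimal sketch of GROUP BY combined with an aggregate function, assuming an in-memory SQLite database used through Python's sqlite3 module; the first names are made-up sample data and names_count is a hypothetical alias. The grouped field appears in the SELECT list, as advised above.

```python
import sqlite3

# Assumption: hypothetical employees table with made-up first names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [("Mark",), ("Mark",), ("Maria",), ("Omar",), ("Maria",), ("Mark",)],
)

# One output row per distinct first name, with the number of rows in each group.
name_counts = conn.execute(
    "SELECT first_name, COUNT(first_name) AS names_count "
    "FROM employees "
    "GROUP BY first_name "
    "ORDER BY first_name"
).fetchall()

print(name_counts)
```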
HAVING
refines the output by excluding records that do not satisfy a certain condition
SELECT column_name(s)
FROM table_name
WHERE conditions
GROUP BY column_name(s)
HAVING conditions
ORDER BY column_name(s);
after HAVING, you can have a condition with an aggregate function, while WHERE cannot use aggregate functions within its conditions
WHERE VS HAVING
WHERE
allows us to set conditions that refer to subsets of individual rows
order of application: WHERE, then re-organizing the output into groups (GROUP BY), then HAVING, then ORDER BY…
HAVING
- avoid mixing an aggregated and a non-aggregated condition in the HAVING clause; conditions on individual rows belong in WHERE
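The split between WHERE and HAVING can be sketched on a tiny data set. The example assumes an in-memory SQLite database used through Python's sqlite3 module; the table contents and the alias names_count are hypothetical. WHERE filters individual rows before grouping, and HAVING then filters the groups by an aggregate.

```python
import sqlite3

# Assumption: hypothetical employees table with made-up names and hire dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, hire_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Mark", "1990-05-01"),
        ("Mark", "2000-03-01"),
        ("Maria", "2001-07-01"),
        ("Omar", "2002-09-01"),
        ("Maria", "2003-11-01"),
        ("Mark", "2004-01-01"),
    ],
)

repeated_names = conn.execute(
    "SELECT first_name, COUNT(first_name) AS names_count "
    "FROM employees "
    "WHERE hire_date >= '2000-01-01' "   # row-level condition, applied first
    "GROUP BY first_name "
    "HAVING COUNT(first_name) > 1 "      # aggregate condition, applied per group
    "ORDER BY first_name"
).fetchall()

print(repeated_names)
```

The 1990 row for Mark is discarded by WHERE before the groups are formed, so Mark is counted twice rather than three times, and Omar's single-row group is dropped by HAVING.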
LIMIT
SELECT column_name(s)
FROM table_name
WHERE conditions
GROUP BY column_name(s)
HAVING conditions
ORDER BY column_name(s)
LIMIT number;
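A short sketch of LIMIT at the end of a query, assuming an in-memory SQLite database used through Python's sqlite3 module; the salaries table and its values are hypothetical. Combined with ORDER BY, it returns only the top rows.

```python
import sqlite3

# Assumption: hypothetical salaries table with made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (emp_no INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [(1, 1000), (2, 3000), (3, 2000), (4, 4000)],
)

# Sort from highest to lowest salary and keep only the first two rows.
top_two = conn.execute(
    "SELECT emp_no, salary FROM salaries ORDER BY salary DESC LIMIT 2"
).fetchall()

print(top_two)
```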
THE SQL INSERT STATEMENT
TCL's COMMIT and ROLLBACK
[Diagram: successive COMMIT; statements numbered 1, 2, … 10, followed by a second sequence numbered 1, 2, … 20]
- ROLLBACK will have an effect on the last execution you have performed: it reverts the data to the state saved by the last COMMIT
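How COMMIT and ROLLBACK interact can be sketched as follows. The example assumes an in-memory SQLite database used through Python's sqlite3 module, where conn.commit() and conn.rollback() play the role of the COMMIT; and ROLLBACK; statements; the last names 'Smith' and 'Brown' are hypothetical.

```python
import sqlite3

# Assumption: hypothetical customers table; 'Smith' and 'Brown' are made-up names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, last_name TEXT)")
conn.execute("INSERT INTO customers VALUES (4, 'Smith')")
conn.commit()  # first saved state

conn.execute("UPDATE customers SET last_name = 'Johnson' WHERE customer_id = 4")
conn.commit()  # second saved state: this change can no longer be undone

conn.execute("UPDATE customers SET last_name = 'Brown' WHERE customer_id = 4")
conn.rollback()  # reverts only the uncommitted change, back to the last COMMIT

last_name = conn.execute(
    "SELECT last_name FROM customers WHERE customer_id = 4"
).fetchone()[0]
print(last_name)
```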
THE SQL UPDATE STATEMENT
the UPDATE statement
used to update the values of existing records in a table
UPDATE table_name
SET column_1 = value_1, column_2 = value_2 …
WHERE conditions;
- if you don't provide a WHERE condition, all rows of the table will be updated
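The effect of omitting the WHERE condition can be seen by counting the affected rows. This sketch assumes an in-memory SQLite database used through Python's sqlite3 module; the customers rows and the value 'Unknown' are made-up sample data.

```python
import sqlite3

# Assumption: hypothetical customers table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, last_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Adams"), (2, "Baker"), (4, "Clark")],
)

# With a WHERE condition, only the matching row is updated.
with_where = conn.execute(
    "UPDATE customers SET last_name = 'Johnson' WHERE customer_id = 4"
).rowcount

# Without a WHERE condition, every row of the table is updated.
without_where = conn.execute(
    "UPDATE customers SET last_name = 'Unknown'"
).rowcount

print(with_where, without_where)
```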
THE SQL DELETE STATEMENT
the DELETE statement
removes records from a database
DROP vs TRUNCATE vs DELETE
DROP
[Diagram: column_1 (values 1, 2, 3, 4, … 10) + indexes + constraints]
- you won't be able to roll back to its initial state, or to the last COMMIT statement
use DROP TABLE only when you are sure you aren't going to use the table in question anymore
TRUNCATE
TRUNCATE ~ DELETE without WHERE
when truncating, auto-increment values will be reset
[Diagram: column_1 holds 1, 2, 3, 4, … 10; after TRUNCATE, newly inserted records are numbered 1, 2, 3, 4, … 10 again, not 11, 12, …]
DELETE
removes records row by row
TRUNCATE vs DELETE without WHERE
- auto-increment values are not reset with DELETE
[Diagram: column_1 holds 1, 2, 3, 4, … 10; after DELETE, newly inserted records are numbered 11, 12, 13, 14, … 20]
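The auto-increment behaviour can be reproduced in a small sketch. It assumes an in-memory SQLite database used through Python's sqlite3 module; the table demo and the helper insert_ten are hypothetical. SQLite has no TRUNCATE statement, so the reset is only emulated here by clearing SQLite's internal sqlite_sequence bookkeeping table; on servers that do support TRUNCATE, the statement performs the reset by itself.

```python
import sqlite3

# Assumption: hypothetical table with an auto-incrementing column_1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo (column_1 INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT)"
)

def insert_ten():
    """Insert ten rows and return the auto-generated column_1 values."""
    ids = []
    for _ in range(10):
        cursor = conn.execute("INSERT INTO demo (note) VALUES ('x')")
        ids.append(cursor.lastrowid)
    return ids

first_ids = insert_ten()

# DELETE without WHERE removes every row but keeps the counter.
conn.execute("DELETE FROM demo")
ids_after_delete = insert_ten()

# Emulated TRUNCATE (SQLite-specific): remove the rows and reset the counter.
conn.execute("DELETE FROM demo")
conn.execute("DELETE FROM sqlite_sequence WHERE name = 'demo'")
ids_after_reset = insert_ten()

print(first_ids, ids_after_delete, ids_after_reset)
```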