SQL Complete Notes
Author: Bhagyalakshmi C
Data Enthusiast | MSc Data Science & Analytics
1. What is a database?
- Any collection of related information that can be stored in different ways.
- E.g., Phonebook, shopping list, to-do List, your 5 best friends.
2. Database Management System (DBMS): A software application that
creates, maintains, updates, and deletes information in the actual database.
- A special software that helps users create and maintain a database.
- Makes it easy to manage large amounts of information.
- Handles security
- Backups
- Importing/exporting data
3. The four main operations that we can do in a database:
C. R. U. D
Create, read/retrieve, update, delete
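The four CRUD operations can be sketched as follows. This is a minimal example that assumes Python's built-in sqlite3 module and a made-up `friend` table; other database systems use the same four kinds of statement.

```python
import sqlite3

# In-memory SQLite database standing in for any DBMS (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friend (id INTEGER PRIMARY KEY, name TEXT)")

# Create: add new rows.
conn.execute("INSERT INTO friend (id, name) VALUES (1, 'Asha'), (2, 'Ravi')")

# Read: fetch the rows back.
after_create = conn.execute("SELECT id, name FROM friend ORDER BY id").fetchall()

# Update: change an existing row.
conn.execute("UPDATE friend SET name = 'Ravi K' WHERE id = 2")

# Delete: remove a row.
conn.execute("DELETE FROM friend WHERE id = 1")

after_changes = conn.execute("SELECT id, name FROM friend ORDER BY id").fetchall()
print(after_create)
print(after_changes)
```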
4. Two types of Databases:
a. Relational Database (SQL):
- Organise data in one or more tables.
- Each table has rows & columns.
- A unique key identifies each row.
- E.g., MySQL, Oracle, PostgreSQL
b. Non-relational database (NoSQL):
- Organises data in anything other than a traditional table.
- Documents (JSON, XML, etc.)
- Graphs
- Flexible tables
- E.g., MongoDB, DynamoDB, Apache Cassandra.
5. Database Queries:
- Queries are requests made to the DBMS for specific information.
- As the database structure becomes increasingly complex, it becomes more
challenging to retrieve the specific pieces of information we want.
- E.g., a Google search is a query.
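Core concepts of relational databases:
1. Primary key: A column (or set of columns) that uniquely identifies each row, even
if all the other information is the same.
2. Foreign key: A column that stores the primary key of a row in another table (or in
the same table).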
- Allows us to link up or define relationships between tables.
3. When to Use Composite Keys?
✅ When a single column is not enough to ensure uniqueness.
✅ When managing many-to-many relationships.
✅ When business logic requires multiple factors to identify uniqueness.
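A composite key for a many-to-many relationship might look like the sketch below. The `enrolment` table and its rows are made up for illustration, and sqlite3 is used only so the example runs as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical enrolment table: a student can take many courses and a course
# has many students, so neither column is unique on its own.
conn.execute("""
    CREATE TABLE enrolment (
        student_id INT,
        course_id  INT,
        grade      VARCHAR(2),
        PRIMARY KEY (student_id, course_id)
    )
""")
conn.execute("INSERT INTO enrolment VALUES (1, 101, 'A')")
conn.execute("INSERT INTO enrolment VALUES (1, 102, 'B')")  # same student, new course: allowed
conn.execute("INSERT INTO enrolment VALUES (2, 101, 'A')")  # same course, new student: allowed

try:
    conn.execute("INSERT INTO enrolment VALUES (1, 101, 'C')")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrolment").fetchone()[0]
print(duplicate_rejected, row_count)
```

Only the combination (student_id, course_id) has to be unique, so the fourth insert is the one that fails.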
SQL Basics:
- RDBMS: It's a software application that can be used to create and maintain a
relational database.
- SQL is a language. Used for interacting with RDBMS.
- You can use SQL to get the RDBMS to do things for you.
- Create, retrieve, update and delete data
- Create and manage the database.
- Design & create database tables.
- Perform administration tasks(security, import/export, etc).
SQL is a hybrid language: four types of language combined into one.
1. Data Query Language (DQL)
2. Data Definition Language (DDL)
3. Data Control Language (DCL)
4. Data Manipulation Language (DML)
What’s Schema?
- Schemas help group related tables together, making it easier to manage and secure them.
- Overall layout of the Database.
- Eg: What tables are going to be in the db, what columns those tables are going to
have, and the data types that those columns are going to be able to store.
- We can use SQL to define data in different database schemas.
Queries:
- A set of instructions given to the RDBMS (written in SQL) that tell the RDBMS what
information you want it to retrieve for you.
- Tons of data in the DB.
- Often hidden in a complex schema.
- The goal is to only get the data you need.
Core SQL Data Types:
1. INT - Whole number
2. DECIMAL(M, N) - Decimal number. M: total number of digits. N: number of digits after the decimal point
3. VARCHAR(1) - String of text with a maximum length of 1
4. BLOB - Binary large object, stores large binary data
5. DATE - yyyy-mm-dd
6. TIMESTAMP - yyyy-mm-dd HH:MM:SS
1st step:
1. Creating a table:
CREATE TABLE student(
student_id INT PRIMARY KEY,
name VARCHAR(20),
major VARCHAR(20)
);
2. Describe the table:
DESCRIBE student;
3. Delete the table:
DROP TABLE student;
4. Modify (add a column):
ALTER TABLE student ADD gpa DECIMAL(3,2);
5. Drop the added column:
ALTER TABLE student DROP COLUMN gpa;
II. Inserting Data into a DB Table:
INSERT INTO student VALUES (1, 'Jack', 'Sociology');
Note: If we don't want to supply the major, we can name the columns we are
inserting into:
INSERT INTO student(student_id, name) VALUES (2, 'Jack');
KEYWORDS:
1. NOT NULL: The column can't be left empty, e.g. the student's name
2. UNIQUE: Every value in the column must be different
3. DEFAULT: The value used when none is given, e.g. DEFAULT 'undecided'
4. AUTO_INCREMENT: Automatically adds the value instead of manually saying 1,2,3..
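The keywords above can be tried together in one table. This is a sketch under assumptions: it runs on SQLite, where the auto-numbering column is written INTEGER PRIMARY KEY AUTOINCREMENT (MySQL writes INT AUTO_INCREMENT), and the `email` column is a hypothetical addition used to show UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- MySQL: INT AUTO_INCREMENT
        name       VARCHAR(20) NOT NULL,
        email      VARCHAR(40) UNIQUE,                 -- hypothetical column
        major      VARCHAR(20) DEFAULT 'undecided'
    )
""")
# No student_id and no major supplied: both are filled in automatically.
conn.execute("INSERT INTO student (name, email) VALUES ('Jack', 'jack@example.com')")
conn.execute("INSERT INTO student (name, email) VALUES ('Kate', 'kate@example.com')")
rows = conn.execute(
    "SELECT student_id, name, major FROM student ORDER BY student_id").fetchall()

def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Missing name breaks NOT NULL; repeated email breaks UNIQUE.
null_name_rejected = is_rejected("INSERT INTO student (email) VALUES ('x@example.com')")
duplicate_email_rejected = is_rejected(
    "INSERT INTO student (name, email) VALUES ('Tom', 'jack@example.com')")
print(rows, null_name_rejected, duplicate_email_rejected)
```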
III. Update & Delete:
E.g., I want the major to be 'Bio' instead of 'Biology':
SELECT * FROM student;
UPDATE student
SET major = 'Bio'
WHERE major = 'Biology';
Other comparison ops:
= : equals
<> : not equals
<, >, <=, >= : less than, greater than, less than or equal to, greater than or equal to
DELETE:
DELETE FROM student; -- removes every row in the table
DELETE FROM student
WHERE name = 'Tom' AND major = 'undecided'; -- removes only the matching rows
IV. Basic Queries: Getting only the information that meets the given
conditions, instead of the whole (possibly huge) table.
ORDER BY: Sorts the result. ASC (ascending) is the default; DESC is descending.
SELECT * FROM student
ORDER BY student_id DESC;
LIMIT: Limits the number of rows returned, e.g. LIMIT 2.
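ORDER BY and LIMIT are often combined to get the "top N" rows. A small sketch with made-up sample rows, run through sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INT PRIMARY KEY, name VARCHAR(20), major VARCHAR(20))")
# Made-up sample rows for the demo.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    (1, "Jack", "Biology"), (2, "Kate", "Sociology"),
    (3, "Claire", "Chemistry"), (4, "Mike", "Computer Science"),
])

# Sort by student_id, highest first, then keep only the first two rows.
top_two = conn.execute(
    "SELECT student_id, name FROM student ORDER BY student_id DESC LIMIT 2").fetchall()
print(top_two)
```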
V. FILTERING:
Filtering based on the condition or criteria.
SELECT * FROM student
WHERE student_id < 3 AND name <> 'Jack';
IN KEYWORD:
SELECT * FROM student
WHERE major IN ('Biology', 'Chemistry') AND student_id < 2;
Advanced topics
I. Creating a company database:
CREATE TABLE employee (
emp_id INT PRIMARY KEY,
first_name VARCHAR(40),
last_name VARCHAR(40),
birth_day DATE,
sex VARCHAR(1),
salary INT,
super_id INT,
branch_id INT
);
Creating a branch table:
CREATE TABLE branch (
branch_id INT PRIMARY KEY,
branch_name VARCHAR(40),
mgr_id INT,
mgr_start_date DATE,
FOREIGN KEY(mgr_id) REFERENCES employee(emp_id) ON DELETE SET NULL
);
More Basic Queries:
1. Find all the employees
SELECT * FROM employee;
2. Find all the clients
SELECT * FROM client;
3. Find all employees ordered by salary
SELECT * FROM employee
ORDER BY salary DESC; -- or ASC for lowest first
4. Find all employees ordered by sex, then name
SELECT * FROM employee
ORDER BY sex, first_name, last_name;
5. Find the first 5 employees in the table
SELECT * FROM employee
LIMIT 5;
6. Find the first & last names of all employees
SELECT first_name, last_name FROM employee;
7. Find the forenames and surnames of all employees
SELECT first_name AS forename, last_name AS surname
FROM employee;
8. Find all the genders in the employee table
SELECT DISTINCT sex
FROM employee;
DISTINCT: Returns each different value only once, which shows what categories there are.
II. FUNCTIONS:
SQL functions are predefined operations that help manipulate data, perform
calculations, and return specific results. They make SQL queries shorter and
cleaner.
I. COUNT: Counts how many non-NULL values there are in a column.
E.g.: Find the number of employees
SELECT COUNT(emp_id)
FROM employee;
2. Find the number of female employees born after 1970
SELECT COUNT(emp_id)
FROM employee
WHERE sex = 'F' AND birth_day > '1970-01-01';
II. AVG: Finds the average of the values in a column.
E.g.: Find the average salary of all the male employees
SELECT AVG(salary)
FROM employee
WHERE sex = 'M';
III. SUM: Adds up all the values in the columns
E.g.: Find the sum of the salary from the employee table
SELECT SUM(salary)
FROM employee;
III. AGGREGATION:
✅ Summarises large datasets quickly.
✅ Makes reports & analytics easier.
✅ Finds insights in grouped data (like total sales per region).
1. Select how many males and females there are
SELECT COUNT(sex), sex
FROM employee
GROUP BY sex;
2. Find the total sales of each salesman
SELECT SUM(total_sales), emp_id
FROM works_with
GROUP BY emp_id;
IV. WILDCARDS:
Grab data that matches a specific pattern.
The LIKE keyword is used in SQL to search for patterns in text data.
🔹 Think of it like searching for files on your computer using wildcards ( * or ?).
How Does It Work?
● % (Percentage Sign) → Matches any number of characters.
● _ (Underscore) → Matches exactly one character.
E.g.: Find any clients whose name ends in LLC
SELECT * FROM client
WHERE client_name LIKE '%LLC';
2. Find any employee born in October
SELECT * FROM employee
WHERE birth_day LIKE '____-10%';
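The two wildcards can be compared side by side. This sketch uses made-up employee rows and sqlite3; dates are stored as yyyy-mm-dd text here, so the month starts after four characters and a hyphen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INT PRIMARY KEY, first_name VARCHAR(40), birth_day DATE)")
# Made-up sample rows for the demo.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Jan", "1961-05-11"), (2, "Josh", "1969-09-05"),
    (3, "Angela", "1971-10-25"), (4, "Jo", "1980-10-03"),
])

def names(pattern_sql):
    """Run a query returning first names and flatten the result to a list."""
    return [row[0] for row in conn.execute(pattern_sql)]

# % matches any number of characters: every name starting with J.
starts_with_j = names(
    "SELECT first_name FROM employee WHERE first_name LIKE 'J%' ORDER BY emp_id")
# _ matches exactly one character: J followed by exactly two characters.
j_plus_two = names(
    "SELECT first_name FROM employee WHERE first_name LIKE 'J__' ORDER BY emp_id")
# Four characters for the year, a hyphen, then 10 for October.
born_in_october = names(
    "SELECT first_name FROM employee WHERE birth_day LIKE '____-10%' ORDER BY emp_id")
print(starts_with_j, j_plus_two, born_in_october)
```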
V. UNION OPERATOR:
- Combines the results of multiple SELECT statements into one result set.
- Each SELECT must return the same number of columns.
- Corresponding columns should have similar data types.
E.g.: Find a list of employees and branch names
SELECT first_name
FROM employee
UNION
SELECT branch_name
FROM branch;
2. Find the list of all money spent or earned by the company
SELECT salary
FROM employee
UNION
SELECT total_sales
FROM works_with;
VI. JOINS:
Note: FULL OUTER JOIN can't be used in MySQL.
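A short sketch of the most common joins, using cut-down versions of the employee and branch tables with made-up rows. The last query is a common workaround (an assumption here, not something the notes cover) for databases without FULL OUTER JOIN: two LEFT JOINs combined with UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_id INT PRIMARY KEY, branch_name VARCHAR(40))")
conn.execute(
    "CREATE TABLE employee (emp_id INT PRIMARY KEY, first_name VARCHAR(40), branch_id INT)")
conn.executemany("INSERT INTO branch VALUES (?, ?)",
                 [(1, "Corporate"), (2, "Scranton"), (3, "Buffalo")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(100, "David", 1), (102, "Michael", 2), (109, "Sam", None)])

# INNER JOIN: only rows that have a match in both tables.
inner = conn.execute("""
    SELECT employee.first_name, branch.branch_name
    FROM employee
    JOIN branch ON employee.branch_id = branch.branch_id
    ORDER BY employee.emp_id
""").fetchall()

# LEFT JOIN: every employee, with NULL where no branch matches.
left = conn.execute("""
    SELECT employee.first_name, branch.branch_name
    FROM employee
    LEFT JOIN branch ON employee.branch_id = branch.branch_id
    ORDER BY employee.emp_id
""").fetchall()

# Emulated FULL OUTER JOIN: unmatched rows from both sides are kept.
full = conn.execute("""
    SELECT employee.first_name, branch.branch_name
    FROM employee LEFT JOIN branch ON employee.branch_id = branch.branch_id
    UNION
    SELECT employee.first_name, branch.branch_name
    FROM branch LEFT JOIN employee ON employee.branch_id = branch.branch_id
""").fetchall()
print(inner, left, full, sep="\n")
```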
VI. Nested Queries:
A query using multiple select statements to get specific information.
E.g.: Find the names of all employees who have sold over 30,000 to a single client
SELECT employee.first_name, employee.last_name
FROM employee
WHERE employee.emp_id IN (
SELECT works_with.emp_id
FROM works_with
WHERE works_with.total_sales > 30000
);
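To see how the nested query is evaluated, the sketch below runs the inner SELECT on its own first and then the full query. The rows are made up, and the `client_id` column of works_with is an assumption about that table's layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INT PRIMARY KEY, first_name VARCHAR(40), last_name VARCHAR(40))")
conn.execute("""
    CREATE TABLE works_with (
        emp_id INT, client_id INT, total_sales INT,
        PRIMARY KEY (emp_id, client_id)
    )
""")
# Made-up sample rows for the demo.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (102, "Michael", "Scott"), (105, "Stanley", "Hudson"), (107, "Andy", "Bernard")])
conn.executemany("INSERT INTO works_with VALUES (?, ?, ?)", [
    (102, 401, 267000), (105, 400, 55000), (105, 404, 33000), (107, 403, 5000)])

# Step 1: the inner query alone returns the matching employee IDs.
inner_ids = [row[0] for row in conn.execute(
    "SELECT emp_id FROM works_with WHERE total_sales > 30000 ORDER BY emp_id")]

# Step 2: the outer query keeps the employees whose ID is in that list.
big_sellers = conn.execute("""
    SELECT employee.first_name, employee.last_name
    FROM employee
    WHERE employee.emp_id IN (
        SELECT works_with.emp_id
        FROM works_with
        WHERE works_with.total_sales > 30000
    )
    ORDER BY employee.emp_id
""").fetchall()
print(inner_ids, big_sellers)
```

An ID that appears twice in the inner result (105 here) still gives one row in the outer result, because IN only tests membership.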
VII. ON DELETE [what happens to rows linked by a foreign key]:
1. ON DELETE SET NULL: If we delete an employee, the mgr_id that referenced that
employee (e.g. in the branch table) is set to NULL.
2. ON DELETE CASCADE: If we delete the employee whose ID is stored in a foreign key
column, every row that references that employee is deleted as well.
NOTE: A primary key cannot be NULL, so use ON DELETE CASCADE when the foreign key is
also part of the primary key.
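Both ON DELETE behaviours can be observed in a small sketch. Assumptions: it runs on SQLite, which only enforces foreign keys after PRAGMA foreign_keys = ON, and the `branch_supplier` table is a hypothetical example of a foreign key that is part of a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE employee (emp_id INT PRIMARY KEY, first_name VARCHAR(40))")
conn.execute("""
    CREATE TABLE branch (
        branch_id INT PRIMARY KEY,
        branch_name VARCHAR(40),
        mgr_id INT,
        FOREIGN KEY (mgr_id) REFERENCES employee(emp_id) ON DELETE SET NULL
    )
""")
conn.execute("""
    CREATE TABLE branch_supplier (
        branch_id INT,
        supplier_name VARCHAR(40),
        PRIMARY KEY (branch_id, supplier_name),
        FOREIGN KEY (branch_id) REFERENCES branch(branch_id) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO employee VALUES (102, 'Michael')")
conn.execute("INSERT INTO branch VALUES (2, 'Scranton', 102)")
conn.executemany("INSERT INTO branch_supplier VALUES (?, ?)",
                 [(2, "Hammer Mill"), (2, "Uline")])

# SET NULL: deleting the manager keeps the branch but clears mgr_id.
conn.execute("DELETE FROM employee WHERE emp_id = 102")
mgr_after = conn.execute("SELECT mgr_id FROM branch WHERE branch_id = 2").fetchone()[0]

# CASCADE: deleting the branch also deletes its supplier rows.
suppliers_before = conn.execute("SELECT COUNT(*) FROM branch_supplier").fetchone()[0]
conn.execute("DELETE FROM branch WHERE branch_id = 2")
suppliers_after = conn.execute("SELECT COUNT(*) FROM branch_supplier").fetchone()[0]
print(mgr_after, suppliers_before, suppliers_after)
```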
VII. Triggers:
A trigger in SQL is like a hidden assistant that automatically runs when something
happens in a database. It helps enforce rules, log changes, or automate tasks without
manual intervention.
E.g.: Think of a trigger like an alarm system: when someone enters a restricted area
(event), the alarm goes off (trigger action).
How do triggers work? A trigger activates automatically when a specific event occurs on
a table. These events include:
✅ INSERT – When a new row is added
✅ UPDATE – When an existing row is changed
✅ DELETE – When a row is removed
VIII. ER [Entity relationship] Diagrams Info:
A good way to take business requirements and convert them into an actual
database schema.
Advantages:
- Map out the different relationships.
- Great way to organise data into a database schema.
E.g.: Student DB.
1. Entity: An object we want to model/store information about.
2. Attributes: Specific pieces of information about an entity.
3. Primary key: Unique identifier.
4. Composite Attribute: An attribute that can be broken into sub-attributes.
5. Multi-valued attribute: Having more than one value.
6. Derived attribute: Derived from other attributes.
7. Multiple entities: More than one entity in the diagram.
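8. Relationship cardinality: How many instances of one entity can be related to
another, e.g. a student can take any number of classes and a class can have any
number of students.
Eg: 1:1, 1:N, N:M
9. Weak entity: An entity that cannot be uniquely identified by its own attributes
alone and relies on another entity.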
IX. Converting it into a schema:
1. Mapping of regular Entity types.
2. Mapping of weak entity types.
3. Mapping of 1:1 binary relationship types.
4. Mapping of binary 1:N relationship types.
5. Mapping of binary M:N relationship types.
Questions to expect in the interview:
1. Difference between DBMS and RDBMS
DBMS:
- Database management system.
- The basic way to store data.
- Data is stored as files that are not connected to each other. E.g., course,
birth_date and address are stored separately.
- Looking up related data across them is difficult & time-consuming.
RDBMS:
- Relational Database Management System.
- Have a relationship between the data.
- Eg: MySQL.
- Can store student information into tables & columns, interconnecting each other in a
single database.
- Can perform a query to fetch specific information, which is easy and less
time-consuming.
- Faster data retrieval for a large dataset
2. Primary Key & Foreign Key
Primary key: Unique identifier for each row, e.g. for each student's details.
Foreign key: Helps connect data across tables, ensuring that records in one table can
reference related information in another.
3. What are constraints and their types?
E.g., you have probably used Google Forms, where some fields are mandatory and some
are optional. Rules like these are called constraints.
- In SQL, it is like rules set up for the data in the tables.
- Helps keep data accurate & reliable.
Constraints in SQL:
1. NOT NULL: A column can't be empty.
2. UNIQUE: No two rows can have the same value in the column.
3. Primary key
4. Foreign key
5. CHECK: Sets a condition that every value must satisfy.
6. DEFAULT: Sets the value used when none is given.
4. Explain the DDL & DML commands in SQL
1. DDL:
- Defining the structure of the DB.
- Like creating, altering & dropping tables (CREATE, ALTER, DROP).
2. DML:
- Working with the actual data.
- Inserting, updating, and deleting.
These commands help keep data up to date & organise the data.
5. What's the difference between delete, drop & truncate statements?
1. DELETE: Removes the rows that match a specific condition (WHERE).
- If you change your mind, you can roll back inside a transaction and restore the rows.
2. TRUNCATE: Removes all rows from a table, without any condition.
- In MySQL it can't be rolled back.
- It's easy & quick. Once it's done, it's done.
3. DROP: Removes the table itself along with all its data.
- In MySQL it can't be rolled back; the table is permanently deleted.
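The rollback behaviour of DELETE can be sketched as below. Assumptions: sqlite3 is used for the demo and the rows are made up; SQLite has no TRUNCATE statement, so only DELETE and DROP are shown, and rollback rules for TRUNCATE and DROP differ between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INT PRIMARY KEY, name VARCHAR(20))")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Jack"), (2, "Kate"), (3, "Tom")])
conn.commit()  # make the three rows permanent

def count_students():
    """Return the current number of rows in student."""
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# DELETE with a condition, then change our mind.
conn.execute("DELETE FROM student WHERE name = 'Tom'")
count_after_delete = count_students()
conn.rollback()  # undo the uncommitted DELETE
count_after_rollback = count_students()

# DROP removes the table itself, not just its rows.
conn.execute("DROP TABLE student")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(count_after_delete, count_after_rollback, remaining_tables)
```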
6. What's the difference between GROUP BY and ORDER BY
GROUP BY:
- Grouping data into meaningful summaries.
- Used with aggregate functions like SUM, AVG, and COUNT.
- Total salary of each department.
- Average salary in each department
- Count of employees in each department
ORDER BY:
- It's like sorting rows in a particular order.
- Such as salaries from highest to lowest.
7. Diff between Where Clause & Having clauses?
WHERE:
- Filter Individual rows based on specific conditions like name & age.
HAVING:
- You use HAVING when you want to filter groups created by the Group BY clause
based on the aggregate results, like Counts or Sums.
E.g.: How many students are in each age group, but only display age groups with more
than one student
SELECT age, COUNT(Roll_No) AS No_Of_Students
FROM student
GROUP BY age
HAVING COUNT(Roll_No) > 1;
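The difference between WHERE and HAVING shows up when both are run on the same data. A sketch with made-up student rows, using sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (Roll_No INT PRIMARY KEY, name VARCHAR(20), age INT)")
# Made-up sample rows for the demo.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    (1, "Asha", 20), (2, "Ravi", 21), (3, "Meena", 20),
    (4, "Tom", 22), (5, "Kate", 21), (6, "Jack", 20),
])

# GROUP BY alone: one row per age group.
all_groups = conn.execute("""
    SELECT age, COUNT(Roll_No) AS No_Of_Students
    FROM student
    GROUP BY age
    ORDER BY age
""").fetchall()

# HAVING filters the groups after they have been counted.
big_groups = conn.execute("""
    SELECT age, COUNT(Roll_No) AS No_Of_Students
    FROM student
    GROUP BY age
    HAVING COUNT(Roll_No) > 1
    ORDER BY age
""").fetchall()

# WHERE filters individual rows before any grouping happens.
older_groups = conn.execute("""
    SELECT age, COUNT(Roll_No) AS No_Of_Students
    FROM student
    WHERE age > 20
    GROUP BY age
    ORDER BY age
""").fetchall()
print(all_groups, big_groups, older_groups, sep="\n")
```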
8. Explain Indexing in SQL, and what do you mean by Clustered Index?
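Let's say you are searching a book for a topic without an index or page numbers: it's
very slow. An index works like the index of a book, letting the database jump straight
to the matching rows. A clustered index is like organising the data itself in order:
the rows are stored sorted by the index key, so a table can have only one.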
- Makes data retrieval fast & efficient.
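The effect of an index can be checked by asking the database how it plans to run a query. This sketch is SQLite-specific (EXPLAIN QUERY PLAN and its output wording are SQLite features), the index name `idx_student_name` is made up, and the index created is an ordinary secondary index rather than a clustered one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INT PRIMARY KEY, name VARCHAR(20), major VARCHAR(20))")
# 1000 made-up rows so that a full scan would be noticeable.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(i, f"student{i}", "Biology") for i in range(1, 1001)])

query = "EXPLAIN QUERY PLAN SELECT * FROM student WHERE name = 'student500'"

# Without an index on name, the plan is a scan of the whole table.
plan_before = " ".join(row[3] for row in conn.execute(query))

conn.execute("CREATE INDEX idx_student_name ON student(name)")

# With the index, the plan searches the index instead.
plan_after = " ".join(row[3] for row in conn.execute(query))
print(plan_before)
print(plan_after)
```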
9. What is Normalisation & explain diff types of Normal Forms?
Normalisation organises data efficiently, minimising data redundancy (duplicate
information) and preventing issues when inserting, deleting & updating records.
Why is it imp?
- Reduce Redundancy: It reduces duplicates, helping avoid storing the same
information again.
- Prevents Anomalies: Helps prevent errors while adding, removing or updating data.
Types:
1. First Normal Form [1NF]: Each table cell should contain a single value, and each
column must have a unique name.
2. Second Normal Form [2NF]: All non-key attributes must depend on the whole primary key.
3. Third Normal Form [3NF]: Every non-key attribute must be independent of other non-key attributes.
4. Boyce-Codd Normal Form (BCNF): Every determinant (An attribute that can
determine another attribute) must be a candidate key.
10. What is Union and Union All?
- Union: Removes duplicates
- Union All: Keeps duplicates
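A quick comparison of the two, as a sketch with made-up names (the `client` table and its `contact_name` column are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (first_name VARCHAR(40))")
conn.execute("CREATE TABLE client (contact_name VARCHAR(40))")  # hypothetical table
conn.executemany("INSERT INTO employee VALUES (?)", [("Angela",), ("Kelly",)])
conn.executemany("INSERT INTO client VALUES (?)", [("Kelly",), ("Oscar",)])

# UNION: 'Kelly' appears in both tables but only once in the result.
union_rows = sorted(row[0] for row in conn.execute(
    "SELECT first_name FROM employee UNION SELECT contact_name FROM client"))

# UNION ALL: every row from both queries is kept.
union_all_rows = sorted(row[0] for row in conn.execute(
    "SELECT first_name FROM employee UNION ALL SELECT contact_name FROM client"))
print(union_rows, union_all_rows)
```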
11. What are the views in SQL?
- A view is a virtual table defined by a saved SELECT query.
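A view stores a SELECT query under a name so it can be queried like a table. The sketch below assumes a cut-down employee table with made-up rows; the view name `female_employee` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INT PRIMARY KEY, first_name VARCHAR(40), sex VARCHAR(1), salary INT
    )
""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (100, "David", "M", 250000), (101, "Jan", "F", 110000), (103, "Angela", "F", 63000)])

# The view saves the query, not a copy of the data.
conn.execute("""
    CREATE VIEW female_employee AS
    SELECT emp_id, first_name FROM employee WHERE sex = 'F'
""")
before = conn.execute(
    "SELECT first_name FROM female_employee ORDER BY emp_id").fetchall()

# New rows in the base table show up in the view automatically.
conn.execute("INSERT INTO employee VALUES (104, 'Kelly', 'F', 55000)")
after = conn.execute(
    "SELECT first_name FROM female_employee ORDER BY emp_id").fetchall()
print(before, after)
```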
12. Converting String to Date:
STR_TO_DATE (MySQL)
SELECT STR_TO_DATE('27-10-2024', '%d-%m-%Y');
13. What are triggers in SQL?
Triggers are like reflex actions: they let you set up an action that runs automatically
when data in a table is added, updated or deleted.
When do triggers run?
- INSERT: When data is added to the table.
- UPDATE: When existing data in the table changes.
- DELETE: When data is removed from the table.
E.g., when a person borrows a book, a trigger can:
1. Decrease the available stock for that book.
2. Log the transaction to keep track of borrowed books.
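The book-borrowing example could be sketched as a trigger like this. All table, column and trigger names are hypothetical, and the syntax is SQLite's; MySQL triggers are similar but are usually created with a changed DELIMITER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (book_id INT PRIMARY KEY, title VARCHAR(60), available_stock INT)")
conn.execute(
    "CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, book_id INT, borrower VARCHAR(40))")
conn.execute(
    "CREATE TABLE loan_log (book_id INT, borrower VARCHAR(40), note VARCHAR(40))")

# Runs automatically after every INSERT on loan; NEW is the row just added.
conn.execute("""
    CREATE TRIGGER after_loan_insert
    AFTER INSERT ON loan
    FOR EACH ROW
    BEGIN
        UPDATE book SET available_stock = available_stock - 1
        WHERE book_id = NEW.book_id;
        INSERT INTO loan_log VALUES (NEW.book_id, NEW.borrower, 'borrowed');
    END
""")

conn.execute("INSERT INTO book VALUES (1, 'SQL Basics', 3)")
# Only the loan is inserted by hand; the trigger does the rest.
conn.execute("INSERT INTO loan (book_id, borrower) VALUES (1, 'Asha')")

stock = conn.execute("SELECT available_stock FROM book WHERE book_id = 1").fetchone()[0]
log = conn.execute("SELECT book_id, borrower, note FROM loan_log").fetchall()
print(stock, log)
```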
Area              Topics
Basics            SELECT, WHERE, ORDER BY, LIMIT, DISTINCT
Joins             INNER, LEFT, RIGHT, SELF JOIN, CROSS JOIN
Aggregation       GROUP BY, HAVING, COUNT, SUM, AVG, MAX, MIN
Subqueries        Scalar subqueries, IN, EXISTS, nested SELECTs
Filtering         CASE, LIKE, IN, BETWEEN, date filters
Window Functions  ROW_NUMBER(), RANK(), DENSE_RANK(), OVER(PARTITION BY)
CTEs              WITH clause
Set Ops           UNION, UNION ALL, INTERSECT, EXCEPT
DDL/DML           CREATE, ALTER, DROP, INSERT, UPDATE, DELETE
Misc              TRUNCATE, INDEX, VIEW, constraints, normalization, keys
A database most often contains one or more tables. Each table is
identified by a name (e.g. "Customers" or "Orders"), and contains
records (rows) with data.
SELECT column1, column2, ...
FROM table_name;
Here, column1, column2, ... are the field names of the table you want to
select data from.
The table_name represents the name of the table you want to select
data from.
If you want to return all columns, without specifying every column
name, you can use the SELECT * syntax:
SELECT * FROM table_name;
Inside a table, a column often contains many duplicate values; and
sometimes you only want to list the different (distinct) values.
SELECT DISTINCT column1, column2, ...
FROM table_name;
Note: The COUNT(DISTINCT column_name) is not supported in
Microsoft Access databases.
WHERE:
SELECT column1, column2, ...
FROM table_name
WHERE condition;
ORDER BY:
SELECT column1, column2, ...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;
AND:
SELECT column1, column2, ...
FROM table_name
WHERE condition1 AND condition2 AND condition3 ...;
AND vs OR
The AND operator displays a record if all the conditions are TRUE.
The OR operator displays a record if any of the conditions are TRUE.
NOT
Select only the customers that are NOT from Spain:
SELECT * FROM Customers
WHERE NOT Country = 'Spain';
NULL:
A NULL value cannot be tested with = or <>; use IS NULL and IS NOT NULL instead.
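NOT, IS NULL and IS NOT NULL can be tried on a small Customers table. This is a sketch with made-up rows; the `CustomerName` and `Address` columns are assumptions about the table's layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customers (
        CustomerName VARCHAR(40), Country VARCHAR(40), Address VARCHAR(60)
    )
""")
# Made-up sample rows; Ana has no address on record.
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    ("Ana", "Spain", None), ("Bo", "Sweden", "Main St 1"), ("Cy", "Spain", "Calle 2")])

def customer_names(where_clause):
    """Return the customer names matching the given WHERE condition."""
    sql = f"SELECT CustomerName FROM Customers WHERE {where_clause} ORDER BY CustomerName"
    return [row[0] for row in conn.execute(sql)]

not_spain = customer_names("NOT Country = 'Spain'")
missing_address = customer_names("Address IS NULL")
has_address = customer_names("Address IS NOT NULL")
# Comparing with = NULL never matches, even for rows where Address is NULL.
equals_null = customer_names("Address = NULL")
print(not_spain, missing_address, has_address, equals_null)
```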