Week 03 Part 02

The document is a tutorial on SQL and database management, covering essential concepts such as database types, ACID properties, SQL basics, normalization, and keys. It outlines prerequisites for the course and includes a case study on retail operations at Smartsense. Key topics include important SQL commands, data redundancy issues, and levels of normalization to ensure efficient database design.

Uploaded by

Riya singh

BUSINESS INTELLIGENCE & ANALYTICS

SQL TUTORIAL
Course Instructor: Prof Saji K Mathew
TA: K R SUBISHA, PHD SCHOLAR, INFORMATION SYSTEMS, IIT MADRAS
PREREQUISITES

► Learn basic SQL queries
https://www.mysqltutorial.org/ (SQL tutorial for MySQL database)
► Installing and using MySQL with Workbench
https://dev.mysql.com/doc/workbench/en/
► Case: Retail operations at Smartsense
► Create a schema and import the tables
AGENDA

► DBMS
► Intro to SQL
► Basic commands
► Keys
► Normalization
► Case: Shopsense retail
DATABASE MANAGEMENT SYSTEM

► DATABASE: A collection of data stored in a format that can easily be accessed. The software that manages it is the DBMS.
► 2 types: RELATIONAL (e.g. MySQL) and NON-RELATIONAL (NoSQL)
► RELATIONAL: Data is stored in tables which are linked to each other using relationships, and is queried with SQL (Structured Query Language)
► NON-RELATIONAL: Data is not stored in related tables, and these systems generally do not use SQL
ACID PROPERTIES

1. Atomicity: Atomicity ensures that a transaction is treated as a single,
indivisible unit of work. Either all the changes made by a transaction are
applied, or none of them are. Example: Suppose you are transferring money
from one bank account to another. The transaction should be atomic: either
both the debit and the credit happen, or neither does.
2. Consistency: Consistency ensures that a transaction cannot leave the
database in an illegal state. If a violation of data integrity constraints
occurs, the transaction is aborted and its changes are rolled back to the
previous, legal state. Example: In a database of student records, if a
transaction tries to insert a new record with a duplicate student ID, the
DBMS should reject the transaction to maintain consistency.
ACID PROPERTIES

3. Isolation: Isolation ensures that concurrent execution of multiple
transactions does not result in interference or unexpected outcomes. Each
transaction should be isolated from the others until it is completed. Example:
Consider two users booking the last available seat on a flight simultaneously.
Isolation ensures that the two bookings do not interfere, so only one of them
gets the seat.
4. Durability: Durability guarantees that once a transaction is committed, its
effects (changes to the database) are permanent and survive any subsequent
system failures, such as power outages or crashes. Example: After a
customer places an order and receives an order confirmation, the order
details should be durably stored in the database.
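
The bank transfer example for atomicity can be sketched in a few lines. This is only an illustration, not part of the course material: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the account table, its CHECK constraint and the transfer function are hypothetical names chosen for the example.

```python
import sqlite3

# In-memory SQLite database (an assumption; the course itself uses MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint, so the credit was undone too.
        return False


ok = transfer(conn, 1, 2, 30)       # enough money: both updates are applied
failed = transfer(conn, 1, 2, 500)  # not enough money: neither update is kept
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

The second transfer credits account 2 first and then fails on the debit, yet the final balances show no trace of the credit, which is the all-or-nothing behaviour described under Atomicity.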
SQL BASICS

• SQL keywords are not case-sensitive (SELECT and select are equivalent).
• In MySQL, every statement must be terminated with a semicolon.
• The relational database management system (RDBMS) is the basis for SQL, and for all modern database systems such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.
• The data in an RDBMS is stored in database objects called tables. A table is a collection of related data entries and it consists of columns and rows.
FIELDS AND RECORDS

► Every table is broken up into smaller entities called fields (e.g. CustomerID, CustomerName, ContactName, Address, City, PostalCode and Country).
► A field is a column in a table that is designed to maintain specific information about every record in the table.
► A record, also called a row, is each individual entry that exists in a table.
► A record is a horizontal entity in a table.
► A column is a vertical entity in a table that contains all information associated with a specific field in a table.
DATABASE/SCHEMA:

► A database most often contains one or more tables.
► Each table is identified by a name (e.g. "Customers" or "Orders"). Tables contain records (rows) with data.
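
To make fields, records and tables concrete, here is a minimal sketch that builds a Customers table with the fields listed earlier and stores one record in it. It runs on Python's built-in sqlite3 module as a stand-in for MySQL; the column types and the sample customer are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for a MySQL schema

# Each column of the table is one field.
conn.execute("""
    CREATE TABLE Customers (
        CustomerID   INTEGER PRIMARY KEY,
        CustomerName TEXT,
        ContactName  TEXT,
        Address      TEXT,
        City         TEXT,
        PostalCode   TEXT,
        Country      TEXT
    )
""")

# Each inserted row is one record (the values here are made up).
conn.execute(
    "INSERT INTO Customers VALUES (?, ?, ?, ?, ?, ?, ?)",
    (1, "Asha Stores", "Asha Rao", "12 MG Road", "Chennai", "600001", "India"),
)

cursor = conn.execute("SELECT * FROM Customers")
fields = [column[0] for column in cursor.description]  # column (field) names
records = cursor.fetchall()                            # list of rows (records)
print(fields)
print(records)
```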
DBMS KEYS

► WHY NEEDED? For identifying any row of data uniquely
► SUPERKEY: An attribute or a set of attributes that can uniquely identify a row of data in a table
► CANDIDATE KEY: A minimal superkey. If any proper subset of a superkey is itself a superkey, then that superkey cannot be a candidate key
► PRIMARY KEY: The candidate key which is chosen to uniquely identify each row in a table
► FOREIGN KEY: An attribute in a table which is used to create a relationship of that table with another table
► Example: Branch_code is the foreign key in the student table (referential integrity)
► COMPOSITE KEY: A key with more than one attribute
► COMPOUND KEY: A composite key with at least one attribute which is a foreign key
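
The student and branch example for primary and foreign keys can be sketched as follows. This is an illustrative assumption rather than the schema from the slides: it uses Python's sqlite3 module in place of MySQL, and the column names other than branch_code, as well as the sample rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE branch (branch_code TEXT PRIMARY KEY, branch_name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE student (
        student_id  INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        branch_code TEXT NOT NULL REFERENCES branch (branch_code)
    )
""")
conn.execute("INSERT INTO branch VALUES ('CS', 'Computer Science')")
conn.execute("INSERT INTO student VALUES (1, 'Meera', 'CS')")

# Foreign key: a student cannot point at a branch that does not exist.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 'XX')")
    unknown_branch_rejected = False
except sqlite3.IntegrityError:
    unknown_branch_rejected = True

# Primary key: two students cannot share the same student_id.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Anil', 'CS')")
    duplicate_id_rejected = False
except sqlite3.IntegrityError:
    duplicate_id_rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(unknown_branch_rejected, duplicate_id_rejected, student_count)
```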
IMPORTANT COMMANDS
1. The SELECT statement allows you to select data from one or more tables: SELECT select_list FROM table_name;
2. To sort the rows in the result set, you add the ORDER BY clause to the SELECT statement: ORDER BY column1 ASC;
3. The WHERE clause allows you to specify a search condition for the rows returned by a query: WHERE search_condition;
4. To test whether a value is NULL or not, you use the IS NULL operator: value IS NULL
5. A JOIN is a method of linking data between one or more tables based on values of the common column between the tables.
6. The GROUP BY clause groups a set of rows into a set of summary rows by values of columns or expressions. The GROUP BY clause returns one row for each group. In other words, it reduces the number of rows in the result set.
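
The first five commands can be tried out on a small example. The sketch below is an assumption for illustration only: it runs the queries through Python's sqlite3 module rather than MySQL Workbench, and the customers and orders tables and their rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Meera', 'Chennai'), (2, 'Ravi', NULL), (3, 'Anil', 'Delhi');
    INSERT INTO orders VALUES (10, 1, 250), (11, 1, 100), (12, 3, 400);
""")

# SELECT with ORDER BY: all customer names in ascending order.
names_sorted = [row[0] for row in conn.execute(
    "SELECT name FROM customers ORDER BY name ASC")]

# WHERE: only the rows that satisfy the search condition.
in_chennai = [row[0] for row in conn.execute(
    "SELECT name FROM customers WHERE city = 'Chennai'")]

# IS NULL: customers whose city is missing.
no_city = [row[0] for row in conn.execute(
    "SELECT name FROM customers WHERE city IS NULL")]

# JOIN: link orders to customers through the common customer_id column.
joined = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

print(names_sorted, in_chennai, no_city, joined)
```

Ravi has no orders, so the inner join leaves him out of the joined result.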
GROUP BY
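
A minimal sketch of GROUP BY, assuming a made-up orders table and using Python's sqlite3 module as a stand-in for MySQL. Five order rows are reduced to one summary row per city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, city TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        (1, 'Chennai', 250),
        (2, 'Delhi',   400),
        (3, 'Chennai', 100),
        (4, 'Delhi',    50),
        (5, 'Mumbai',   75);
""")

# One output row per distinct city, with a count and a total for each group.
summary = conn.execute("""
    SELECT city, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    GROUP BY city
    ORDER BY city
""").fetchall()

print(summary)
```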
NORMALISATION IN SQL

► It is the process of designing a DB effectively such that we can avoid data redundancy
► Insertion, deletion and updation anomalies can be avoided
► Data redundancy (DR): in some columns, the same values are repeated across multiple records
DATA REDUNDANCY AND ISSUES

► INSERTION ANOMALY: Redundant data has to be inserted for every new row. E.g. inserting the details of 100 students means 100 more repetitions of the same branch data
► DELETION ANOMALY: Loss of related data when some other data is deleted. E.g. deleting student information also deletes the branch information
► UPDATION ANOMALY: Say a new HOD name has to be recorded; each and every row needs to be updated. If one row is missed out, the data becomes inconsistent (modification anomaly)
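
The updation and deletion anomalies can be reproduced on a small unnormalised table. This sketch is an illustration under stated assumptions: it uses Python's sqlite3 module instead of MySQL, and the student table layout, names and HOD values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unnormalised design: branch and HOD are repeated in every student row.
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, branch TEXT, hod TEXT);
    INSERT INTO student VALUES
        (1, 'Meera', 'CS', 'Dr. Rao'),
        (2, 'Ravi',  'CS', 'Dr. Rao'),
        (3, 'Anil',  'EE', 'Dr. Iyer');
""")

# Updation anomaly: the new HOD is recorded in only one of the two CS rows.
conn.execute("UPDATE student SET hod = 'Dr. Nair' WHERE student_id = 1")
cs_hods = sorted(row[0] for row in conn.execute(
    "SELECT DISTINCT hod FROM student WHERE branch = 'CS'"))

# Deletion anomaly: removing the only EE student also removes all EE branch data.
conn.execute("DELETE FROM student WHERE student_id = 3")
ee_rows = conn.execute(
    "SELECT COUNT(*) FROM student WHERE branch = 'EE'").fetchone()[0]

print(cs_hods, ee_rows)
```

After the partial update the CS branch appears to have two different HODs, and after the delete nothing in the database records that the EE branch or its HOD ever existed.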
HOW TO NORMALISE?
(into logical, independent but related data)
LEVELS OF NORMALISATION

► Database normalization is the process of decomposing relations with anomalies to produce smaller, well-structured relations
► 1st normal form: Multivalued attributes (repeating groups) removed or re-organized
► 2nd normal form: Partial dependencies addressed
► 3rd normal form: Transitive dependencies addressed (usually treated as the gold standard of normalisation)
► Boyce/Codd NF, 4th NF and higher normal forms do exist
► Trade-off: Efficient storage space vs efficient data processing. De-normalization deliberately reintroduces some redundancy to speed up processing
1 NF

► If your DB is not even in 1NF, it is a poor DB design
► No multivalued attributes
► A column should have values of the same type
► Each column should have a unique name
► The order in which you store data doesn't matter
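
As a rough sketch of the "no multivalued attributes" rule, the example below takes rows where one column holds several subjects and re-organizes them so that every row holds a single value. The data and the student_subject table are hypothetical, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

# Not in 1NF: the third value packs several subjects into one column.
unnormalised = [
    (1, "Meera", "Maths, Physics"),
    (2, "Ravi", "Chemistry"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student_subject (
        student_id INTEGER,
        subject    TEXT,
        PRIMARY KEY (student_id, subject)
    )
""")

# 1NF: one row per (student, subject) pair, one value per column.
for student_id, _name, subjects in unnormalised:
    for subject in subjects.split(","):
        conn.execute(
            "INSERT INTO student_subject VALUES (?, ?)",
            (student_id, subject.strip()),
        )

rows = conn.execute(
    "SELECT student_id, subject FROM student_subject ORDER BY student_id, subject"
).fetchall()
print(rows)
```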
2NF

► For a table to be in the Second Normal form, it should be in the First Normal form and it should not have Partial Dependency.
► Partial Dependency exists when, for a composite primary key, any attribute in the table depends only on a part of the primary key and not on the complete primary key.
► To remove Partial dependency, we can divide the table, remove the attribute which is causing partial dependency, and move it to some other table where it fits in well.
PRIMARY KEY AND PARTIAL DEPENDENCY

► PK: can uniquely identify each row of the table
► Student_id + Subject_id is the PK, but here the teacher's name depends only on the subject (this is partial dependency)
► How to remove PD? Move the teacher's name to the subject table, or create a teacher table with name and id. The result is in 2NF
3 NF

► Should be in 2nd NF
► Should not have Transitive Dependency (TD)
► TD exists when an attribute in a table depends on a non-prime attribute and not on a prime attribute
► All non-prime attributes should depend only on prime attributes
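
One possible illustration of removing a transitive dependency: if the HOD depends on the branch, and the branch depends on the student id, the HOD belongs in a separate branch table. The schema and rows below are hypothetical, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hod depends on branch_code, so it lives in the branch table only.
    CREATE TABLE branch (branch_code TEXT PRIMARY KEY, hod TEXT);
    CREATE TABLE student (
        student_id  INTEGER PRIMARY KEY,
        name        TEXT,
        branch_code TEXT REFERENCES branch (branch_code)
    );
    INSERT INTO branch VALUES ('CS', 'Dr. Rao'), ('EE', 'Dr. Iyer');
    INSERT INTO student VALUES (1, 'Meera', 'CS'), (2, 'Ravi', 'CS'), (3, 'Anil', 'EE');
""")

# A change of HOD is now a single-row update.
conn.execute("UPDATE branch SET hod = 'Dr. Nair' WHERE branch_code = 'CS'")

hods = conn.execute("""
    SELECT s.name, b.hod
    FROM student AS s
    JOIN branch AS b ON b.branch_code = s.branch_code
    ORDER BY s.student_id
""").fetchall()
print(hods)
```

Every CS student sees the new HOD after one update, so no row can be missed.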
BCNF (BOYCE CODD NF)

► Table should be in 3 NF
► For any dependency A → B, A should be a super key (A cannot be a non-prime attribute while B is a prime attribute)
BCNF CONTD..

(Suppose 1 professor is teaching only 1 subject)


SHOPSENSE RETAIL CASE
[Figures: MySQL Workbench screenshots with labels for the query area, changing the panel view, the navigator panel (see schemas; sys is the internal db), execute, create new schema, output, tables and schema]