UNIT-III Lecture Notes
SQL - Queries, Constraints, Triggers: Overview, the form of a basic SQL query, UNION,
INTERSECT, EXCEPT, nested queries, aggregate Operators, NULL values, complex integrity
constraints in SQL, Triggers and Active Databases.
Schema Refinement and normal Forms: Introduction to schema refinement, functional
dependencies, normal forms, Properties of Decompositions, Normalizations.
Overview:
The basic form of an SQL query is:
SELECT [DISTINCT] select-list
FROM from-list
WHERE qualification;
The from-list in the FROM clause is a list of table names. A table name can be followed by a
range variable; a range variable is particularly useful when the same table name appears more
than once in the from-list.
The select-list is a list of (expressions involving) column names of tables named in the
from-list. Column names can be prefixed by a range variable.
The qualification in the WHERE clause is a boolean combination (i.e., an expression using
the logical connectives AND, OR, and NOT) of conditions of the form expression op
expression, where op is one of the comparison operators {<, <=, =, <>, >=, >}. An expression
is a column name, a constant, or an (arithmetic or string) expression.
The DISTINCT keyword is optional. It indicates that the table computed as an answer to this
query should not contain duplicates, that is, two copies of the same row. The default is that
duplicates are not eliminated.
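Without DISTINCT:
SELECT select-list
FROM sailors s;
With DISTINCT:
SELECT DISTINCT select-list
FROM sailors s;
A range variable may also be introduced with the optional keyword AS:
FROM Sailors AS S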
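The effect of DISTINCT can be tried out with a minimal sketch using Python's built-in sqlite3 module. The three sample rows and the choice of sname and age as the select-list are assumptions made for illustration; the notes do not give a Sailors instance.

```python
import sqlite3

# Hypothetical sample rows; two sailors share the same name and age.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Dustin", 7, 45.0), (2, "Horatio", 7, 35.0), (3, "Horatio", 9, 35.0)],
)

# Without DISTINCT, duplicate (sname, age) rows are kept.
without_distinct = conn.execute("SELECT S.sname, S.age FROM Sailors S").fetchall()
# With DISTINCT, each (sname, age) combination appears only once.
with_distinct = conn.execute("SELECT DISTINCT S.sname, S.age FROM Sailors S").fetchall()

print(len(without_distinct), len(with_distinct))
```

The first query returns one row per sailor, while the second collapses the two identical (Horatio, 35.0) rows into one.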
Q: Find the names of sailors who have reserved boat number 103.
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid AND R.bid = 103;
SELECT S.sname
SELECT B.color
Q: Find the names of sailors who have reserved at least one boat.
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid;
Q: Find the ages of sailors whose name begins and ends with B and has at least three characters.
SELECT S.age
FROM Sailors S
WHERE S.sname LIKE 'B_%B';
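A pattern such as 'B_%B' is a common way to express "begins and ends with B and has at least three characters" with the LIKE operator. The sketch below checks it against a few hypothetical names using Python's sqlite3 module; the rows are assumptions chosen to exercise the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, age REAL)")
# Hypothetical rows: only BOB and BARB should qualify.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?)",
    [(1, "BOB", 63.5), (2, "BB", 20.0), (3, "BART", 30.0), (4, "BARB", 41.0)],
)

# '_' matches exactly one character and '%' matches zero or more characters,
# so 'B_%B' requires a B, at least one more character, and a final B.
ages = [
    row[0]
    for row in conn.execute(
        "SELECT S.age FROM Sailors S WHERE S.sname LIKE 'B_%B' ORDER BY S.sid"
    )
]
print(ages)
```

The name BB is rejected because the underscore demands a character between the two Bs. Note that SQLite's LIKE is case-insensitive for ASCII letters by default, which is why the sample names are written in upper case.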
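Q: Find the names of sailors who have reserved a red or a green boat.
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid
AND (B.color = 'red' OR B.color = 'green');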
UNION:
The SQL UNION clause/operator is used to combine the results of two or more SELECT
statements without returning any duplicate rows.
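Q: Find the names of sailors who have reserved a red or a green boat.
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
UNION
SELECT S2.sname
FROM Sailors S2, Reserves R2, Boats B2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';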
Q: Find all sids of sailors who have a rating of 10 or reserved boat 104.
SELECT S.sid
FROM Sailors S
WHERE S.rating = 10
UNION
SELECT R.sid
FROM Reserves R
WHERE R.bid = 104;
INTERSECT:
The SQL INTERSECT clause/operator is used to combine two SELECT statements, but returns
rows only from the first SELECT statement that are identical to a row in the second SELECT
statement. This means INTERSECT returns only common rows returned by the two SELECT
statements.
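Q: Find the names of sailors who have reserved both a red and a green boat.
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
INTERSECT
SELECT S2.sname
FROM Sailors S2, Reserves R2, Boats B2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';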
EXCEPT/Set Difference:
The SQL EXCEPT clause/operator is used to combine two SELECT statements and returns the rows
from the first SELECT statement that are not returned by the second SELECT statement. That is,
EXCEPT returns only the rows of the first result that do not appear in the second result.
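Q: Find the sids of all sailors who have reserved red boats but not green boats.
SELECT S.sid
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
EXCEPT
SELECT S2.sid
FROM Sailors S2, Reserves R2, Boats B2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';
Since the Reserves table already contains sid, the same query can be written without Sailors:
SELECT R.sid
FROM Boats B, Reserves R
WHERE R.bid = B.bid AND B.color = 'red'
EXCEPT
SELECT R2.sid
FROM Boats B2, Reserves R2
WHERE R2.bid = B2.bid AND B2.color = 'green';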
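The three set operators can be compared side by side on one small instance. The following is a minimal sketch using Python's sqlite3 module; the Boats and Reserves rows, and the helper names reserved and run, are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
-- Hypothetical instance: sailor 22 reserved a red and a green boat,
-- sailor 31 only a red boat, sailor 64 only a green boat.
INSERT INTO Boats VALUES (102, 'Interlake', 'red'), (103, 'Clipper', 'green');
INSERT INTO Reserves VALUES
    (22, 102, '1998-10-10'), (22, 103, '1998-10-08'),
    (31, 102, '1998-11-10'), (64, 103, '1998-09-05');
""")

def reserved(color):
    """SELECT block for the sids of sailors who reserved a boat of this color."""
    return ("SELECT R.sid FROM Reserves R, Boats B "
            f"WHERE R.bid = B.bid AND B.color = '{color}'")

def run(op):
    """Combine the red and green SELECT blocks with the given set operator."""
    sql = f"{reserved('red')} {op} {reserved('green')}"
    return sorted(row[0] for row in conn.execute(sql))

union_sids = run("UNION")          # red or green
intersect_sids = run("INTERSECT")  # red and green
except_sids = run("EXCEPT")        # red but not green
print(union_sids, intersect_sids, except_sids)
```

Sailor 22 appears only once in the UNION result even though both SELECT blocks return it, because UNION eliminates duplicates.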
Nested queries:
A subquery (also called an inner query or a nested query) is a query embedded within another
SQL query, typically within its WHERE clause.
A subquery is used to return data that will be used in the main query as a condition to further
restrict the data to be retrieved.
Subqueries can be used with the SELECT, INSERT, UPDATE, and DELETE statements
along with the operators like =, <, >, >=, <=, IN, BETWEEN, etc.
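Q: Find the names of sailors who have reserved boat 103.
SELECT S.sname
FROM Sailors S
WHERE S.sid
IN ( SELECT R.sid
FROM Reserves R
WHERE R.bid = 103 );
Q: Find the names of sailors who have reserved a red boat.
SELECT S.sname
FROM Sailors S
WHERE S.sid
IN ( SELECT R.sid
FROM Reserves R
WHERE R.bid
IN ( SELECT B.bid
FROM Boats B
WHERE B.color = 'red' ));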
Q: Find the names of sailors who have not reserved a red boat.
SELECT S.sname
FROM Sailors S
WHERE S.sid
NOT IN ( SELECT R.sid
FROM Reserves R
WHERE R.bid
IN ( SELECT B.bid
FROM Boats B
WHERE B.color = 'red' ));
Q: Find the names of sailors who have reserved boat number 103.
SELECT S.sname
FROM Sailors S
WHERE
EXISTS ( SELECT *
FROM Reserves R
WHERE R.bid = 103 AND R.sid = S.sid );
Q: Find sailors whose rating is better than some sailor called Horatio.
SELECT S.sid
FROM Sailors S
WHERE S.rating > ANY ( SELECT S2.rating
FROM Sailors S2
WHERE S2.sname = 'Horatio' );
Q: Find sailors whose rating is better than every sailor called Horatio.
SELECT S.sid
FROM Sailors S
WHERE S.rating > ALL ( SELECT S2.rating
FROM Sailors S2
WHERE S2.sname = 'Horatio' );
Q: Find the sailors with the highest rating.
SELECT S.sid
FROM Sailors S
WHERE S.rating >= ALL ( SELECT S2.rating
FROM Sailors S2 );
Note that IN and NOT IN are equivalent to = ANY and <> ALL, respectively.
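The nested-query operators can be exercised on a small instance with Python's sqlite3 module. This is a sketch under stated assumptions: the rows are hypothetical, and because SQLite does not implement ANY and ALL, those two comparisons are emulated with MIN and MAX, which gives the same answer only when the subquery result is non-empty and contains no NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Hypothetical instance for illustration only.
INSERT INTO Sailors VALUES (22, 'Dustin', 7), (31, 'Lubber', 8),
                           (64, 'Horatio', 7), (74, 'Horatio', 9);
INSERT INTO Boats VALUES (102, 'red'), (103, 'green');
INSERT INTO Reserves VALUES (22, 102), (22, 103), (31, 102), (64, 103);
""")

def sids(sql):
    """Run a query that returns sids and give them back as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

RED = ("SELECT R.sid FROM Reserves R WHERE R.bid IN "
       "(SELECT B.bid FROM Boats B WHERE B.color = 'red')")
in_red = sids(f"SELECT S.sid FROM Sailors S WHERE S.sid IN ({RED})")
not_in_red = sids(f"SELECT S.sid FROM Sailors S WHERE S.sid NOT IN ({RED})")

# Correlated subquery: the inner query refers to S from the outer query.
exists_103 = sids("""SELECT S.sid FROM Sailors S WHERE EXISTS
                     (SELECT * FROM Reserves R WHERE R.bid = 103 AND R.sid = S.sid)""")

# Emulation of "> ANY" with MIN and "> ALL" with MAX (see the caveat above).
HORATIO = "SELECT {agg}(S2.rating) FROM Sailors S2 WHERE S2.sname = 'Horatio'"
better_than_some = sids(
    f"SELECT S.sid FROM Sailors S WHERE S.rating > ({HORATIO.format(agg='MIN')})")
better_than_all = sids(
    f"SELECT S.sid FROM Sailors S WHERE S.rating > ({HORATIO.format(agg='MAX')})")
print(in_red, not_in_red, exists_103, better_than_some, better_than_all)
```

With the two Horatios rated 7 and 9, sailors rated above 7 beat some Horatio, while nobody is rated above 9, so the last list is empty.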
Aggregate operators:
SQL supports five aggregate operators, which can be applied to any column: COUNT, SUM, AVG,
MAX, and MIN.
FROM Sailors S
FROM Sailors S
FROM Sailors S
WHERE S.age =
FROM Sailors S2 )
SELECT COUNT(*)
FROM sailors;
FROM Sailors S;
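The five aggregate operators can be seen together in a minimal sketch using Python's sqlite3 module. The four sample rows are hypothetical, as is the choice of queries; they are not taken from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
# Hypothetical rows for illustration.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5),
    (64, "Horatio", 7, 35.0), (74, "Horatio", 9, 35.0),
])

# One row is returned, holding one value per aggregate.
count_all, count_names, total_age, avg_age, max_age, min_age = conn.execute("""
    SELECT COUNT(*), COUNT(DISTINCT S.sname), SUM(S.age),
           AVG(S.age), MAX(S.age), MIN(S.age)
    FROM Sailors S""").fetchone()

# An aggregate cannot be used directly in WHERE; a nested query supplies it.
oldest = conn.execute("""
    SELECT S.sname, S.age FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)""").fetchall()
print(count_all, count_names, total_age, avg_age, max_age, min_age, oldest)
```

COUNT(DISTINCT S.sname) counts the two Horatios once, which is why it is smaller than COUNT(*).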
The general form of an SQL query with GROUP BY and HAVING clauses is:
SELECT [DISTINCT] select-list
FROM from-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification;
Q: Find the age of the youngest sailor for each rating level.
SELECT S.rating, MIN (S.age)
FROM Sailors S
GROUP BY S.rating;
Q: Find the age of the youngest sailor who is eligible to vote (i.e., is at least 18 years old) for
each rating level with at least two such sailors.
SELECT S.rating, MIN (S.age) AS minage
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT (*) > 1;
Q: Find the average age of sailors for each rating level that has at least two sailors.
SELECT S.rating, AVG (S.age) AS avgage
FROM Sailors S
GROUP BY S.rating
HAVING COUNT (*) > 1;
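The difference between WHERE (which filters rows before grouping) and HAVING (which filters groups after grouping) can be checked with a small sketch in Python's sqlite3 module. The five sample sailors are hypothetical; the ORDER BY clauses are added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
# Hypothetical rows: rating 8 has one adult and one minor.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", [
    (1, "Art", 7, 45.0), (2, "Bob", 7, 35.0), (3, "Andy", 8, 55.5),
    (4, "Zorba", 8, 16.0), (5, "Rusty", 10, 25.0),
])

youngest_voters = conn.execute("""
    SELECT S.rating, MIN(S.age) AS minage
    FROM Sailors S
    WHERE S.age >= 18          -- applied to rows, before grouping
    GROUP BY S.rating
    HAVING COUNT(*) > 1        -- applied to groups, after grouping
    ORDER BY S.rating""").fetchall()

avg_by_rating = conn.execute("""
    SELECT S.rating, AVG(S.age) AS avgage
    FROM Sailors S
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    ORDER BY S.rating""").fetchall()
print(youngest_voters, avg_by_rating)
```

Rating 8 drops out of the first result because the WHERE clause removes the 16-year-old before the group is counted, but it stays in the second result, where both of its sailors are counted.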
NULL values:
A NULL value represents a value that is unknown or not applicable. Declaring a column NOT NULL
signifies that the column must always hold an explicit value of the given data type.
A field with a NULL value is one that has been left blank during record creation.
IS NULL operator:
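The behaviour of IS NULL, as opposed to an ordinary comparison with NULL, can be sketched with Python's sqlite3 module. The two sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT NOT NULL, rating INTEGER)"
)
# Python's None is stored as SQL NULL.
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?)",
                 [(22, "Dustin", 7), (98, "Dan", None)])

# A comparison with NULL evaluates to unknown, so no row qualifies.
equals_null = conn.execute("SELECT sname FROM Sailors WHERE rating = NULL").fetchall()
# IS NULL is the operator that tests for a missing value.
is_null = conn.execute("SELECT sname FROM Sailors WHERE rating IS NULL").fetchall()
# COUNT(*) counts rows; COUNT(rating) skips rows whose rating is NULL.
counts = conn.execute("SELECT COUNT(*), COUNT(rating) FROM Sailors").fetchone()
print(equals_null, is_null, counts)
```

The query with `rating = NULL` returns nothing even though one rating is missing, which is the usual reason IS NULL (and IS NOT NULL) must be used instead.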
CREATE TABLE Sailors (sid INTEGER, sname CHAR(10), rating INTEGER, age INTEGER,
PRIMARY KEY (sid), CHECK (rating >= 1 AND rating <= 10));
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day DATE, FOREIGN KEY (sid)
REFERENCES Sailors, FOREIGN KEY (bid) REFERENCES Boats, CONSTRAINT
noInterLakeRes CHECK ('Interlake' <> (SELECT B.bname FROM Boats B WHERE B.bid =
Reserves.bid)));
Domain Constraints:
CREATE DOMAIN ratingval INTEGER DEFAULT 1 CHECK (VALUE >= 1 AND VALUE <=10);
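A range constraint of this kind can be tried out with Python's sqlite3 module. This is only an approximation: SQLite has no CREATE DOMAIN statement, so the sketch attaches the DEFAULT and CHECK directly to the rating column. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no CREATE DOMAIN, so the range is stated as a column CHECK.
conn.execute("""
    CREATE TABLE Sailors (
        sid INTEGER PRIMARY KEY,
        sname TEXT,
        rating INTEGER DEFAULT 1 CHECK (rating >= 1 AND rating <= 10),
        age INTEGER)""")
conn.execute("INSERT INTO Sailors (sid, sname, rating, age) VALUES (22, 'Dustin', 7, 45)")
# No rating is given here, so the DEFAULT value 1 is used.
conn.execute("INSERT INTO Sailors (sid, sname, age) VALUES (31, 'Lubber', 55)")

try:
    conn.execute("INSERT INTO Sailors (sid, sname, rating, age) VALUES (58, 'Rusty', 11, 35)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the CHECK constraint refused rating = 11

ratings = conn.execute("SELECT sid, rating FROM Sailors ORDER BY sid").fetchall()
print(rejected, ratings)
```

The out-of-range insert raises an error and leaves the table unchanged, so only the two valid rows remain.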
Triggers:
Active Databases:
Benefits of Triggers:
Creating Triggers:
The syntax for creating a trigger is-
Where,
The following program creates a row-level trigger for the CUSTOMERS table that fires for
INSERT, UPDATE, or DELETE operations performed on the table. The trigger displays the
salary difference between the old and new values:
OLD and NEW references are not available in table-level triggers; they can be used only
in row-level triggers.
INSERT Operation:
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
When a record is created in the CUSTOMERS table, the trigger created above,
display_salary_changes, fires and displays the following result:
Old salary:
New salary: 7500
Salary difference:
Because this is a new record, the old salary is not available, so the old salary and the
salary difference are displayed as null.
UPDATE operation:
UPDATE customers
SET salary = salary + 500
WHERE id = 2;
When a record is updated in the CUSTOMERS table, the trigger created above,
display_salary_changes, fires and displays the following result:
Old salary: 1500
New salary: 2000
Salary difference: 500
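A row-level trigger with OLD and NEW references can be sketched in SQLite through Python's sqlite3 module. This is an analogue, not the trigger described above: the trigger name log_salary_change, the salary_log table, and the customer row are hypothetical, and the trigger records the values in a table instead of displaying them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
CREATE TABLE salary_log (id INTEGER, old_salary REAL, new_salary REAL, diff REAL);

-- Row-level trigger: fires once per updated row, where OLD and NEW
-- refer to the row before and after the update.
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON customers
FOR EACH ROW
BEGIN
    INSERT INTO salary_log
    VALUES (OLD.id, OLD.salary, NEW.salary, NEW.salary - OLD.salary);
END;

INSERT INTO customers VALUES (2, 'Khilan', 1500);  -- hypothetical row
UPDATE customers SET salary = salary + 500 WHERE id = 2;
""")

log = conn.execute("SELECT * FROM salary_log").fetchall()
print(log)
```

The logged row holds the old salary 1500, the new salary 2000, and the difference 500, matching the UPDATE example.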
Three types of anomalies can occur when a database is not normalized: insertion, update,
and deletion anomalies.
Update Anomaly: If one copy of such repeated data is updated, an inconsistency is created
unless all copies are similarly updated.
Insertion Anomaly: It may not be possible to store certain information unless some other,
unrelated, information is stored as well.
Deletion Anomaly: It may not be possible to delete certain information without losing some
other, unrelated, information as well.
Update anomaly: In the above table there are two rows for employee Rick, as he belongs to two
departments of the company. To update Rick's address, we must update it in both rows; if the
address is updated in one row but not in the other, the database would show two different
addresses for Rick, leaving the data inconsistent.
Insert anomaly: Suppose a new employee joins the company who is under training and not yet
assigned to any department. We would not be able to insert the employee's data into the
table if the emp_dept field does not allow nulls.
Delete anomaly: Suppose the company closes the department D890. Deleting the rows that have
emp_dept as D890 would also delete the information of employee Maggie, since she is assigned
only to this department.
Functional dependencies:
The attributes of a table are said to be dependent on each other when an attribute of the table
uniquely identifies another attribute of the same table.
Ex: The Stu_Id attribute uniquely identifies the Stu_Name attribute of the student table, because if
we know the student id we can tell the student name associated with it.
This is known as functional dependency.
It can be written as Stu_Id -> Stu_Name
Stu_Name is functionally dependent on Stu_Id
A functional dependency X -> Y is trivial if Y is a subset of X. For example,
{Student_Id, Student_Name} -> Student_Id is trivial: if we know the values of Student_Id and
Student_Name, then the value of Student_Id is uniquely determined.
Student_Id -> Student_Id and Student_Name -> Student_Name are trivial dependencies too.
If a functional dependency X -> Y holds where Y is not a subset of X, then the
dependency is called a non-trivial functional dependency.
If an FD X -> Y holds where the intersection of X and Y is empty, then the dependency is said to be
a completely non-trivial functional dependency.
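These definitions can be turned into two small checks in plain Python. The function names fd_holds and is_trivial and the three student rows are hypothetical. Note that a table instance can only refute a functional dependency; it cannot prove that the dependency holds for all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time key is seen, then returns the stored value.
        if seen.setdefault(key, val) != val:
            return False
    return True

def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)

# Hypothetical student rows: two different students share the name Ann.
students = [
    {"Stu_Id": 1, "Stu_Name": "Ann"},
    {"Stu_Id": 2, "Stu_Name": "Bob"},
    {"Stu_Id": 3, "Stu_Name": "Ann"},
]

id_to_name = fd_holds(students, ["Stu_Id"], ["Stu_Name"])
name_to_id = fd_holds(students, ["Stu_Name"], ["Stu_Id"])
trivial = is_trivial(["Stu_Id", "Stu_Name"], ["Stu_Id"])
non_trivial = not is_trivial(["Stu_Id"], ["Stu_Name"])
print(id_to_name, name_to_id, trivial, non_trivial)
```

Stu_Id -> Stu_Name is consistent with the sample rows, while Stu_Name -> Stu_Id is violated because the name Ann maps to two different ids.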
A multivalued dependency occurs when a table has two or more independent multivalued
attributes.
Consider a bike manufacturing company that produces two colors (Black and Red) of
each model every year.
Here the columns manuf_year and color are independent of each other and dependent on
bike_model.
In this case these two columns are said to be multivalued dependent on bike_model.
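A functional dependency X -> Z is transitive if it is formed indirectly by two functional
dependencies, that is, if the following hold:
X -> Y
Y does not -> X
Y -> Z
Ex:
{Book} -> {Author} (if we know the book, we know the author name)
{Author} does not -> {Book}
{Author} -> {Author_age}
Therefore {Book} -> {Author_age} should hold, which makes sense because if we know the book
name we can know the author's age.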
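Transitive dependencies of this kind are usually derived by computing an attribute closure. The sketch below is a standard closure computation in plain Python, applied to the Book example; the function name closure and the representation of dependencies as pairs of sets are choices made for this illustration.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side being a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know every attribute in lhs, we also know rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [({"Book"}, {"Author"}), ({"Author"}, {"Author_age"})]

book_closure = closure({"Book"}, fds)
author_closure = closure({"Author"}, fds)
print(sorted(book_closure), sorted(author_closure))
```

Author_age appears in the closure of {Book}, which is exactly the transitive dependency {Book} -> {Author_age}, while Book does not appear in the closure of {Author}.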
Ex:
Two employees (Jon & Lester) have two mobile numbers each, so the company stored
them in the same field, as you can see in the table above.
This table is not in 1NF: the rule says “each attribute of a table must have atomic
(single) values”, and the emp_mobile values for employees Jon & Lester violate that rule.
To make the table comply with 1NF, we should have the data like this:
An attribute that is not part of any candidate key is known as non-prime attribute.
Example: Suppose a school wants to store the data of teachers and the subjects they teach. They
create a table that looks like this. Since a teacher can teach more than one subject, the table can
have multiple rows for the same teacher.
The table is in 1NF because each attribute has atomic values. However, it is not in 2NF because the
non-prime attribute teacher_age is dependent on teacher_id alone, which is a proper subset of the
candidate key. This violates the rule for 2NF, which says “no non-prime attribute is
dependent on a proper subset of any candidate key of the table”.
To make the table comply with 2NF, we can break it into two tables like this:
teacher_details table:
teacher_id teacher_age
111 38
222 38
333 40
teacher_subject table:
teacher_id subject
111 Maths
111 Physics
222 Biology
333 Physics
333 Chemistry
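The two tables can be loaded into SQLite to check that nothing is lost by the split. The sketch below uses Python's sqlite3 module with the rows listed above; the table name teacher_details for the age table is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher_details (teacher_id INTEGER PRIMARY KEY, teacher_age INTEGER);
CREATE TABLE teacher_subject (teacher_id INTEGER, subject TEXT,
                              PRIMARY KEY (teacher_id, subject));
INSERT INTO teacher_details VALUES (111, 38), (222, 38), (333, 40);
INSERT INTO teacher_subject VALUES (111, 'Maths'), (111, 'Physics'),
    (222, 'Biology'), (333, 'Physics'), (333, 'Chemistry');
""")

# Joining on teacher_id rebuilds one row per (teacher, subject) with the age attached.
rejoined = conn.execute("""
    SELECT S.teacher_id, S.subject, D.teacher_age
    FROM teacher_subject S, teacher_details D
    WHERE S.teacher_id = D.teacher_id
    ORDER BY S.teacher_id, S.subject""").fetchall()

# After the decomposition, an age change touches exactly one row.
changed = conn.execute(
    "UPDATE teacher_details SET teacher_age = 39 WHERE teacher_id = 111").rowcount
print(rejoined, changed)
```

In a single combined table the same age change would have to be applied to both of teacher 111's rows, which is the redundancy that 2NF removes.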
An attribute that is not part of any candidate key is known as non-prime attribute.
In other words, 3NF can be explained like this: a table is in 3NF if it is in 2NF and, for
each functional dependency X -> Y, at least one of the following conditions holds:
• X is a super key of the table
• Y is a prime attribute of the table
An attribute that is a part of one of the candidate keys is known as prime attribute.
Example: Suppose a company wants to store the complete address of each employee,
they create a table named employee_details that looks like this:
Here, emp_state, emp_city & emp_district are dependent on emp_zip, and emp_zip is dependent on
emp_id. That makes the non-prime attributes (emp_state, emp_city & emp_district) transitively
dependent on the super key (emp_id). This violates the rule of 3NF.
emp_id -> emp_zip
emp_zip does not -> emp_id
emp_zip -> {emp_state, emp_city, emp_district}
Therefore, transitively:
emp_id -> {emp_state, emp_city, emp_district}
To make this table comply with 3NF, we have to break it into two tables to remove the
transitive dependency:
employee table:
employee_zip table:
A table complies with BCNF if it is in 3NF and, for every functional dependency X -> Y,
X is a super key of the table.
Example: Suppose there is a company in which employees work in more than one
department. They store the data like this:
The table is not in BCNF, as neither emp_id nor emp_dept alone is a key.
To make the table comply with BCNF, we can break it into three tables like this:
emp_nationality table:
emp_id emp_nationality
1001 Austrian
1002 American
emp_dept_mapping table:
emp_id emp_dept
1001 Production and planning
1001 stores
1002 design and technical support
1002 Purchasing department
Functional dependencies:
emp_id -> emp_nationality
emp_dept -> {dept_type, dept_no_of_emp}
Candidate keys:
For first table: emp_id
For second table: emp_dept
For third table: {emp_id, emp_dept}
The decomposition is now in BCNF, as the left side of both functional dependencies is a key.
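The BCNF condition can be checked mechanically with attribute closures. The sketch below is plain Python; the function names closure and bcnf_violations are hypothetical, and the check is simplified in that it only considers the listed dependencies that fall entirely inside a table, rather than projecting all implied dependencies.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Listed FDs inside this table whose left side is not a super key of it."""
    local = [(l, r) for l, r in fds if l | r <= attrs]
    return [(l, r) for l, r in local if not attrs <= closure(l, local)]

fds = [({"emp_id"}, {"emp_nationality"}),
       ({"emp_dept"}, {"dept_type", "dept_no_of_emp"})]

# The undecomposed table holds all five attributes.
original = {"emp_id", "emp_nationality", "emp_dept", "dept_type", "dept_no_of_emp"}
tables = [{"emp_id", "emp_nationality"},
          {"emp_dept", "dept_type", "dept_no_of_emp"},
          {"emp_id", "emp_dept"}]

before = bcnf_violations(original, fds)
after = [bcnf_violations(t, fds) for t in tables]
print(len(before), after)
```

Both dependencies violate BCNF in the single table, because neither emp_id nor emp_dept alone determines all five attributes, and none of the three decomposed tables has a violation.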
A super key is a set of one or more attributes (columns), which can uniquely identify a row in a
table.
Candidate keys are selected from the set of super keys. The only requirement when
selecting a candidate key is that it should not have any redundant attribute. That is the reason
candidate keys are also termed minimal super keys.
Table: Employee
The above table has the following super keys. Each of the following sets is able to
uniquely identify a row of the Employee table.
{Emp_SSN}
{Emp_Number}
{Emp_SSN, Emp_Number}
{Emp_SSN, Emp_Name}
{Emp_SSN, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}
A candidate key is a minimal super key with no redundant attributes. The following two
super keys are chosen from the above sets, as they contain no redundant attributes.
{Emp_SSN}
{Emp_Number}
A primary key is selected from the set of candidate keys. This is done by the database admin or
database designer. Either {Emp_SSN} or {Emp_Number} can be chosen as the
primary key for the table Employee.
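The step from super keys to candidate keys amounts to keeping only the minimal sets. A short sketch in plain Python, using the six super keys listed for the Employee table:

```python
# The six super keys listed for the Employee table.
super_keys = [
    {"Emp_SSN"},
    {"Emp_Number"},
    {"Emp_SSN", "Emp_Number"},
    {"Emp_SSN", "Emp_Name"},
    {"Emp_SSN", "Emp_Number", "Emp_Name"},
    {"Emp_Number", "Emp_Name"},
]

# A super key is a candidate key if no other super key is a proper subset of it
# (other < k is Python's proper-subset test for sets).
candidate_keys = [
    k for k in super_keys if not any(other < k for other in super_keys)
]
print(candidate_keys)
```

Every multi-attribute set is discarded because it contains {Emp_SSN} or {Emp_Number}, leaving those two as the candidate keys.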
A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued
dependency.
For a dependency A → B, if for a single value of A multiple values of B exist, then the
dependency is a multi-valued dependency.
Ex:
STUDENT
In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and
Math, and two hobbies, Dancing and Singing. So there is a multi-valued dependency on
STU_ID, which leads to unnecessary repetition of data.
So to bring the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
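Because courses and hobbies are independent, joining the two tables pairs every course of a student with every hobby of that student. The sketch below shows this in plain Python with the rows of STUDENT_COURSE and STUDENT_HOBBY listed above; the variable names are hypothetical.

```python
from itertools import product

student_course = [(21, "Computer"), (21, "Math"), (34, "Chemistry"),
                  (74, "Biology"), (59, "Physics")]
student_hobby = [(21, "Dancing"), (21, "Singing"), (34, "Dancing"),
                 (74, "Cricket"), (59, "Hockey")]

# Natural join on STU_ID: every course of a student pairs with every hobby.
joined = sorted(
    (sid, course, hobby)
    for (sid, course), (sid2, hobby) in product(student_course, student_hobby)
    if sid == sid2
)
rows_for_21 = [row for row in joined if row[0] == 21]
print(len(joined), rows_for_21)
```

Student 21 needs four rows in a single combined table (two courses times two hobbies), whereas the decomposed tables record each course and each hobby only once.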
Properties of Decomposition:
When a relation in the relational model is not in an appropriate normal form, the relation
must be decomposed.
Decomposition breaks a table into multiple tables.
If a relation is not decomposed properly, it may lead to problems such as loss of
information.
Decomposition is used to eliminate some of the problems of bad design, such as anomalies,
inconsistencies, and redundancy.
If no information is lost from the relation that is decomposed, then the decomposition
is lossless.
A lossless decomposition guarantees that the join of the decomposed relations results in the
same relation that was decomposed.
A decomposition is said to be lossless if the natural join of all the decomposed relations
gives the original relation.
Example:
EMPLOYEE_DEPARTMENT table:
The above relation is decomposed into two relations EMPLOYEE and DEPARTMENT
EMPLOYEE table:
Now, when these two relations are joined on the common column "EMP_ID", then the resultant
relation will look like:
Employee ⋈ Department
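Whether a decomposition is lossless can be tested on an instance by projecting, joining, and comparing with the original relation. The sketch below does this in plain Python. The four rows and the helper names project, natural_join, and same_relation are hypothetical; they are not the rows of the EMPLOYEE_DEPARTMENT table in the notes.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, removing duplicates."""
    return [dict(t) for t in {tuple((a, row[a]) for a in attrs) for row in rows}]

def natural_join(r1, r2):
    """Join two relations on all attributes they have in common."""
    out = []
    for a in r1:
        for b in r2:
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.append({**a, **b})
    return out

def same_relation(r1, r2):
    """Compare two relations as sets of rows, ignoring row order."""
    as_set = lambda rows: {frozenset(row.items()) for row in rows}
    return as_set(r1) == as_set(r2)

# Hypothetical instance: two employees work in the Sales department.
emp_dept = [
    {"EMP_ID": 22, "EMP_NAME": "Denim", "DEPT_ID": 827, "DEPT_NAME": "Sales"},
    {"EMP_ID": 33, "EMP_NAME": "Alina", "DEPT_ID": 438, "DEPT_NAME": "Marketing"},
    {"EMP_ID": 46, "EMP_NAME": "Stephan", "DEPT_ID": 869, "DEPT_NAME": "Finance"},
    {"EMP_ID": 52, "EMP_NAME": "Katherine", "DEPT_ID": 827, "DEPT_NAME": "Sales"},
]

# Decomposition that shares the key EMP_ID: the join restores the original.
employee = project(emp_dept, ["EMP_ID", "EMP_NAME"])
department = project(emp_dept, ["EMP_ID", "DEPT_ID", "DEPT_NAME"])
lossless = same_relation(natural_join(employee, department), emp_dept)

# Decomposition that shares only DEPT_NAME: the join adds spurious rows.
left = project(emp_dept, ["EMP_ID", "DEPT_ID", "DEPT_NAME"])
right = project(emp_dept, ["EMP_NAME", "DEPT_NAME"])
lossy_join = natural_join(left, right)
lossy_ok = same_relation(lossy_join, emp_dept)
print(lossless, lossy_ok, len(lossy_join))
```

The second decomposition yields six rows instead of four, because the two Sales employees are paired with each other's names. The extra rows are why such a decomposition is called lossy: the original pairing information can no longer be recovered.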
A relation is in 5NF if it is in 4NF and contains no join dependency that is not implied by its
candidate keys; any join of its decomposed relations should be lossless.
Example:
In the above table, John takes both the Computer and Math classes for Semester 1, but he
doesn't take the Math class for Semester 2. In this case, a combination of all these fields is
required to identify valid data.
Suppose we add a new semester, Semester 3, but do not know the subject or
who will be taking that subject, so we leave Lecturer and Subject as NULL. But all three
columns together act as the primary key, so we can't leave the other two columns blank.
So to bring the above table into 5NF, we can decompose it into three relations R1, R2 &
R3:
R1
SEMESTER SUBJECT
Semester 1 Computer
Semester 1 Math
Semester 1 Chemistry
Semester 2 Math
R2
SUBJECT LECTURER
Computer Anshika
Computer John
Math John
Math Akash
Chemistry Praveen
R3
SEMESTER LECTURER
Semester 1 Anshika
Semester 1 John
Semester 2 Akash
Semester 1 Praveen
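The role of the third relation can be seen by joining the projections step by step. The sketch below uses plain Python with the rows of R1, R2, and R3 listed above (R3 taken as a set, so the repeated Semester 1 / John row counts once); the variable names are hypothetical.

```python
# R1 (SEMESTER, SUBJECT), R2 (SUBJECT, LECTURER), R3 (SEMESTER, LECTURER)
r1 = [("Semester 1", "Computer"), ("Semester 1", "Math"),
      ("Semester 1", "Chemistry"), ("Semester 2", "Math")]
r2 = [("Computer", "Anshika"), ("Computer", "John"), ("Math", "John"),
      ("Math", "Akash"), ("Chemistry", "Praveen")]
r3 = {("Semester 1", "Anshika"), ("Semester 1", "John"),
      ("Semester 2", "Akash"), ("Semester 1", "Praveen")}

# Join R1 and R2 on SUBJECT.
two_way = [(sem, subj, lect)
           for sem, subj in r1
           for subj2, lect in r2
           if subj == subj2]

# Joining with R3 keeps only the (SEMESTER, LECTURER) pairs that really occur.
three_way = [row for row in two_way if (row[0], row[2]) in r3]
spurious = [row for row in two_way if row not in three_way]
print(len(two_way), len(three_way), spurious)
```

Joining only R1 and R2 produces rows such as (Semester 2, Math, John), which contradicts the statement that John does not take the Math class for Semester 2. The join with R3 removes these rows, so all three relations are needed to reconstruct the data without spurious tuples.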
*****