Unit - III

The document provides an overview of database management systems, focusing on relational algebra and SQL queries related to sailors, boats, and reservations. It includes recommended textbooks, sample data for three tables (Sailors, Boats, Reserves), and various SQL query examples for retrieving information from these tables. Additionally, it discusses the creation of tables, data insertion, and the use of SQL commands for data manipulation and retrieval.

Uploaded by

nencydey04

Database Management Systems

Unit - II
Recommended Books:
1. A. Silberschatz, H. Korth and S. Sudarshan: Database System Concepts, McGraw Hill
2. R. Ramakrishnan and J. Gehrke: Database Management Systems, McGraw Hill
3. R. Elmasri and S. B. Navathe: Fundamentals of Database Systems, Addison Wesley
SAILORS
SID  SNAME  RATING  AGE
22   A      7       45
29   B      1       33
31   C      8       55
32   D      8       25
58   E      10      35
64   F      7       35
71   G      10      16
74   F      9       35
85   I      3       25
95   J      3       63

RESERVES
SID  BID  DAY
22   101  10/10/98
22   102  10/10/98
22   103  10/8/98
22   104  10/7/98
31   102  11/10/98
31   103  11/6/98
31   104  11/12/98
64   101  9/5/98
64   102  9/8/98
74   103  9/8/98

BOATS
BID  BNAME      COLOR
101  INTERLAKE  BLUE
102  INTERLAKE  RED
103  CLIPPER    GREEN
104  MARINE     RED
Questions – Using Relational Algebra Expressions
1. Find the names of sailors who have reserved boat 103.
2. Find the names of sailors who have reserved a red boat.
3. Find the colors of boats reserved by C.
4. Find the names of sailors who have reserved at least one boat.
5. Find the names of sailors who have reserved a red or a green boat.
6. Find the names of sailors who have reserved a red and a green boat.
7. Find the names of sailors who have reserved at least two boats.
8. Find the sids of sailors with age over 20 who have not reserved a red boat.
9. Find the names of sailors who have reserved all boats.
10. Find the names of sailors who have reserved all boats called Interlake.
1. Find the names of sailors who have reserved boat 103.

Answer:
sname: A, C, F
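
The algebra expression itself is not shown in these notes. One standard way to write this query, offered as a sketch rather than as the slide's own formulation, selects the Reserves tuples for boat 103, joins them with Sailors on sid, and projects the name:

```latex
\[
\pi_{sname}\bigl( (\sigma_{bid = 103}\, \mathit{Reserves}) \bowtie \mathit{Sailors} \bigr)
\]
```

On the sample data the selection keeps the Reserves tuples with sids 22, 31 and 74, whose names are A, C and F.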
2. Find the names of sailors who have reserved a red boat.

This query involves a series of two joins. First we choose (tuples
describing) red boats. Then we join this set with Reserves (natural join,
with equality specified on the bid column) to identify reservations of red
boats. Next we join the resulting intermediate relation with Sailors
(natural join, with equality specified on the sid column) to retrieve the
names of sailors who have made reservations of red boats. Finally, we
project the sailors' names.
Answer:
Sname: A, C, F
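
The two-join description of this query can be written as a single expression. This is a sketch in standard notation, not necessarily the exact form used on the slide:

```latex
\[
\pi_{sname}\bigl( (\sigma_{color = \text{'red'}}\, \mathit{Boats}) \bowtie \mathit{Reserves} \bowtie \mathit{Sailors} \bigr)
\]
```

The red boats are 102 and 104; they are reserved by sids 22, 31 and 64, which gives the names A, C and F.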
3. Find the colors of boats reserved by C.

Answer:
Color: green and red
4. Find the names of sailors who have reserved at least one boat.

A Sailors tuple appears in (some tuple of) the join of Sailors and Reserves
only if at least one Reserves tuple has the same sid value, that is, the
sailor has made some reservation.

Answer: sname: A,C,F


5. Find the names of sailors who have reserved a red or a green boat.

We identify the set of all boats that are either red or green (Tempboats, which
contains boats with the bids 102, 103, and 104 on instances B, R, and S). Then
we join with Reserves to identify sids of sailors who have reserved one of these
boats; this gives us sids 22, 31, 64, and 74 over our example instances. Finally,
we join (an intermediate relation containing this set of sids) with Sailors to find
the names of Sailors with these sids. This gives us the names A, C, and F on the
instances B, R, and S.
6. Find the names of sailors who have reserved a red and a green boat.

A tempting approach is to take the solution to question 5 and select boats
that are red and green instead of red or green. This is incorrect: it tries
to compute sailors who have reserved a boat that is both red and green.
(Since bid is a key for Boats, a boat can be only one color; this query will
always return an empty answer set.)

The correct approach is to find sailors who have reserved a red boat, then
sailors who have reserved a green boat, and then take the intersection of
these two sets:

Answer: sname: A,C
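
The intersection approach can be sketched as follows. The relation names Tempred and Tempgreen are illustrative labels, not names taken from the slide:

```latex
\begin{align*}
&\rho(\mathit{Tempred},\ \pi_{sid}((\sigma_{color = \text{'red'}}\, \mathit{Boats}) \bowtie \mathit{Reserves})) \\
&\rho(\mathit{Tempgreen},\ \pi_{sid}((\sigma_{color = \text{'green'}}\, \mathit{Boats}) \bowtie \mathit{Reserves})) \\
&\pi_{sname}((\mathit{Tempred} \cap \mathit{Tempgreen}) \bowtie \mathit{Sailors})
\end{align*}
```

The intersection is taken on sids, not on names. On the sample data Tempred is {22, 31, 64} and Tempgreen is {22, 31, 74}, so the result is A and C. Intersecting names instead would wrongly include F, because two different sailors named F (sids 64 and 74) reserved a red and a green boat respectively.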


7. Find the names of sailors who have reserved at least two boats.

Answer: sname: A,C,F


First we compute tuples of the form ⟨sid, sname, bid⟩, where sailor sid has made
a reservation for boat bid; this set of tuples is the temporary relation
Reservations. Next we find all pairs of Reservations tuples where the same
sailor has made both reservations and the boats involved are distinct. Here is
the central idea: In order to show that a sailor has reserved two boats, we
must find two Reservations tuples involving the same sailor but distinct boats.
Over instances B, R, and S, the sailors with sids 22, 31, and 64 have each
reserved at least two boats. Finally, we project the names of such sailors to
obtain the answer, containing the names A, C and F.
8. Find the sids of sailors with age over 20 who have not reserved a red boat.

This query illustrates the use of the set-difference operator. Again, we use the fact
that sid is the key for Sailors. We first identify sailors aged over 20 (over instances B,
R, and S, sids 22, 29, 31, 32, 58, 64, 74, 85, and 95) and then discard those who have
reserved a red boat (sids 22, 31, and 64), to obtain the answer (sids 29, 32, 58, 74,
85, and 95). If we want to compute the names of such sailors, we must first compute
their sids (as shown above), and then join with Sailors and project the sname values.
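
The set-difference step described here can be written as the following sketch in standard notation:

```latex
\[
\pi_{sid}(\sigma_{age > 20}\, \mathit{Sailors})
\;-\;
\pi_{sid}\bigl( (\sigma_{color = \text{'red'}}\, \mathit{Boats}) \bowtie \mathit{Reserves} \bowtie \mathit{Sailors} \bigr)
\]
```

The left operand evaluates to {22, 29, 31, 32, 58, 64, 74, 85, 95} and the right operand to {22, 31, 64}, leaving {29, 32, 58, 74, 85, 95}.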
9. Find the names of sailors who have reserved all boats.

The intermediate relation Tempsids is defined using division, and computes
the set of sids of sailors who have reserved every boat (over instances B, R,
and S, this is just sid 22). Division then returns all sids such that there is a
tuple ⟨sid, bid⟩ in the first relation for each bid in the second. Joining Tempsids
with Sailors is necessary to associate names with the selected sids; for sailor
22, the name is A.
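
A sketch of the division-based definition, using the name Tempsids from the text and the usual division operator:

```latex
\begin{align*}
&\rho(\mathit{Tempsids},\ (\pi_{sid,\,bid}\, \mathit{Reserves}) \,/\, (\pi_{bid}\, \mathit{Boats})) \\
&\pi_{sname}(\mathit{Tempsids} \bowtie \mathit{Sailors})
\end{align*}
```

Only sid 22 is paired with every bid in {101, 102, 103, 104}, so Tempsids is {22}.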
10. Find the names of sailors who have reserved all boats called Interlake.

Now we apply a selection to Boats, to ensure that we compute only bids of
boats named Interlake in defining the second argument to the division
operator. Over instances B, R, and S, Tempsids evaluates to sids 22 and 64,
and the answer contains their names, A and F.
Tutorial 5: SQL

By Chaofa Gao

Tables used in this note:

Sailors(sid: integer, sname: string, rating: integer, age: real);
Boats(bid: integer, bname: string, color: string);
Reserves(sid: integer, bid: integer, day: date).

Sailors
sid  sname    rating  age
22   Dustin   7       45
29   Brutus   1       33
31   Lubber   8       55.5
32   Andy     8       25.5
58   Rusty    10      35
64   Horatio  7       35
71   Zorba    10      16
74   Horatio  9       35
85   Art      3       25.5
95   Bob      3       63.5

Boats
bid  bname      color
101  Interlake  blue
102  Interlake  red
103  Clipper    green
104  Marine     red

Reserves
sid  bid  day
22   101  1998-10-10
22   102  1998-10-10
22   103  1998-10-8
22   104  1998-10-7
31   102  1998-11-10
31   103  1998-11-6
31   104  1998-11-12
64   101  1998-9-5
64   102  1998-9-8
74   103  1998-9-8

Figure 1: Instances of Sailors, Boats and Reserves

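1. Create the Tables:

CREATE TABLE sailors ( sid integer not null,
                       sname varchar(32),
                       rating integer,
                       age real,
                       CONSTRAINT PK_sailors PRIMARY KEY (sid) );

-- boats follows the Boats schema listed above; it must exist before
-- reserves, whose foreign key refers to it
CREATE TABLE boats ( bid integer not null,
                     bname varchar(32),
                     color varchar(32),
                     CONSTRAINT PK_boats PRIMARY KEY (bid) );

CREATE TABLE reserves ( sid integer not null,
                        bid integer not null,
                        day date not null,
                        CONSTRAINT PK_reserves PRIMARY KEY (sid, bid, day),
                        FOREIGN KEY (sid) REFERENCES sailors(sid),
                        FOREIGN KEY (bid) REFERENCES boats(bid) );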
2. Insert Data

INSERT INTO sailors ( sid, sname, rating, age )
VALUES ( 22, 'Dustin', 7, 45.0 );

INSERT INTO reserves ( sid, bid, day )
VALUES ( 22, 101, '1998-10-10' );

Note that the date can be written in one of the following formats:
yyyy-mm-dd, mm-dd-yyyy and mm/dd/yyyy
In addition, DB2 lets you extract parts of a date attribute using its month(), year() and day() functions,
e.g. select * from reserves where year(day) = 1998 and month(day) = 10

3. Simple SQL Query

The basic form of an SQL query:


SELECT [DISTINCT] select-list
FROM from-list
WHERE qualification

Ex1: Using DISTINCT

SELECT sname, age
FROM sailors
or
SELECT S.sname, S.age
FROM sailors S

sname    age
Dustin   45
Brutus   33
Lubber   55.5
Andy     25.5
Rusty    35
Horatio  35
Zorba    16
Horatio  35
Art      25.5
Bob      63.5

SELECT DISTINCT S.sname, S.age
FROM sailors AS S

sname    age
Andy     25.5
Art      25.5
Bob      63.5
Brutus   33
Dustin   45
Horatio  35
Lubber   55.5
Rusty    35
Zorba    16

Ex2. Find all information about sailors who have reserved boat number 103.
SELECT S.*
FROM Sailors S, Reserves R
WHERE S.sid = R.sid AND R.bid = 103
Or without using the range variables, S and R
SELECT Sailors.*
FROM Sailors, Reserves
WHERE Sailors.sid = Reserves.sid AND Reserves.bid = 103

* can be used if you want to retrieve all columns.

Ex3. Find the names and ages of sailors who have reserved a red boat, listed in order of age.
SELECT S.sname, S.age
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
ORDER BY S.age

ORDER BY S.age [ASC] sorts in ascending order (the default);
ORDER BY S.age DESC sorts in descending order.

Ex4. Find the names of sailors who have reserved at least one boat.
SELECT sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid

The join of Sailors and Reserves ensures that for each selected sname, the sailor has made some
reservation. A sailor with several reservations appears once per reservation unless DISTINCT is used.

Ex5. Find the ids and names of sailors who have reserved two different boats on the same day.
SELECT DISTINCT S.sid, S.sname
FROM Sailors S, Reserves R1, Reserves R2
WHERE S.sid = R1.sid AND S.sid = R2.sid
AND R1.day = R2.day AND R1.bid <> R2.bid

Ex6. Using Expressions and Strings in the SELECT Command.

SELECT sname, age, rating + 1 AS sth
FROM Sailors
WHERE 2 * rating - 1 < 10 AND sname LIKE 'B_%b'

SQL provides pattern matching through the LIKE operator, along with two wildcard symbols:
% (which stands for zero or more arbitrary characters) and
_ (which stands for exactly one arbitrary character)
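
To see which rows the Ex6 query keeps, here is a minimal sketch that runs it against the Figure 1 sailors. It assumes SQLite through Python's standard sqlite3 module rather than DB2; SQLite's LIKE is case-insensitive for ASCII letters by default, which does not change the result on this data.

```python
import sqlite3

# Sailors rows from Figure 1 (sid, sname, rating, age).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO sailors VALUES (?, ?, ?, ?)", SAILORS)

# 'B_%b': starts with B, then exactly one character, then anything, ends with b.
ex6_rows = conn.execute(
    """
    SELECT sname, age, rating + 1 AS sth
    FROM sailors
    WHERE 2 * rating - 1 < 10 AND sname LIKE 'B_%b'
    """
).fetchall()
print(ex6_rows)
```

Brutus passes the rating test but ends in "s", so only Bob matches both conditions.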
4. Union, Intersect and Except

Note that Union, Intersect and Except can be used on only two tables that are union-compatible,
that is, have the same number of columns and the columns, taken in order, have the same types.

Ex7. Find the ids of sailors who have reserved a red boat or a green boat.
SELECT R.sid
FROM Boats B, Reserves R
WHERE R.bid = B.bid AND B.color = 'red'
UNION
SELECT R2.sid
FROM Boats B2, Reserves R2
WHERE R2.bid = B2.bid AND B2.color = 'green'

The answer contains the sids 22, 31, 64 and 74.


The default for UNION queries is that duplicates are eliminated. To retain duplicates, use
UNION ALL.
Replace UNION with UNION ALL. The answer contains: 22 31 74 22 31 64 22 31
Replace UNION with INTERSECT. The answer contains: 22 31.
Replace UNION with EXCEPT. The answer contains just the id 64.
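
The four variants can be checked side by side. The following is a minimal sketch, assuming SQLite through Python's standard sqlite3 module rather than DB2; the helper name sids_by_color is hypothetical, and results are sorted because SQL does not guarantee a row order here.

```python
import sqlite3

BOATS = [(101, "Interlake", "blue"), (102, "Interlake", "red"),
         (103, "Clipper", "green"), (104, "Marine", "red")]
RESERVES = [(22, 101), (22, 102), (22, 103), (22, 104), (31, 102),
            (31, 103), (31, 104), (64, 101), (64, 102), (74, 103)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE reserves (sid INTEGER, bid INTEGER);
""")
conn.executemany("INSERT INTO boats VALUES (?, ?, ?)", BOATS)
conn.executemany("INSERT INTO reserves VALUES (?, ?)", RESERVES)


def sids_by_color(set_op):
    """Combine the 'red' and 'green' queries of Ex7 with the given set operator."""
    sql = f"""
        SELECT R.sid FROM boats B, reserves R
        WHERE R.bid = B.bid AND B.color = 'red'
        {set_op}
        SELECT R2.sid FROM boats B2, reserves R2
        WHERE R2.bid = B2.bid AND B2.color = 'green'
    """
    return sorted(row[0] for row in conn.execute(sql))


union_sids = sids_by_color("UNION")
union_all_sids = sids_by_color("UNION ALL")
intersect_sids = sids_by_color("INTERSECT")
except_sids = sids_by_color("EXCEPT")
print(union_sids, union_all_sids, intersect_sids, except_sids)
```

UNION ALL returns eight rows because the red query alone yields five (sailors 22 and 31 each reserved both red boats) and the green query yields three.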

6. Nested Query

IN and NOT IN
EXISTS and NOT EXISTS
UNIQUE and NOT UNIQUE
op ANY
op ALL

EX8: Find the names of sailors who have reserved boat 103.
SELECT S.sname
FROM Sailors S
WHERE S.sid IN ( SELECT R.sid
FROM Reserves R
WHERE R.bid = 103 )

The inner subquery is completely independent of the outer query.

The same query written as a correlated nested query:
SELECT S.sname
FROM Sailors S
WHERE EXISTS ( SELECT *
FROM Reserves R
WHERE R.bid = 103
AND R.sid = S.sid )

The inner query depends on the row that is currently being examined in the outer query.

EX9: Find the name and the age of the youngest sailor.
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age <= ALL ( SELECT age
FROM Sailors )

EX10: Find the names and ratings of sailors whose rating is better than that of some sailor called Horatio.
SELECT S.sname, S.rating
FROM Sailors S
WHERE S.rating > ANY ( SELECT S2.rating
FROM Sailors S2
WHERE S2.sname = 'Horatio')

Note that IN and NOT IN are equivalent to = ANY and <> ALL, respectively.

EX11: Find the names of sailors who have reserved all boats.
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS ( ( SELECT B.bid
FROM Boats B)
EXCEPT
( SELECT R.bid
FROM Reserves R
WHERE R.sid = S.sid ))
An alternative solution:
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS ( SELECT B.bid
FROM Boats B
WHERE NOT EXISTS ( SELECT R.bid
FROM Reserves R
WHERE R.bid = B.bid
AND R.sid = S.sid ) )
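
The double NOT EXISTS form reads as "there is no boat that this sailor has not reserved". Below is a minimal sketch that runs it on the Figure 1 data, assuming SQLite through Python's standard sqlite3 module rather than DB2. The optional extra_boat_filter parameter is a hypothetical addition used to restrict the set of boats, for example to those called Interlake.

```python
import sqlite3

SAILORS = [(22, "Dustin"), (29, "Brutus"), (31, "Lubber"), (32, "Andy"),
           (58, "Rusty"), (64, "Horatio"), (71, "Zorba"), (74, "Horatio"),
           (85, "Art"), (95, "Bob")]
BOATS = [(101, "Interlake"), (102, "Interlake"), (103, "Clipper"), (104, "Marine")]
RESERVES = [(22, 101), (22, 102), (22, 103), (22, 104), (31, 102),
            (31, 103), (31, 104), (64, 101), (64, 102), (74, 103)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE boats (bid INTEGER PRIMARY KEY, bname TEXT);
CREATE TABLE reserves (sid INTEGER, bid INTEGER);
""")
conn.executemany("INSERT INTO sailors VALUES (?, ?)", SAILORS)
conn.executemany("INSERT INTO boats VALUES (?, ?)", BOATS)
conn.executemany("INSERT INTO reserves VALUES (?, ?)", RESERVES)


def reserved_all(extra_boat_filter="1 = 1"):
    """Names of sailors for whom no qualifying boat is left unreserved."""
    sql = f"""
        SELECT S.sname FROM sailors S
        WHERE NOT EXISTS (
            SELECT B.bid FROM boats B
            WHERE {extra_boat_filter}
              AND NOT EXISTS (SELECT R.bid FROM reserves R
                              WHERE R.bid = B.bid AND R.sid = S.sid))
    """
    return sorted(row[0] for row in conn.execute(sql))


all_boats = reserved_all()
all_interlake = reserved_all("B.bname = 'Interlake'")
print(all_boats, all_interlake)
```

The results agree with the relational algebra answers for questions 9 and 10: only sailor 22 reserved every boat, while sailors 22 and 64 reserved both Interlake boats.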

7. Aggregation Operators

COUNT ([DISTINCT] A): The number of (unique) values in the A column.


SUM ([DISTINCT] A): The sum of all (unique) values in the A column.
AVG ([DISTINCT] A): The average of all (unique) values in the A column.
MAX (A): The maximum value in the A column.
MIN (A): The minimum value in the A column.

EX12: Count the number of different sailor names.


SELECT COUNT( DISTINCT S.sname )
FROM Sailors S

EX13: Calculate the average age of all sailors.


SELECT AVG(S.age)
FROM Sailors S

EX14: Find the name and the age of the youngest sailor.
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = (SELECT MIN(S2.age)
FROM Sailors S2 )

SELECT [DISTINCT] select-list


FROM from-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification

EX15: Find the average age of sailors for each rating level.
SELECT S.rating, AVG(S.age) AS avg_age
FROM Sailors S
GROUP BY S.rating

Rating  avg_age
1       33
3       44.5
7       40
8       40.5
9       35
10      25.5

EX16: Find the average age of sailors for each rating level that has at least two sailors.
SELECT S.rating, AVG(S.age) AS avg_age
FROM Sailors S
GROUP BY S.rating
HAVING COUNT(*) > 1

Rating  avg_age
3       44.5
7       40
8       40.5
10      25.5

EX17: An example showing the difference between WHERE and HAVING:

SELECT S.rating, AVG(S.age) AS avg_age
FROM Sailors S
WHERE S.age >= 40
GROUP BY S.rating

Rating  avg_age
3       63.5
7       45
8       55.5

SELECT S.rating, AVG(S.age) AS avg_age
FROM Sailors S
GROUP BY S.rating
HAVING AVG(S.age) >= 40

Rating  avg_age
3       44.5
7       40
8       40.5
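
WHERE filters individual rows before grouping, while HAVING filters whole groups after aggregation. The sketch below reproduces both result tables, assuming SQLite through Python's standard sqlite3 module rather than DB2. It also assumes age 35 for sailor 74, which is the value the query outputs in this note are consistent with.

```python
import sqlite3

# (sid, sname, rating, age); sailor 74 is given age 35 (assumption, see above).
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO sailors VALUES (?, ?, ?, ?)", SAILORS)

# WHERE: drop sailors younger than 40 first, then average what is left per rating.
where_rows = conn.execute("""
    SELECT S.rating, AVG(S.age) AS avg_age
    FROM sailors S
    WHERE S.age >= 40
    GROUP BY S.rating
    ORDER BY S.rating
""").fetchall()

# HAVING: average every sailor per rating, then drop groups whose average is below 40.
having_rows = conn.execute("""
    SELECT S.rating, AVG(S.age) AS avg_age
    FROM sailors S
    GROUP BY S.rating
    HAVING AVG(S.age) >= 40
    ORDER BY S.rating
""").fetchall()

print(where_rows)
print(having_rows)
```

The same three ratings survive in both queries, but the averages differ because the WHERE version computes them over fewer rows.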

5. NULL value and OUTER JOIN

In the presence of null values, the WHERE clause eliminates any row for which the qualification
evaluates to false or to unknown.

Two rows are duplicates if corresponding columns are either equal, or both contain null.
(If we compare two null values using =, the result is unknown.)

The arithmetic operations +, -, * and / all return null if one of their arguments is null.

COUNT(*) handles null values just like other values. All the other aggregate operations (COUNT(A), SUM,
AVG, MAX, MIN, and variations using DISTINCT) simply discard null values.

After:
INSERT INTO sailors ( sid, sname, rating, age )
VALUES ( 99, 'Dan', null, 48.0 );

SELECT COUNT(*) FROM Sailors will return 11
SELECT COUNT(rating) FROM Sailors will return 10
SELECT COUNT(age) FROM Sailors will return 11
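
These null-handling rules can be verified directly. The following is a minimal sketch, assuming SQLite through Python's standard sqlite3 module rather than DB2; the helper name scalar is hypothetical, and SQL null shows up as Python's None.

```python
import sqlite3

# The ten Figure 1 sailors plus Dan, whose rating is null.
SAILORS = [
    (22, "Dustin", 7, 45.0), (29, "Brutus", 1, 33.0), (31, "Lubber", 8, 55.5),
    (32, "Andy", 8, 25.5), (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0),
    (71, "Zorba", 10, 16.0), (74, "Horatio", 9, 35.0), (85, "Art", 3, 25.5),
    (95, "Bob", 3, 63.5), (99, "Dan", None, 48.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO sailors VALUES (?, ?, ?, ?)", SAILORS)


def scalar(sql):
    """Run a query that returns one row with one column and return that value."""
    return conn.execute(sql).fetchone()[0]


count_all = scalar("SELECT COUNT(*) FROM sailors")          # counts every row
count_rating = scalar("SELECT COUNT(rating) FROM sailors")  # skips the null rating
count_age = scalar("SELECT COUNT(age) FROM sailors")        # no null ages
null_eq_null = scalar("SELECT NULL = NULL")                 # unknown, returned as None
one_plus_null = scalar("SELECT 1 + NULL")                   # arithmetic with null is null
rating_eq_null = scalar("SELECT COUNT(*) FROM sailors WHERE rating = NULL")
print(count_all, count_rating, count_age, null_eq_null, one_plus_null, rating_eq_null)
```

The last query returns 0 even though Dan's rating is null: the comparison evaluates to unknown, so the WHERE clause eliminates the row. Testing for null requires IS NULL.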
An example of OUTER JOIN:

SELECT sailors.sid, sailors.sname, reserves.bid
FROM sailors LEFT OUTER JOIN reserves ON reserves.sid = sailors.sid
ORDER BY sailors.sid

Sailors with no reservation still appear, with a null bid (shown blank):

sid  sname    bid
22   Dustin   101
22   Dustin   102
22   Dustin   103
22   Dustin   104
29   Brutus
31   Lubber   102
31   Lubber   103
31   Lubber   104
32   Andy
58   Rusty
64   Horatio  101
64   Horatio  102
71   Zorba
74   Horatio  103
85   Art
95   Bob
99   Dan
Normalization – Why?
• Database normalization is the process of organizing the attributes of a
database to reduce or eliminate data redundancy (the same data stored in
different places).
• Normalization minimizes redundancy in a relation or set of relations.
Redundancy in a relation may cause insertion, deletion and update
anomalies. Normal forms are used to eliminate or reduce redundancy in
database tables.
• Problems caused by data redundancy:
Data redundancy unnecessarily increases the size of the database, because the
same data is repeated in many places. Inconsistency problems also arise
during insert, delete and update operations.
Functional Dependency
• A functional dependency is denoted by an arrow (→). If an attribute A functionally
determines B, it is written as A → B.
• A functional dependency A → B means that all tuples with the same value of A have the
same value of B.
• For example, in the table below A → B holds, but B → A does not, as there are different
values of A for B = 3.
A B
------
1 3
2 3
4 0
1 3
4 0
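
Checking whether a functional dependency holds in a given table instance is mechanical: every value of the left-hand side must map to a single value of the right-hand side. The sketch below applies that test to the table above; the function name fd_holds is hypothetical. An instance can only refute a dependency, since whether it holds in general is a property of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts keyed by attribute)."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen; a later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"A": 1, "B": 3},
    {"A": 2, "B": 3},
    {"A": 4, "B": 0},
    {"A": 1, "B": 3},
    {"A": 4, "B": 0},
]

a_determines_b = fd_holds(rows, ["A"], ["B"])
b_determines_a = fd_holds(rows, ["B"], ["A"])
print(a_determines_b, b_determines_a)
```

B → A fails because B = 3 appears with both A = 1 and A = 2.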

Trivial and Non-trivial Functional Dependency
• X –> Y is trivial only when Y is a subset of X.
Examples:
ABC --> AB
ABC --> A
ABC --> ABC
• X –> Y is a non-trivial functional dependency when Y is not a subset of X.
• X –> Y is called completely non-trivial when the intersection of X and Y is empty.
Examples:
Id --> Name,
Name --> DOB

Normal Forms – 1NF
• A relation that contains a composite or multi-valued attribute violates
1NF. A relation is in 1NF if every attribute in that relation is a
single-valued attribute.

Normal Forms – 2NF
• To be in second normal form, a relation must be in first normal form
and relation must not contain any partial dependency. A relation is in
2NF iff it has No Partial Dependency, i.e., no non-prime attribute
(attributes which are not part of any candidate key) is dependent on
any proper subset of any candidate key of the table.

• Partial Dependency – If a proper subset of a candidate key determines a
non-prime attribute, it is called a partial dependency.
• Example 1 – In relation STUDENT_COURSE given in Table 3,
FD set: {COURSE_NO->COURSE_NAME}
Candidate Key: {STUD_NO, COURSE_NO}

• In FD COURSE_NO->COURSE_NAME, COURSE_NO (proper subset of
candidate key) is determining COURSE_NAME (non-prime attribute).
Hence, it is a partial dependency and the relation is not in second normal
form.
• To convert it to second normal form, we decompose the relation
STUDENT_COURSE (STUD_NO, COURSE_NO, COURSE_NAME) as:

STUDENT_COURSE (STUD_NO, COURSE_NO)
COURSE (COURSE_NO, COURSE_NAME)

Normal Forms – 3NF
• A relation is in third normal form if it is in second normal form and
has no transitive dependency for non-prime attributes.
A relation is in 3NF iff at least one of the following conditions holds for
every non-trivial functional dependency X –> Y:
• X is a super key.
• Y is a prime attribute (each element of Y is part of some candidate key).

• Transitive dependency – If A->B and B->C are two FDs, then A->C is called a transitive
dependency.
• Example 1 – In relation STUDENT given in Table 4,
FD set: {STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_NO -> STUD_COUNTRY,
STUD_NO -> STUD_AGE, STUD_STATE -> STUD_COUNTRY}
Candidate Key: {STUD_NO}
• For this relation in Table 4, STUD_NO -> STUD_STATE and STUD_STATE ->
STUD_COUNTRY are true. So STUD_COUNTRY is transitively dependent on
STUD_NO. It violates third normal form.
• To convert it to third normal form, we decompose the relation
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE,
STUD_COUNTRY, STUD_AGE) as:

STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STATE, COUNTRY)

Normal Forms – BCNF
• A relation R is in BCNF if R is in Third Normal Form and for every FD,
the LHS is a super key.

• A relation is in BCNF iff in every non-trivial functional dependency
X –> Y, X is a super key.

• For example, in a student relation with attributes Stu_ID, Stu_Name, City
and Zip, Stu_ID is the key and the only prime attribute.

• City can be determined by Stu_ID as well as by Zip itself. Zip is not a
superkey and City is not a prime attribute. Additionally,
Stu_ID → Zip → City, so there is a transitive dependency.

• To bring this relation into third normal form, we break the relation into two
relations as follows:
Student_Detail (Stu_ID, Stu_Name, Zip)
ZipCodes (Zip, City)

• Stu_ID is the super key in the relation Student_Detail and Zip
is the super key in the relation ZipCodes. So,
Stu_ID → Stu_Name, Zip
and Zip → City, which confirms that both relations are in BCNF.

Multivalued Dependency
• A multivalued dependency occurs when two or more independent relations
are kept in a single relation; that is, when the presence of one or
more rows in a table implies the presence of one or more other rows
in that same table. Put another way, two attributes (or columns) in a
table are independent of one another, but both depend on a third
attribute. A multivalued dependency always requires at least three
attributes because it consists of at least two attributes that are
dependent on a third.

• In the example table, students Amit and Akash are interested in more
than one activity.
• This is a multivalued dependency because the CourseDiscipline values of a student are
independent of the Activities values, but both depend on the student.
• Therefore, the multivalued dependencies are:
StudentName ->-> CourseDiscipline
StudentName ->-> Activities

• The above relation violates Fourth Normal Form.

• To correct it, divide the table into two separate tables, one per
multivalued dependency:

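This removes the non-trivial multivalued dependencies. Each table now holds
one independent fact about a student:
StudentName ->-> CourseDiscipline (in the first table)
StudentName ->-> Activities (in the second table)
These are not functional dependencies, since a student can have more than one
activity; each is a trivial multivalued dependency within its own two-column table.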

Normal Forms – 4NF
• A relation R is in 4NF if and only if the following conditions are
satisfied:
1. It is in Boyce-Codd Normal Form (BCNF).
2. It has no non-trivial multivalued dependency whose left-hand side is not a super key.

Join Dependency
• Join dependency is a further generalization of multivalued
dependency. If the join of R1 and R2 over C is equal to relation R,
then we say that a join dependency (JD) exists, where R1(A, B, C) and
R2(C, D) are a decomposition of a given relation R(A, B, C, D).
Equivalently, R1 and R2 are a lossless decomposition of R.

Normal Forms – 5NF
• A relation R is in 5NF if and only if it satisfies the following conditions:
1. R is already in 4NF.
2. It cannot be losslessly decomposed any further (join dependency).

• Example – If a company makes a product, an agent is an agent for that
company, and the agent sells that product, then the agent sells that
product for the company.

AGENT  COMPANY  PRODUCT
A1     PQR      Nut
A1     PQR      Bolt
A1     XYZ      Nut
A1     XYZ      Bolt
A2     PQR      Nut

AGENT  COMPANY
A1     PQR
A1     XYZ
A2     PQR

AGENT  PRODUCT
A1     Nut
A1     Bolt
A2     Nut

COMPANY  PRODUCT
PQR      Nut
PQR      Bolt
XYZ      Nut
XYZ      Bolt
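
Whether the three projections really reconstruct the original table can be checked by joining them back together. The sketch below does this with plain Python sets; the variable names are hypothetical. It also shows that joining only two of the projections is not enough.

```python
# The original AGENT / COMPANY / PRODUCT table as a set of tuples.
R = {
    ("A1", "PQR", "Nut"), ("A1", "PQR", "Bolt"),
    ("A1", "XYZ", "Nut"), ("A1", "XYZ", "Bolt"),
    ("A2", "PQR", "Nut"),
}

# The three binary projections shown above.
agent_company = {(a, c) for a, c, p in R}
agent_product = {(a, p) for a, c, p in R}
company_product = {(c, p) for a, c, p in R}

# Natural join of AGENT-COMPANY with COMPANY-PRODUCT on COMPANY only.
two_way = {
    (a, c, p)
    for (a, c) in agent_company
    for (c2, p) in company_product
    if c == c2
}

# Joining the third projection as well keeps only tuples whose
# (agent, product) pair also appears in AGENT-PRODUCT.
three_way = {(a, c, p) for (a, c, p) in two_way if (a, p) in agent_product}

spurious = two_way - R
print(sorted(spurious))
print(three_way == R)
```

The two-way join produces the spurious tuple (A2, PQR, Bolt): A2 represents PQR and PQR makes bolts, but A2 does not sell bolts. The third projection filters it out, so the decomposition into three tables is lossless on this instance while a decomposition into two of them is not.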