Database Management System (22CSH-242)
Lecture 3: Relational Algebra
Unit-1 Syllabus
Unit-1 Introduction to Databases and Relational Algebra
Databases: Overview of Database concepts, DBMS, Data Base System Architecture
(Three Level ANSI-SPARC Architecture), Advantages and Disadvantages of
DBMS, Data Independence, DBA and Responsibilities of DBA,
Relational Data Structure, Keys, Relations, Attributes, Schema and
Instances, Referential integrity, Entity integrity.
Data Models: Relational Model, Network Model, Hierarchical Model, ER Model:
Design, issues, Mapping constraints, ER diagram, Comparison of
Models.
Relational Algebra & Relational Calculus: Introduction, Syntax, Semantics,
Additional operators, Grouping and Ungrouping, Relational comparisons,
Tuple Calculus, Domain Calculus, Calculus Vs Algebra, Computational
capabilities.
Roadmap
Relational Algebra
Join Operations
Grouping
Relational Calculus
Introduction
• Relational algebra and relational calculus are formal
languages associated with the relational model.
• Informally, relational algebra is a (high-level)
procedural language and relational calculus a non-
procedural language.
• However, formally both are equivalent to one another.
• A language that can produce every relation derivable
using relational calculus is relationally complete.
Relational Algebra
• Relational algebra operations work on one or
more relations to define another relation without
changing the original relations.
• Both operands and results are relations, so output
from one operation can become input to another
operation.
• Allows expressions to be nested, just as in
arithmetic. This property is called closure.
Relational Algebra
• Five basic operations in relational algebra:
Selection, Projection, Cartesian product,
Union, and Set Difference.
• These perform most of the data retrieval
operations needed.
• Also have Join, Intersection, and Division
operations, which can be expressed in terms of
5 basic operations.
Relational Algebra Operations
Selection (or Restriction)
• σ_{predicate}(R)
– Works on a single relation R and defines a relation
that contains only those tuples (rows) of R that
satisfy the specified condition (predicate).
Example - Selection (or Restriction)
• List all staff with a salary greater than
£10,000.
σ_{salary > 10000}(Staff)
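The selection above can be sketched in Python as a filter over rows. This is a minimal illustration, not a DBMS implementation: the `select` helper and the sample Staff rows are hypothetical.

```python
# Hypothetical Staff rows; names and salaries are made up for illustration.
staff = [
    {"staffNo": "S1", "lName": "Adams", "salary": 30000},
    {"staffNo": "S2", "lName": "Baker", "salary": 9000},
    {"staffNo": "S3", "lName": "Clark", "salary": 12000},
]


def select(relation, predicate):
    """Return a new list with the rows of `relation` that satisfy `predicate`."""
    return [row for row in relation if predicate(row)]


# Corresponds to selecting salary > 10000 from Staff.
high_paid = select(staff, lambda row: row["salary"] > 10000)
print([row["staffNo"] for row in high_paid])
```

The original `staff` list is left unchanged, matching the rule that operations define a new relation.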
Projection
• Π_{col1, ..., coln}(R)
– Works on a single relation R and defines a
relation that contains a vertical subset of R,
extracting the values of specified attributes and
eliminating duplicates.
Example - Projection
• Produce a list of salaries for all staff,
showing only staffNo, fName, lName, and
salary details.
Π_{staffNo, fName, lName, salary}(Staff)
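A small Python sketch of projection, showing the duplicate elimination that the definition calls for. The `project` helper and the rows are hypothetical; projecting on position and branchNo (rather than on the key staffNo) is chosen here only so that duplicates actually occur.

```python
# Hypothetical Staff rows, for illustration only.
staff = [
    {"staffNo": "S1", "position": "Manager", "branchNo": "B1"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B1"},
    {"staffNo": "S3", "position": "Assistant", "branchNo": "B1"},
]


def project(relation, attributes):
    """Keep only `attributes` and drop duplicate rows (set semantics)."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:  # eliminate duplicates
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


positions = project(staff, ["position", "branchNo"])
print(positions)
```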
Union
• R ∪ S
– Union of two relations R and S defines a relation
that contains all the tuples of R, or S, or both R
and S, duplicate tuples being eliminated.
– R and S must be union-compatible.
• If R and S have I and J tuples, respectively,
union is obtained by concatenating them into
one relation with a maximum of (I + J) tuples.
Example - Union
• List all cities where there is either a branch
office or a property for rent.
Π_{city}(Branch) ∪ Π_{city}(PropertyForRent)
Set Difference
• R–S
– Defines a relation consisting of the tuples that are
in relation R, but not in S.
– R and S must be union-compatible.
Example - Set Difference
• List all cities where there is a branch office
but no properties for rent.
Π_{city}(Branch) – Π_{city}(PropertyForRent)
Intersection
• R ∩ S
– Defines a relation consisting of the set of all
tuples that are in both R and S.
– R and S must be union-compatible.
• Expressed using basic operations:
R ∩ S = R – (R – S)
Example - Intersection
• List all cities where there is both a branch
office and at least one property for rent.
Π_{city}(Branch) ∩ Π_{city}(PropertyForRent)
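Union, set difference, and intersection map directly onto Python sets. The sketch below uses made-up city values standing in for the two city projections, and checks that intersection can be derived from set difference alone.

```python
# Hypothetical single-attribute relations (city projections), for illustration.
branch_cities = {"London", "Glasgow", "Bristol"}
property_cities = {"London", "Glasgow", "Aberdeen"}

union = branch_cities | property_cities          # cities with a branch or a property
difference = branch_cities - property_cities     # branch but no property for rent
intersection = branch_cities & property_cities   # both a branch and a property

# Intersection expressed with the basic operations only: R - (R - S).
derived = branch_cities - (branch_cities - property_cities)

print(sorted(union), sorted(difference), sorted(intersection))
```

With 3 tuples in each operand, the union here has 4 tuples, within the maximum of 3 + 3.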
Cartesian product
• R × S
– Defines a relation that is the concatenation of
every tuple of relation R with every tuple of
relation S.
Example - Cartesian product
• List the names and comments of all clients who have
viewed a property for rent.
(Π_{clientNo, fName, lName}(Client)) × (Π_{clientNo, propertyNo, comment}(Viewing))
Example - Cartesian product and Selection
• Use selection operation to extract those tuples where
Client.clientNo = Viewing.clientNo.
σ_{Client.clientNo = Viewing.clientNo}((Π_{clientNo, fName, lName}(Client)) × (Π_{clientNo, propertyNo, comment}(Viewing)))
Cartesian product and Selection can be reduced to a single operation called a
Join.
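The two steps (Cartesian product, then selection on matching clientNo) can be sketched with `itertools.product`. The tuples below are hypothetical and use a shortened schema.

```python
from itertools import product

# Hypothetical data. client: (clientNo, fName); viewing: (clientNo, propertyNo, comment)
client = [("C1", "Ann"), ("C2", "Bob")]
viewing = [("C1", "P4", "too small"), ("C1", "P9", "no garden")]

# Cartesian product: concatenate every client tuple with every viewing tuple.
cartesian = [c + v for c, v in product(client, viewing)]

# Selection: keep only tuples where Client.clientNo = Viewing.clientNo.
joined = [row for row in cartesian if row[0] == row[2]]

print(len(cartesian), joined)
```

The product has 2 × 2 = 4 tuples, of which only those pairing a client with their own viewings survive the selection.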
Join Operations
• Join is a derivative of Cartesian product.
• Equivalent to performing a Selection, using
join predicate as selection formula, over
Cartesian product of the two operand
relations.
• One of the most difficult operations to
implement efficiently in an RDBMS and one
reason why RDBMSs have intrinsic
performance problems.
Join Operations
• Various forms of join operation
– Theta join
– Equijoin (a particular type of Theta join)
– Natural join
– Outer join
– Semijoin
Theta join (θ-join)
• R ⋈_F S
– Defines a relation that contains tuples
satisfying the predicate F from the Cartesian
product of R and S.
– The predicate F is of the form R.ai θ S.bi where
θ may be one of the comparison operators (<,
≤, >, ≥, =, ≠).
Theta join (θ-join)
• Can rewrite Theta join using basic Selection
and Cartesian product operations.
R ⋈_F S = σ_F(R × S)
• Degree of a Theta join is sum of degrees of
the operand relations R and S.
• If predicate F contains only equality (=), the term
Equijoin is used.
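A Theta join with a comparison other than equality can be sketched as below. The `theta_join` helper, the attribute positions, and the client budget data are all hypothetical; they are not part of the lecture's example schema.

```python
import operator


def theta_join(r, s, i, theta, j):
    """Concatenate each tuple of r with each tuple of s where r[i] theta s[j] holds."""
    return [x + y for x in r for y in s if theta(x[i], y[j])]


# Hypothetical data. client: (clientNo, maxRent); prop: (propertyNo, rent)
client = [("C1", 400), ("C2", 600)]
prop = [("P1", 350), ("P2", 500)]

# Join predicate: Client.maxRent >= Property.rent
affordable = theta_join(client, prop, 1, operator.ge, 1)
print(affordable)
```

Each result tuple has 2 + 2 = 4 attributes, the sum of the operand degrees. Passing `operator.eq` instead would make this an Equijoin.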
Example - Equijoin
• List the names and comments of all clients
who have viewed a property for rent.
(Π_{clientNo, fName, lName}(Client)) ⋈_{Client.clientNo = Viewing.clientNo} (Π_{clientNo, propertyNo, comment}(Viewing))
Natural join
• R ⋈ S
– An Equijoin of the two relations R and S over all
common attributes x. One occurrence of each
common attribute is eliminated from the result.
Example - Natural join
• List the names and comments of all clients
who have viewed a property for rent.
(Π_{clientNo, fName, lName}(Client)) ⋈ (Π_{clientNo, propertyNo, comment}(Viewing))
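A minimal sketch of a natural join over rows stored as dicts, where shared attribute names play the role of the common attributes. The `natural_join` helper and the data are hypothetical.

```python
def natural_join(r, s):
    """Equijoin r and s on all shared attribute names, keeping one copy of each."""
    result = []
    for x in r:
        for y in s:
            common = x.keys() & y.keys()
            if all(x[a] == y[a] for a in common):
                # Merging the dicts keeps a single occurrence of each common attribute.
                result.append({**x, **y})
    return result


# Hypothetical rows, for illustration only.
client = [
    {"clientNo": "C1", "fName": "Ann"},
    {"clientNo": "C2", "fName": "Bob"},
]
viewing = [{"clientNo": "C1", "propertyNo": "P4", "comment": "too small"}]

print(natural_join(client, viewing))
```

Client C2 has no viewing, so it does not appear in the result.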
Outer join
• To display rows in the result that do not have
matching values in the join column, use Outer
join.
• R ⟕ S
– (Left) outer join is join in which tuples from R
that do not have matching values in common
columns of S are also included in result
relation.
Example - Left Outer join
• Produce a status report on property
viewings.
Π_{propertyNo, street, city}(PropertyForRent) ⟕ Viewing
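A left outer join can be sketched as a natural join that pads unmatched left rows with nulls (`None` here). The helper name, its `s_attrs` parameter, and the rows are hypothetical.

```python
def left_outer_join(r, s, s_attrs):
    """Natural join of r and s that also keeps unmatched rows of r, padded with None."""
    result = []
    for x in r:
        matches = [
            y for y in s
            if all(x[a] == y[a] for a in x.keys() & y.keys())
        ]
        if matches:
            result.extend({**x, **y} for y in matches)
        else:
            # No partner in s: fill the attributes that only s has with None.
            padding = {a: None for a in s_attrs if a not in x}
            result.append({**x, **padding})
    return result


# Hypothetical rows with a shortened schema.
properties = [
    {"propertyNo": "P4", "city": "Glasgow"},
    {"propertyNo": "P9", "city": "London"},
]
viewing = [{"propertyNo": "P4", "clientNo": "C1"}]

report = left_outer_join(properties, viewing, ["propertyNo", "clientNo"])
print(report)
```

Property P9 has never been viewed, yet it still appears in the report with a null clientNo.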
Semijoin
• R ⋉_F S
– Defines a relation that contains the tuples of R that
participate in the join of R with S.
• Can rewrite Semijoin using Projection and Join:
R ⋉_F S = Π_A(R ⋈_F S), where A is the set of all attributes of R.
Example - Semijoin
• List complete details of all staff who work at the
branch in Glasgow.
Staff ⋉_{Staff.branchNo = Branch.branchNo} (σ_{city = ‘Glasgow’}(Branch))
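The semijoin above can be sketched as a membership test: keep the Staff tuples whose branchNo appears among the Glasgow branches. The data and the shortened schemas are hypothetical.

```python
# Hypothetical data. staff: (staffNo, branchNo); branch: (branchNo, city)
staff = [("S1", "B3"), ("S2", "B5"), ("S3", "B3")]
branch = [("B3", "Glasgow"), ("B5", "London")]

# Selection on Branch: city = 'Glasgow'
glasgow_branches = {branch_no for branch_no, city in branch if city == "Glasgow"}

# Semijoin: Staff tuples that participate in the join, with only Staff attributes kept.
glasgow_staff = [s for s in staff if s[1] in glasgow_branches]
print(glasgow_staff)
```

No Branch attributes appear in the result, which is what distinguishes a semijoin from a full join.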
Division
• R ÷ S
– Defines a relation over the attributes C (the attributes
of R that are not attributes of S) that consists of the set of
tuples from R that match the combination of every tuple in S.
• Expressed using basic operations:
T1 ← Π_C(R)
T2 ← Π_C((S × T1) – R)
T ← T1 – T2
Example - Division
• Identify all clients who have viewed all
properties with three rooms.
(Π_{clientNo, propertyNo}(Viewing)) ÷ (Π_{propertyNo}(σ_{rooms = 3}(PropertyForRent)))
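The three-step definition of division (T1, T2, T) can be traced with Python sets. The viewing pairs and the set of three-room properties below are made up for illustration.

```python
# Hypothetical data. r: (clientNo, propertyNo) pairs from Viewing;
# s: propertyNo values of the three-room properties.
r = {
    ("C1", "P1"), ("C1", "P2"),
    ("C2", "P1"),
    ("C3", "P1"), ("C3", "P2"), ("C3", "P5"),
}
s = {"P1", "P2"}

# T1: every client that appears in r.
t1 = {client for client, _ in r}

# T2: clients for which some (client, property) combination is missing from r.
t2 = {client for client in t1 for prop in s if (client, prop) not in r}

# T: clients that match every tuple of s.
t = t1 - t2
print(sorted(t))
```

C2 has viewed only P1, so it lands in T2 and is removed. C3's extra viewing of P5 does not matter.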
Aggregate Operations
• ℑ_{AL}(R)
– Applies aggregate function list, AL, to R to define
a relation over the aggregate list.
– AL contains one or more (<aggregate_function>,
<attribute>) pairs.
• Main aggregate functions are: COUNT, SUM,
AVG, MIN, and MAX.
Example – Aggregate Operations
• How many properties cost more than £350 per
month to rent?
ρ_{R(myCount)} ℑ_{COUNT propertyNo}(σ_{rent > 350}(PropertyForRent))
Grouping Operation
• _{GA}ℑ_{AL}(R)
– Groups tuples of R by grouping attributes, GA,
and then applies aggregate function list, AL, to
define a new relation.
– AL contains one or more (<aggregate_function>,
<attribute>) pairs.
– Resulting relation contains the grouping
attributes, GA, along with results of each of the
aggregate functions.
Example – Grouping Operation
• Find the number of staff working in each branch
and the sum of their salaries.
ρ_{R(branchNo, myCount, mySum)} _{branchNo}ℑ_{COUNT staffNo, SUM salary}(Staff)
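The grouping operation can be sketched with a dictionary keyed by the grouping attribute. The Staff tuples are hypothetical; counting the salaries in each group stands in for COUNT staffNo, which gives the same number because each tuple carries exactly one staffNo.

```python
from collections import defaultdict

# Hypothetical data. staff: (staffNo, branchNo, salary)
staff = [("S1", "B3", 12000), ("S2", "B5", 30000), ("S3", "B3", 18000)]

# Group tuples by the grouping attribute, branchNo.
groups = defaultdict(list)
for staff_no, branch_no, salary in staff:
    groups[branch_no].append(salary)

# Apply the aggregate list (COUNT, SUM) to each group.
result = sorted(
    (branch_no, len(salaries), sum(salaries))
    for branch_no, salaries in groups.items()
)
print(result)  # rows of (branchNo, myCount, mySum)
```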
Relational Calculus
• Relational calculus query specifies what is to be
retrieved rather than how to retrieve it.
– No description of how to evaluate a query.
• In first-order logic (or predicate calculus), predicate
is a truth-valued function with arguments.
• When we substitute values for the arguments,
function yields an expression, called a proposition,
which can be either true or false.
Relational Calculus
• If predicate contains a variable (e.g. ‘x is a
member of staff’), there must be a range for x.
• When we substitute some values of this range
for x, proposition may be true; for other values, it
may be false.
• When applied to databases, relational calculus
has two forms: tuple and domain.
Tuple Relational Calculus
• Interested in finding tuples for which a predicate is
true. Based on use of tuple variables.
• Tuple variable is a variable that ‘ranges over’ a named
relation: i.e., variable whose only permitted values are
tuples of the relation.
• Specify range of a tuple variable S as the Staff relation
as:
Staff(S)
• To find set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
• To find details of all staff earning more than
£10,000:
{S | Staff(S) ∧ S.salary > 10000}
• To find a particular attribute, such as salary,
write:
{S.salary | Staff(S) ∧ S.salary > 10000}
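Python set comprehensions read almost like tuple relational calculus: the `for S in staff` clause plays the role of the range Staff(S), and the `if` clause is the predicate. The namedtuple schema and rows below are hypothetical.

```python
from collections import namedtuple

# Hypothetical, shortened Staff schema and rows.
Staff = namedtuple("Staff", "staffNo lName salary")
staff = {Staff("S1", "Adams", 30000), Staff("S2", "Baker", 9000)}

# All staff tuples S with S.salary > 10000
details = {S for S in staff if S.salary > 10000}

# Only the salary attribute of those tuples
salaries = {S.salary for S in staff if S.salary > 10000}

print(details, salaries)
```

As in the calculus, the comprehension states which tuples are wanted and leaves the evaluation order to the interpreter.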
Tuple Relational Calculus
• Can use two quantifiers to tell how many instances
the predicate applies to:
– Existential quantifier ∃ (‘there exists’)
– Universal quantifier ∀ (‘for all’)
• Tuple variables qualified by ∀ or ∃ are called
bound variables; otherwise they are called free variables.
Tuple Relational Calculus
• Existential quantifier used in formulae that must
be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = ‘London’)
• Means ‘There exists a Branch tuple with
same branchNo as the branchNo of the
current Staff tuple, S, and is located in
London’.
Tuple Relational Calculus
• Universal quantifier is used in statements about
every instance, such as:
(∀B)(B.city ≠ ‘Paris’)
• Means ‘For all Branch tuples, the address is not in
Paris’.
• Can also use ~(∃B)(B.city = ‘Paris’) which means
‘There are no branches with an address in Paris’.
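The equivalence between the universal form and the negated existential form can be checked with Python's `all` and `any`. The function names and city list are hypothetical.

```python
def no_branch_in(city, branch_cities):
    """Universal form: every branch city differs from `city`."""
    return all(b != city for b in branch_cities)


def not_exists_branch_in(city, branch_cities):
    """Negated existential form: there is no branch city equal to `city`."""
    return not any(b == city for b in branch_cities)


# Hypothetical Branch city values.
branch_cities = ["London", "Glasgow", "Bristol"]

for city in ("Paris", "London"):
    print(city, no_branch_in(city, branch_cities),
          not_exists_branch_in(city, branch_cities))
```

Both functions agree for any input, since "for all B, not P(B)" and "not (there exists B with P(B))" are the same statement.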
Tuple Relational Calculus
• Formulae should be unambiguous and make sense.
• A (well-formed) formula is made out of atoms:
• R(Si), where Si is a tuple variable and R is a relation
• Si.a1 θ Sj.a2, where θ is a comparison operator
• Si.a1 θ c, where c is a constant
• Can recursively build up formulae from atoms:
• An atom is a formula
• If F1 and F2 are formulae, so are their conjunction, F1 ∧ F2;
disjunction, F1 ∨ F2; and negation, ~F1
• If F is a formula with free variable X, then (∃X)(F) and (∀X)(F)
are also formulae.
Example - Tuple Relational Calculus
• List the names of all managers who earn more
than £25,000.
{S.fName, S.lName | Staff(S) ∧
S.position = ‘Manager’ ∧ S.salary > 25000}
• List the staff who manage properties for rent in
Glasgow.
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo =
S.staffNo) ∧ P.city = ‘Glasgow’)}
Example - Tuple Relational Calculus
• List the names of staff who currently do not
manage any properties.
{S.fName, S.lName | Staff(S) ∧ (~(∃P)
(PropertyForRent(P) ∧ (S.staffNo = P.staffNo)))}
Or
{S.fName, S.lName | Staff(S) ∧ ((∀P)(~PropertyForRent(P) ∨
~(S.staffNo = P.staffNo)))}
Example - Tuple Relational Calculus
• List the names of clients who have viewed a
property for rent in Glasgow.
{C.fName, C.lName | Client(C) ∧ ((∃V)(∃P)
(Viewing(V) ∧ PropertyForRent(P) ∧
(C.clientNo = V.clientNo) ∧
(V.propertyNo = P.propertyNo) ∧
P.city = ‘Glasgow’))}
Tuple Relational Calculus
• Expressions can generate an infinite set. For
example:
{S | ~Staff(S)}
• To avoid this, add restriction that all values in
result must be values in the domain of the
expression.
Domain Relational Calculus
• Uses variables that take values from domains
instead of tuples of relations.
• If F(d1, d2, . . . , dn) stands for a formula composed
of atoms and d1, d2, . . . , dn represent domain
variables, then:
{d1, d2, . . . , dn | F(d1, d2, . . . , dn)}
is a general domain relational calculus expression.
Example - Domain Relational Calculus
• Find the names of all managers who earn more
than £25,000.
{fN, lN | (∃sN, posn, sex, DOB, sal, bN)
(Staff(sN, fN, lN, posn, sex, DOB, sal, bN) ∧
posn = ‘Manager’ ∧ sal > 25000)}
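In domain relational calculus the variables stand for attribute values rather than whole tuples. A rough Python analogue unpacks each tuple into one variable per attribute. The Staff schema is shortened to five attributes here and the rows are made up.

```python
# Hypothetical data. staff: (sN, fN, lN, posn, sal), a shortened Staff schema.
staff = [
    ("S1", "Ann", "Adams", "Manager", 30000),
    ("S2", "Bob", "Baker", "Manager", 20000),
    ("S3", "Cy", "Clark", "Assistant", 26000),
]

# fN and lN are the result variables; sN, posn and sal are only tested,
# mirroring the existentially quantified domain variables.
names = {
    (fN, lN)
    for (sN, fN, lN, posn, sal) in staff
    if posn == "Manager" and sal > 25000
}
print(names)
```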
Example - Domain Relational Calculus
• List the staff who manage properties for rent in Glasgow.
{sN, fN, lN, posn, sex, DOB, sal, bN |
(∃sN1, cty)(Staff(sN, fN, lN, posn, sex, DOB, sal, bN) ∧
PropertyForRent(pN, st, cty, pc, typ, rms,
rnt, oN, sN1, bN1) ∧
(sN = sN1) ∧ cty = ‘Glasgow’)}
Example - Domain Relational Calculus
• List the names of staff who currently do not
manage any properties for rent.
{fN, lN | (∃sN)
(Staff(sN, fN, lN, posn, sex, DOB, sal, bN) ∧
(~(∃sN1)(PropertyForRent(pN, st, cty, pc, typ,
rms, rnt, oN, sN1, bN1) ∧ (sN = sN1))))}
Example - Domain Relational Calculus
• List the names of clients who have viewed a
property for rent in Glasgow.
{fN, lN | (∃cN, cN1, pN, pN1, cty)
(Client(cN, fN, lN, tel, pT, mR) ∧
Viewing(cN1, pN1, dt, cmt) ∧
PropertyForRent(pN, st, cty, pc, typ,
rms, rnt, oN, sN, bN) ∧
(cN = cN1) ∧ (pN = pN1) ∧ cty = ‘Glasgow’)}
Domain Relational Calculus
• When restricted to safe expressions, domain
relational calculus is equivalent to tuple
relational calculus restricted to safe expressions,
which is equivalent to relational algebra.
• Means every relational algebra expression has
an equivalent relational calculus expression, and
vice versa.