Lecture 06: Relational Algebra &
Relational Calculus
Dr. Dang Tran Khanh
Department of Information Systems
Faculty of Computer Science and Engineering
[email protected]
Outline
The COMPANY Database
Relational Algebra
– Unary Relational Operations
– Relational Algebra Operations From Set Theory
– Binary Relational Operations
– Additional Relational Operations
– Examples of Queries in Relational Algebra
Relational Calculus
– Tuple Relational Calculus
– Domain Relational Calculus
Pre-Midterm Exam (60’)
Reading Suggestion:
– [1]: Chapter 6
The ERD for the COMPANY database
Result of mapping the COMPANY ER schema into a
relational schema
Relational Algebra
The basic set of operations for the relational model is
known as the relational algebra. These operations enable a
user to specify basic retrieval requests
The result of a retrieval is a new relation, which may have
been formed from one or more relations. The algebra
operations thus produce new relations, which can be
further manipulated using operations of the same algebra
A sequence of relational algebra operations forms a
relational algebra expression, whose result will also be a
relation that represents the result of a database query (or
retrieval request)
Relational Algebra
Unary Relational Operations
SELECT Operation
The SELECT operation is used to select the subset of the tuples of a relation that
satisfy a selection condition
Example: To select the EMPLOYEE tuples whose department number is 4 or
those whose salary is greater than $30,000 the following notation is used:
σ DNO = 4 (EMPLOYEE)
σ SALARY > 30,000 (EMPLOYEE)
In general, the select operation is denoted by
σ <selection condition>(R)
where the symbol σ (sigma) is used to denote the select operator, and
the selection condition is a Boolean expression specified on the
attributes of relation R
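The SELECT operation can be sketched in Python as a filter over a list of rows. This is only an illustration: the helper name select and the sample rows are assumptions made for this example, not part of the COMPANY database figures.

```python
# Hypothetical sample rows; attribute names follow the slides, values are made up.
EMPLOYEE = [
    {"FNAME": "Ann", "DNO": 4, "SALARY": 25000},
    {"FNAME": "Bob", "DNO": 5, "SALARY": 38000},
    {"FNAME": "Cy", "DNO": 4, "SALARY": 43000},
]


def select(condition, relation):
    """sigma_<condition>(relation): keep the tuples for which condition holds."""
    return [t for t in relation if condition(t)]


# sigma DNO = 4 (EMPLOYEE)
dept4 = select(lambda t: t["DNO"] == 4, EMPLOYEE)
# sigma SALARY > 30,000 (EMPLOYEE)
high_paid = select(lambda t: t["SALARY"] > 30000, EMPLOYEE)
```

The result has the same attributes as the input; only the number of tuples changes.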
Relational Algebra
Unary Relational Operations
SELECT Operation Properties
– The SELECT operation σ <selection condition>(R) produces a
relation S that has the same schema as R
– The SELECT operation σ is commutative, i.e., a cascaded
SELECT operation may be applied in any order
σ <condition1>(σ <condition2>(R)) = σ <condition2>(σ <condition1>(R))
σ <condition1>(σ <condition2>(σ <condition3>(R)))
= σ <condition2>(σ <condition3>(σ <condition1>(R)))
– A cascaded SELECT operation may be replaced by a single
selection with a conjunction of all the conditions; i.e.,
σ <condition1>(σ <condition2>(σ <condition3>(R)))
= σ <condition1> AND <condition2> AND <condition3>(R)
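The two properties above can be checked on a small example. The sketch below assumes a made-up relation R and three made-up conditions c1, c2, c3; it shows that two different cascade orders and the single conjunctive selection return the same tuples.

```python
# Hypothetical relation and conditions, chosen only to exercise the identities.
R = [
    {"DNO": 4, "SALARY": 25000, "SEX": "F"},
    {"DNO": 4, "SALARY": 43000, "SEX": "F"},
    {"DNO": 5, "SALARY": 38000, "SEX": "M"},
    {"DNO": 4, "SALARY": 40000, "SEX": "M"},
]


def select(condition, relation):
    """sigma_<condition>(relation)."""
    return [t for t in relation if condition(t)]


def c1(t):
    return t["DNO"] == 4


def c2(t):
    return t["SALARY"] > 30000


def c3(t):
    return t["SEX"] == "F"


# Two different cascade orders.
cascade_a = select(c1, select(c2, select(c3, R)))
cascade_b = select(c2, select(c3, select(c1, R)))
# One selection with the conjunction of all conditions.
single = select(lambda t: c1(t) and c2(t) and c3(t), R)
```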
Results of SELECT and PROJECT operations
(a) σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
(b) πLNAME, FNAME, SALARY(EMPLOYEE)
(c) πSEX, SALARY(EMPLOYEE)
Relational Algebra
Unary Relational Operations
PROJECT Operation
This operation selects certain columns from the table and discards the other
columns
Example: To list each employee’s first and last name and salary, the
following is used:
π LNAME, FNAME, SALARY(EMPLOYEE)
The general form of the project operation is π <attribute list>(R), where π
(pi) is the symbol used to represent the project operation and <attribute list>
is the desired list of attributes from the attributes of relation R.
The project operation removes any duplicate tuples, so the result of the
project operation is a set of tuples
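A minimal Python sketch of PROJECT, assuming made-up rows and a hypothetical helper named project. Building a set of value tuples gives the duplicate elimination described above for free.

```python
# Hypothetical sample rows; two employees share the same (SEX, SALARY) pair.
EMPLOYEE = [
    {"FNAME": "Ann", "LNAME": "Lee", "SEX": "F", "SALARY": 25000},
    {"FNAME": "Bob", "LNAME": "Ray", "SEX": "M", "SALARY": 38000},
    {"FNAME": "Cy", "LNAME": "Fox", "SEX": "F", "SALARY": 25000},
]


def project(attributes, relation):
    """pi_<attributes>(relation): keep the listed columns, drop duplicates."""
    return {tuple(t[a] for a in attributes) for t in relation}


# pi LNAME, FNAME, SALARY (EMPLOYEE): no duplicates arise, so 3 tuples remain.
names = project(["LNAME", "FNAME", "SALARY"], EMPLOYEE)
# pi SEX, SALARY (EMPLOYEE): the duplicate ("F", 25000) collapses to one tuple.
sex_salary = project(["SEX", "SALARY"], EMPLOYEE)
```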
Relational Algebra
Unary Relational Operations
PROJECT Operation Properties
– The number of tuples in the result of projection π
<list>(R) is always less than or equal to the number of
tuples in R
– If the list of attributes includes a key of R, then the
number of tuples is equal to the number of tuples
in R
− π<list1>(π<list2>(R)) = π<list1>(R) as long as <list2>
contains the attributes in <list1>
Relational Algebra
Unary Relational Operations
Rename Operation
– We may want to apply several relational algebra operations one after the other.
Either we can write the operations as a single relational algebra expression by
nesting the operations, or we can apply one operation at a time and create
intermediate result relations. In the latter case, we must give names to the
relations that hold the intermediate results
Example: To retrieve the first name, last name, and salary of all employees who
work in department number 5, we must apply a select and a project operation. We
can write a single relational algebra expression as follows:
π FNAME, LNAME, SALARY(σ DNO=5 (EMPLOYEE))
or we can explicitly show the sequence of operations, giving a name to each
intermediate relation:
DEP5_EMPS ← σ DNO=5(EMPLOYEE)
RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)
Relational Algebra
Unary Relational Operations
Rename Operation: the general Rename operation can
be expressed by any of the following forms:
– ρS(B1, B2, …, Bn)(R) is a renamed relation S based on R
with column names B1, B2, …, Bn
– ρS(R) is a renamed relation S based on R (which does
not specify column names)
– ρ(B1, B2, …, Bn)(R) is a renamed relation with column
names B1, B2, …, Bn which does not specify a new
relation name
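The column-renaming form ρ(B1, B2, …, Bn)(R) can be sketched as a positional relabeling. The helper name rename and the sample values are assumptions for illustration; the relation name itself is just the Python variable the result is bound to.

```python
def rename(relation, new_names):
    """rho_(B1, ..., Bn)(R): relabel the columns of R positionally."""
    return [dict(zip(new_names, t.values())) for t in relation]


# Hypothetical one-column relation holding supervisor numbers.
R = [{"SUPERSSN": "111"}, {"SUPERSSN": "222"}]

# rho S(SSN)(R): the same tuples, now under the column name SSN.
S = rename(R, ["SSN"])
```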
Relational Algebra
Relational Algebra Operations From Set Theory
UNION Operation
The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that
are either in R or in S or in both R and S. Duplicate tuples are eliminated.
Example: To retrieve the social security numbers of all employees who either work in
department 5 or directly supervise an employee who works in department 5, we can use
the union operation as follows:
DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT1 ← π SSN(DEP5_EMPS)
RESULT2(SSN) ← π SUPERSSN(DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
The union operation produces the tuples that are in either RESULT1 or RESULT2 or
both. The two operands must be “type compatible”
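The four steps of the UNION example map directly onto Python sets. The rows below are made up; only the attribute names come from the slides.

```python
# Hypothetical sample rows.
EMPLOYEE = [
    {"SSN": "1", "SUPERSSN": "2", "DNO": 5},
    {"SSN": "2", "SUPERSSN": "4", "DNO": 5},
    {"SSN": "3", "SUPERSSN": "4", "DNO": 4},
    {"SSN": "4", "SUPERSSN": None, "DNO": 1},
]

# DEP5_EMPS <- sigma DNO=5 (EMPLOYEE)
dep5_emps = [t for t in EMPLOYEE if t["DNO"] == 5]
# RESULT1 <- pi SSN (DEP5_EMPS)
result1 = {t["SSN"] for t in dep5_emps}
# RESULT2(SSN) <- pi SUPERSSN (DEP5_EMPS)
result2 = {t["SUPERSSN"] for t in dep5_emps}
# RESULT <- RESULT1 U RESULT2; the set union removes the duplicate "2".
result = result1 | result2
```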
Relational Algebra
Relational Algebra Operations From Set Theory
Type Compatibility
– The operand relations R1(A1, A2, ..., An) and R2(B1,
B2, ..., Bn) must have the same number of
attributes, and the domains of corresponding
attributes must be compatible; that is,
dom(Ai)=dom(Bi) for i=1, 2, ..., n
– The resulting relation for R1 ∪ R2, R1 ∩ R2, or R1 − R2
has the same attribute names as the first
operand relation R1 (by convention)
The set operations UNION, INTERSECTION, and MINUS
(a) Two union-compatible relations
(b) STUDENT ∪ INSTRUCTOR
(c) STUDENT ∩ INSTRUCTOR
(d) STUDENT – INSTRUCTOR
(e) INSTRUCTOR – STUDENT
Relational Algebra
Relational Algebra Operations From Set Theory
Notice that both union and intersection are commutative
operations; that is
R ∪ S = S ∪ R, and R ∩ S = S ∩ R
Both union and intersection can be treated as n-ary
operations applicable to any number of relations as both
are associative operations; that is
R ∪ (S ∪ T) = (R ∪ S) ∪ T, and (R ∩ S) ∩ T = R ∩ (S ∩ T)
The minus operation is not commutative; that is, in
general
R − S ≠ S − R
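These properties can be spot-checked with Python's built-in sets, where |, & and - play the roles of ∪, ∩ and −. The three sets are arbitrary values picked for the check.

```python
# Arbitrary sample sets standing in for type-compatible relations.
R, S, T = {1, 2, 3}, {3, 4}, {4, 5}

union_commutes = (R | S) == (S | R)
intersection_commutes = (R & S) == (S & R)
union_associates = (R | (S | T)) == ((R | S) | T)
intersection_associates = ((R & S) & T) == (R & (S & T))
# MINUS is not commutative: here R - S is {1, 2} while S - R is {4}.
minus_commutes = (R - S) == (S - R)
```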
Relational Algebra
Relational Algebra Operations From Set Theory
CARTESIAN product (or cross product, cross join)
– This operation is used to combine tuples from two relations: the result of
R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm) is a relation Q of degree n + m with
attributes Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order. The resulting
relation Q has one tuple for each combination of tuples, one from R and one from S
– Hence, if R has nR tuples (denoted as |R| = nR) and S has nS tuples, then
R x S will have nR * nS tuples
– The two operands do NOT have to be “type compatible”
Example: retrieve a list of names of each female employee’s dependents
FEMALE_EMPS ← σ SEX=’F’(EMPLOYEE)
EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES x DEPENDENT
ACTUAL_DEPENDENTS ← σ SSN=ESSN(EMP_DEPENDENTS)
RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPENDENTS)
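The five steps of this example can be followed in Python with itertools.product as the Cartesian product. The rows are made up for illustration, and merging the two dictionaries assumes the relations share no attribute names.

```python
from itertools import product

# Hypothetical sample rows.
EMPLOYEE = [
    {"FNAME": "Ann", "LNAME": "Lee", "SSN": "1", "SEX": "F"},
    {"FNAME": "Bob", "LNAME": "Ray", "SSN": "2", "SEX": "M"},
    {"FNAME": "Cy", "LNAME": "Fox", "SSN": "3", "SEX": "F"},
]
DEPENDENT = [
    {"ESSN": "1", "DEPENDENT_NAME": "Dee"},
    {"ESSN": "2", "DEPENDENT_NAME": "Eli"},
    {"ESSN": "1", "DEPENDENT_NAME": "Flo"},
]

female_emps = [t for t in EMPLOYEE if t["SEX"] == "F"]
empnames = [{k: t[k] for k in ("FNAME", "LNAME", "SSN")} for t in female_emps]
# EMPNAMES x DEPENDENT: every employee tuple paired with every dependent tuple.
emp_dependents = [{**e, **d} for e, d in product(empnames, DEPENDENT)]
# Keep only the combinations where the dependent belongs to the employee.
actual_dependents = [t for t in emp_dependents if t["SSN"] == t["ESSN"]]
result = sorted(
    {(t["FNAME"], t["LNAME"], t["DEPENDENT_NAME"]) for t in actual_dependents}
)
```

Note that the product alone has 2 * 3 = 6 tuples, most of them meaningless pairings; the selection SSN=ESSN is what makes the result useful.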
Relational Algebra
Binary Relational Operations
JOIN Operation
– A Cartesian product followed by a select is used so commonly to identify
and select related tuples from two relations that a special operation,
called JOIN, was created for it. It is denoted by ⋈
– This operation is very important for any relational database
with more than a single relation, because it allows us to
process relationships among relations.
– The general form of a join operation on two relations R(A1,
A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R ⋈<join condition> S
where R and S can be any relations that result from general
relational algebra expressions
Relational Algebra
Binary Relational Operations
Example: Suppose that we want to retrieve the name of
the manager of each department. To get the manager’s
name, we need to combine each DEPARTMENT tuple
with the EMPLOYEE tuple whose SSN value matches the
MGRSSN value in the department tuple. We do this by
using the join operation.
DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE
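A sketch of the DEPT_MGR join in Python. The helper name theta_join and the sample rows are assumptions for this example; the nested loop is the simplest possible evaluation strategy, not how a real DBMS would execute a join.

```python
def theta_join(r, s, condition):
    """Join of r and s: combined tuples (tr, ts) that satisfy the condition.

    Assumes r and s have no attribute names in common.
    """
    return [{**tr, **ts} for tr in r for ts in s if condition(tr, ts)]


# Hypothetical sample rows.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "2"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "3"},
]
EMPLOYEE = [
    {"FNAME": "Ann", "SSN": "1"},
    {"FNAME": "Bob", "SSN": "2"},
    {"FNAME": "Cy", "SSN": "3"},
]

# DEPT_MGR <- DEPARTMENT join(MGRSSN=SSN) EMPLOYEE
dept_mgr = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["MGRSSN"] == e["SSN"])
managers = [(t["DNAME"], t["FNAME"]) for t in dept_mgr]
```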
Relational Algebra
Binary Relational Operations
EQUIJOIN Operation
The most common use of join involves join conditions with equality comparisons only.
Such a join, where the only comparison operator used is =, is called an EQUIJOIN. In
the result of an EQUIJOIN we always have one or more pairs of attributes (whose
names need not be identical) that have identical values in every tuple
The JOIN seen in the previous example was EQUIJOIN
NATURAL JOIN Operation
Because one of each pair of attributes with identical values is superfluous, a new
operation called natural join—denoted by *—was created to get rid of the second
(superfluous) attribute in an EQUIJOIN condition
The standard definition of natural join requires that the two join attributes, or each pair
of corresponding join attributes, have the same name in both relations. If this is not the
case, a renaming operation is applied first
More discussions & Examples: homework !!
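As a starting point for the homework, natural join can be sketched as an equijoin on all attributes that the two relations have in common, keeping one copy of each. The helper name natural_join and the rows are assumptions; DEPT stands for a DEPARTMENT relation whose DNUMBER column has already been renamed to DNUM.

```python
def natural_join(r, s):
    """r * s: equijoin on the shared attribute names, one copy of each kept."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [
        {**tr, **ts}
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]


# Hypothetical sample rows; both relations carry an attribute named DNUM.
PROJECT = [
    {"PNAME": "ProductX", "DNUM": 5},
    {"PNAME": "Computerization", "DNUM": 4},
]
DEPT = [
    {"DNAME": "Research", "DNUM": 5},
    {"DNAME": "Administration", "DNUM": 4},
]

proj_dept = natural_join(PROJECT, DEPT)
```

Each result tuple has DNUM only once, which is exactly the superfluous attribute that natural join removes.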
Relational Algebra
A Complete Set of Relational Algebra Operations
The set of operations {σ, π, ∪, −, X} is called a
complete set because any other relational algebra
operation can be expressed as a combination of
these five operations. For example:
R ∩ S = (R ∪ S) − ((R − S) ∪ (S − R))
R ⋈<join condition> S = σ <join condition>(R X S)
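The intersection identity can be verified step by step with Python sets. The two sets are arbitrary values chosen for the check.

```python
# Arbitrary sample sets standing in for type-compatible relations.
R = {1, 2, 3, 4}
S = {3, 4, 5}

only_r = R - S                   # {1, 2}
only_s = S - R                   # {5}
# (R U S) - ((R - S) U (S - R)) uses only union and difference.
via_identity = (R | S) - (only_r | only_s)
direct = R & S
```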
Relational Algebra
Binary Relational Operations
DIVISION Operation
– The division operation is applied to two relations R(Z) ÷ S(X),
where Z = X ∪ Y (Y is the set of attributes of R that are not
attributes of S)
– The result of DIVISION is a relation T(Y) that includes a
tuple t if tuples tR appear in R with tR[Y] = t, and with
tR[X] = tS for every tuple tS in S; i.e., for a tuple t to appear
in the result T of the DIVISION, the values in t must appear
in R in combination with every tuple in S
The DIVISION operation
(a) Dividing SSN_PNOS by SMITH_PNOS
(b) T ← R ÷ S
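DIVISION can be sketched for the common two-column case, where R holds (Y, X) pairs and S holds X values. The helper name divide and the data are assumptions; the relation names echo the figure caption, but the values are not those of the figure.

```python
def divide(r, s):
    """r: set of (y, x) pairs; s: set of x values.

    Returns every y that is paired in r with all x values of s.
    """
    candidates = {y for (y, _) in r}
    return {y for y in candidates if all((y, x) in r for x in s)}


# Hypothetical data: (employee number, project number) pairs.
SSN_PNOS = {("1", 1), ("1", 2), ("2", 1), ("3", 1), ("3", 2), ("3", 3)}
# Hypothetical set of project numbers to divide by.
SMITH_PNOS = {1, 2}

ssns = divide(SSN_PNOS, SMITH_PNOS)
```

Employee "2" is excluded because the pair ("2", 2) is missing from SSN_PNOS.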
Relational Algebra
Additional Relational Operations
Aggregate Functions and Grouping
– A type of request that cannot be expressed in the basic
relational algebra is to specify mathematical aggregate
functions on collections of values from the database
– Examples of such functions include retrieving the average
or total salary of all employees or the total number of
employee tuples
– Common functions applied to collections of numeric values
include SUM, AVERAGE, MAXIMUM, and MINIMUM. The
COUNT function is used for counting tuples or values
Relational Algebra
Additional Relational Operations
More details:
[1] Chapter 6
Relational Algebra
Additional Relational Operations
Use of the Functional operator ℱ
– ℱMAX Salary (Employee) retrieves the maximum salary value
from the Employee relation
– ℱMIN Salary (Employee) retrieves the minimum Salary value
from the Employee relation
– ℱSUM Salary (Employee) retrieves the sum of the Salary from
the Employee relation
– DNO ℱCOUNT SSN, AVERAGE Salary (Employee) groups employees by
DNO (department number) and computes the count of
employees and average salary per department
– Note: count just counts the number of rows, without
removing duplicates
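The aggregate and grouping examples above correspond to the following Python sketch. The rows are made up; the grouping step uses a dictionary keyed by DNO.

```python
from collections import defaultdict

# Hypothetical sample rows.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
]

# Aggregates over the whole relation (no grouping attribute).
max_salary = max(t["SALARY"] for t in EMPLOYEE)
min_salary = min(t["SALARY"] for t in EMPLOYEE)
sum_salary = sum(t["SALARY"] for t in EMPLOYEE)

# Grouping by DNO, then COUNT SSN and AVERAGE Salary per group.
groups = defaultdict(list)
for t in EMPLOYEE:
    groups[t["DNO"]].append(t)

per_dept = {
    dno: (len(rows), sum(r["SALARY"] for r in rows) / len(rows))
    for dno, rows in groups.items()
}
```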
Relational Algebra
Additional Relational Operations
Recursive Closure Operations
– Another type of operation that, in general, cannot be
specified in the basic original relational algebra is recursive
closure. This operation is applied to a recursive
relationship
– An example of a recursive operation is to retrieve all
SUPERVISEES of an EMPLOYEE e at all levels
– Although it is possible to retrieve employees at each level
and then take their union, we cannot, in general, specify a
query such as “retrieve the supervisees of ‘James Borg’ at
all levels” without utilizing a looping mechanism
– The SQL3 standard includes syntax for recursive closure
– Details: homework !!
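The looping mechanism mentioned above can be sketched in Python: repeat the "find direct supervisees" step level by level until no new employees turn up. The function name and the rows are assumptions for illustration.

```python
def supervisees_at_all_levels(boss_ssn, employee):
    """SSNs of everyone supervised by boss_ssn, directly or indirectly."""
    found = set()
    frontier = {boss_ssn}
    while frontier:
        # One level down: employees whose supervisor is in the current level.
        next_level = {
            t["SSN"]
            for t in employee
            if t["SUPERSSN"] in frontier and t["SSN"] not in found
        }
        found |= next_level
        frontier = next_level
    return found


# Hypothetical supervision hierarchy: 1 -> {2, 3}, 2 -> 4, 4 -> 5.
EMPLOYEE = [
    {"SSN": "1", "SUPERSSN": None},
    {"SSN": "2", "SUPERSSN": "1"},
    {"SSN": "3", "SUPERSSN": "1"},
    {"SSN": "4", "SUPERSSN": "2"},
    {"SSN": "5", "SUPERSSN": "4"},
]

all_levels = supervisees_at_all_levels("1", EMPLOYEE)
```

The number of loop iterations depends on the depth of the hierarchy in the data, which is why no fixed algebra expression can replace the loop.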
Relational Algebra
Additional Relational Operations
Outer Join, Outer Union operations:
homework !!
Relational Algebra
Examples of Queries in Relational Algebra
Q1: Retrieve the name and address of all
employees who work for the ‘Research’
department
RESEARCH_DEPT ← σ DNAME=’Research’ (DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER=DNO EMPLOYEE)
RESULT ← π FNAME, LNAME, ADDRESS (RESEARCH_EMPS)
Other examples: [1] Chapter 6
Relational Calculus
A relational calculus expression creates a new relation, which is
specified in terms of variables that range over rows of the stored
database relations (in tuple calculus) or over columns of the stored
relations (in domain calculus)
In a calculus expression, there is no order of operations to specify
how to retrieve the query result—a calculus expression specifies only
what information the result should contain. This is the main
distinguishing feature between relational algebra and relational
calculus
Relational calculus is considered to be a nonprocedural language.
This differs from relational algebra, where we must write a sequence
of operations to specify a retrieval request; hence relational algebra
can be considered as a procedural way of stating a query
Relational Calculus
Tuple Relational Calculus
The tuple relational calculus is based on specifying a number of tuple
variables. Each tuple variable usually ranges over a particular database
relation, meaning that the variable may take as its value any individual tuple
from that relation
A simple tuple relational calculus query is of the form {t | COND(t)} where t
is a tuple variable and COND (t) is a conditional expression involving t
Example: To find the first and last names of all employees whose salary is
above $50,000, we can write the following tuple calculus expression:
{t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY>50000}
The condition EMPLOYEE(t) specifies that the range relation of tuple
variable t is EMPLOYEE. The first and last name (π FNAME, LNAME) of each
EMPLOYEE tuple t that satisfies the condition t.SALARY>50000 (σ SALARY
>50000) will be retrieved
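A Python set comprehension has nearly the same shape as this tuple calculus expression: the part before "for" plays the role of the target list, the loop gives the range relation of t, and the "if" clause is the condition. The rows below are made up.

```python
# Hypothetical sample rows.
EMPLOYEE = [
    {"FNAME": "Ann", "LNAME": "Lee", "SALARY": 55000},
    {"FNAME": "Bob", "LNAME": "Ray", "SALARY": 38000},
    {"FNAME": "Cy", "LNAME": "Fox", "SALARY": 62000},
]

# {t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY > 50000}
result = {(t["FNAME"], t["LNAME"]) for t in EMPLOYEE if t["SALARY"] > 50000}
```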
Relational Calculus
The Existential and Universal Quantifiers
Two special symbols called quantifiers can
appear in formulas; these are the universal
quantifier (∀ ) and the existential quantifier (∃ )
Informally, a tuple variable t is bound if it is
quantified, meaning that it appears in an (∀ t) or
(∃ t) clause; otherwise, it is free
Relational Calculus
The Existential and Universal Quantifiers
Example 1: retrieve the name and address of all
employees who work for the ‘Research’ dept.
{t.FNAME, t.LNAME, t.ADDRESS | EMPLOYEE(t)
and ( ∃ d)
(DEPARTMENT(d) and d.DNAME=‘Research’ and
d.DNUMBER=t.DNO) }
Relational Calculus
The Existential and Universal Quantifiers
Example 2: find the names of employees who
work on all the projects controlled by department
number 5
{e.LNAME, e.FNAME | EMPLOYEE(e) and ((∀ x)
(not(PROJECT(x)) or not(x.DNUM=5)
or ((∃ w)(WORKS_ON(w) and w.ESSN=e.SSN
and x.PNUMBER=w.PNO))))}
Details: [1] Chapter 6
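In Python, any() and all() behave like ∃ and ∀ over a finite collection, so Example 2 can be sketched as below. The rows are made up, and letting x range directly over PROJECT replaces the not(PROJECT(x)) disjunct of the calculus formula.

```python
# Hypothetical sample rows.
EMPLOYEE = [{"SSN": "1", "LNAME": "Lee"}, {"SSN": "2", "LNAME": "Ray"}]
PROJECT = [
    {"PNUMBER": 1, "DNUM": 5},
    {"PNUMBER": 2, "DNUM": 5},
    {"PNUMBER": 3, "DNUM": 4},
]
WORKS_ON = [
    {"ESSN": "1", "PNO": 1},
    {"ESSN": "1", "PNO": 2},
    {"ESSN": "2", "PNO": 1},
    {"ESSN": "2", "PNO": 3},
]

# For every project x: either x is not controlled by department 5,
# or there exists a WORKS_ON tuple w linking employee e to x.
result = {
    e["LNAME"]
    for e in EMPLOYEE
    if all(
        x["DNUM"] != 5
        or any(
            w["ESSN"] == e["SSN"] and w["PNO"] == x["PNUMBER"] for w in WORKS_ON
        )
        for x in PROJECT
    )
}
```

Ray is excluded because no WORKS_ON tuple links employee "2" to project 2, which department 5 controls.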
Relational Calculus
Domain Relational Calculus
Another variation of relational calculus called the domain relational calculus,
or simply, domain calculus is equivalent to tuple calculus and to relational
algebra
QBE (Query-By-Example): see Appendix D
Domain calculus differs from tuple calculus in the type of variables used in
formulas: rather than having variables range over tuples, the variables range
over single values from domains of attributes. To form a relation of degree n
for a query result, we must have n of these domain variables—one for each
attribute
An expression of the domain calculus is of the form
{x1, x2, . . ., xn | COND(x1, x2, . . ., xn, xn+1, xn+2, . . ., xn+m)}
where x1, x2, . . ., xn, xn+1, xn+2, . . ., xn+m are domain variables that range
over domains (of attributes) and COND is a condition or formula of the
domain relational calculus
Relational Calculus
Domain Relational Calculus
Example: Retrieve the birthdate and address of
the employee whose name is ‘John B. Smith’.
{uv | (∃ q) (∃ r) (∃ s) (∃ t) (∃ w) (∃ x) (∃ y) (∃ z)
(EMPLOYEE(qrstuvwxyz) and q=’John’ and r=’B’
and s=’Smith’)}
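The domain calculus query can be mimicked in Python by unpacking each tuple into ten single-value variables, one per attribute. The attribute order assumed here (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO) and all values are assumptions made for this sketch.

```python
# Hypothetical EMPLOYEE tuples in the assumed attribute order:
# (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
EMPLOYEE = [
    ("John", "B", "Smith", "123", "1965-01-09", "Houston", "M", 30000, "333", 5),
    ("Jane", "A", "Doe", "456", "1970-02-02", "Dallas", "F", 40000, "333", 4),
]

# {uv | (exists q, r, s, t, w, x, y, z)
#       (EMPLOYEE(qrstuvwxyz) and q='John' and r='B' and s='Smith')}
result = {
    (u, v)
    for (q, r, s, t, u, v, w, x, y, z) in EMPLOYEE
    if q == "John" and r == "B" and s == "Smith"
}
```

Only u and v appear in the result; the other eight variables exist just to match the remaining positions, like the existentially quantified variables in the formula.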
Summary
Relational Algebra
– Unary Relational Operations
– Relational Algebra Operations From Set Theory
– Binary Relational Operations
– Additional Relational Operations
– Examples of Queries in Relational Algebra
Relational Calculus
– Tuple Relational Calculus
– Domain Relational Calculus
Pre-Midterm Exam (60’)
Reading Suggestion & Homework: do not forget !!
Next Lecture: (after Midterm Exam on April 09, 2006)
– SQL (Structured Query Language)
– [1]: Chapters 8, 9
Q&A