
Database Management Systems Week 3

Database Management Systems

Uploaded by

matt.0.porter
Copyright
© All Rights Reserved

Adapted for SEN2104 – DBMS

Week 3: Relational Model and Relational Algebra (Part I)

Database System Concepts


©Silberschatz, Korth and Sudarshan
Relational Model
▪ Structure of Relational Databases
▪ Fundamental Relational-Algebra Operations
▪ Additional Relational-Algebra Operations
▪ Extended Relational-Algebra Operations
▪ Null Values
▪ Modification of the Database

Example of a Relation

Attribute Types
▪ Each attribute of a relation has a name
▪ The set of allowed values for each attribute is called the domain of the
attribute
▪ Attribute values are (normally) required to be atomic; that is, indivisible
▪ E.g. the value of an attribute can be an account number,
but cannot be a set of account numbers
▪ Domain is said to be atomic if all its members are atomic
▪ The special value null is a member of every domain
▪ The null value causes complications in the definition of many operations
▪ We shall ignore the effect of null values in our main presentation
and consider their effect later

Relation Schema
▪ Formally, given domains D1, D2, …, Dn, a relation r is a subset of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di
▪ Schema of a relation consists of
▪ attribute definitions
▪ name
▪ type/domain
▪ integrity constraints
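
The formal definition can be tried out in a few lines of Python. This is only a sketch: the two domains and the helper name `is_relation_over` are hypothetical, chosen to keep the example small.

```python
from itertools import product

# Hypothetical domains, chosen only for illustration.
D1 = {"Jones", "Smith"}   # customer names
D2 = {"Main", "North"}    # street names

# D1 x D2: every possible 2-tuple (a1, a2) with a1 in D1 and a2 in D2.
cartesian = set(product(D1, D2))

# A relation r is any subset of D1 x D2.
r = {("Jones", "Main"), ("Smith", "North")}

def is_relation_over(r, *domains):
    """True if every tuple has one value per domain, drawn from that domain."""
    return all(
        len(t) == len(domains) and all(a in d for a, d in zip(t, domains))
        for t in r
    )

print(len(cartesian))               # 4
print(r <= cartesian)               # True
print(is_relation_over(r, D1, D2))  # True
```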

Relation Instance
▪ The current values (relation instance) of a relation are specified by a
table
▪ An element t of r is a tuple, represented by a row in a table
▪ Order of tuples is irrelevant (tuples may be stored in an arbitrary
order)
customer_name   customer_street   customer_city
Jones           Main              Harrison
Smith           North             Rye
Curry           North             Rye
Lindsay         Park              Pittsfield

The customer relation: the header lists the attributes (or columns); each line below it is a tuple (or row).

Database
▪ A database consists of multiple relations
▪ Information about an enterprise is broken up into parts, with each
relation storing one part of the information
▪ E.g.
account : information about accounts
depositor : which customer owns which account
customer : information about customers

Database (Cont.)

account

customer

depositor

Why Split Information Across Relations?
▪ Storing all information as a single relation such as
bank(account_number, balance, customer_name, ..)
results in

▪ repetition of information
▪ e.g., if two customers own an account (What gets repeated?)
▪ the need for null values
▪ e.g., to represent a customer without an account
▪ Normalization theory deals with how to design relational schemas
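
A small Python sketch of the repetition problem. The account number, balance and customer names are hypothetical and only serve to show what gets stored twice when two customers share one account.

```python
# Hypothetical data: account A-101 (balance 500) is owned by two customers.
# Single-relation design: bank(account_number, balance, customer_name)
bank = {
    ("A-101", 500, "Jones"),
    ("A-101", 500, "Smith"),
}

# Split design: account(account_number, balance) and
# depositor(customer_name, account_number)
account = {(acc, bal) for acc, bal, _ in bank}
depositor = {(cust, acc) for acc, _, cust in bank}

# The balance of A-101 appears once per owner in bank, but once in account.
copies_in_bank = sum(1 for acc, _, _ in bank if acc == "A-101")
copies_in_account = sum(1 for acc, _ in account if acc == "A-101")
print(copies_in_bank, copies_in_account)  # 2 1
```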

Keys
▪ Let K ⊆ R
▪ K is a superkey of R if values for K are sufficient to identify a unique tuple of
each possible relation r(R)
▪ by “possible r ” we mean a relation r that could exist in the enterprise we
are modeling.
▪ Example: {customer_name, customer_street} and
{customer_name}
are both superkeys of Customer, if no two customers can possibly have
the same name
▪ In real life, an attribute such as “customer_id” would be used
instead of “customer_name” to uniquely identify customers, but
we omit it to keep our examples small, and instead assume
customer names are unique.

Keys (Cont.)
▪ A superkey K is a candidate key if K is minimal, that is, no proper
subset of K is itself a superkey
Example: {customer_name} is a candidate key for Customer, since it
is a superkey and no proper subset of it is a superkey.
▪ Primary key: a candidate key chosen as the principal means of
identifying tuples within a relation
▪ Should choose an attribute whose value never, or very rarely,
changes.
▪ E.g. email address is unique, but may change

Example
▪ Identify possible superkeys, candidate keys and primary keys for the
given entity set below:

ssn           sid   lname    fname   telephone
432-86-4013   S1    Sharp    James   276-633-5562
579-92-2233   S2    Ross     Lily    618-369-1926
524-24-1011   S3    Adams    Abby    978-834-6780
035-20-8796   S4    Ross     James   618-369-1926
767-32-9182   S5    Briggs   Linda   704-990-3447
213-21-3748   S6    Adams    Abby    303-693-6395
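
One way to explore the exercise is to test which attribute sets are unique in the six rows shown. This is a sketch, not a solution: a key is a property of every possible relation, so an instance can only rule attribute sets out, never prove that one is a key. The helper name `is_unique` is hypothetical.

```python
from itertools import combinations

attrs = ("ssn", "sid", "lname", "fname", "telephone")
rows = [
    ("432-86-4013", "S1", "Sharp", "James", "276-633-5562"),
    ("579-92-2233", "S2", "Ross", "Lily", "618-369-1926"),
    ("524-24-1011", "S3", "Adams", "Abby", "978-834-6780"),
    ("035-20-8796", "S4", "Ross", "James", "618-369-1926"),
    ("767-32-9182", "S5", "Briggs", "Linda", "704-990-3447"),
    ("213-21-3748", "S6", "Adams", "Abby", "303-693-6395"),
]

def is_unique(key):
    """True if no two rows of this instance agree on every attribute in key."""
    idx = [attrs.index(a) for a in key]
    return len({tuple(row[i] for i in idx) for row in rows}) == len(rows)

# Attribute sets that identify every row of this instance.
unique_sets = [
    set(c)
    for n in range(1, len(attrs) + 1)
    for c in combinations(attrs, n)
    if is_unique(c)
]

# Minimal ones: no proper subset is also unique in this instance.
minimal = [k for k in unique_sets if not any(o < k for o in unique_sets)]
print([sorted(k) for k in minimal])
# [['ssn'], ['sid'], ['fname', 'telephone']]
```

Note that {fname, telephone} is unique here only by coincidence of the sample data; nothing prevents two people with the same first name from sharing a telephone number, so it would be a poor choice of key for the enterprise.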

Foreign Keys
▪ A relation schema may have an attribute that corresponds to the primary
key of another relation. The attribute is called a foreign key.
▪ E.g. customer_name and account_number attributes of depositor are
foreign keys to customer and account respectively.
▪ Only values occurring in the primary key attribute of the referenced
relation may occur in the foreign key attribute of the referencing
relation.
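
The foreign-key rule can be checked mechanically. In the sketch below the tuples and the helper name `dangling` are hypothetical; the ("Hayes", "A-999") tuple is included on purpose to show a violation.

```python
# Hypothetical instances; only the key values matter for this check.
customer_names = {"Jones", "Smith", "Curry"}   # primary key of customer
account_numbers = {"A-101", "A-215"}           # primary key of account

# depositor(customer_name, account_number)
depositor = {("Jones", "A-101"), ("Smith", "A-215"), ("Hayes", "A-999")}

def dangling(referencing, position, referenced_keys):
    """Tuples whose foreign-key value does not occur in the referenced relation."""
    return {t for t in referencing if t[position] not in referenced_keys}

bad_customers = dangling(depositor, 0, customer_names)
bad_accounts = dangling(depositor, 1, account_numbers)
print(bad_customers)  # {('Hayes', 'A-999')}
print(bad_accounts)   # {('Hayes', 'A-999')}
```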

depositor

customer
E-R Diagram

Schema Diagram

Query Languages
▪ Language in which user requests information from the database.
▪ Categories of languages
▪ Procedural
▪ Non-procedural, or declarative
▪ “Pure” languages:
▪ Relational algebra
▪ Tuple relational calculus
▪ Domain relational calculus
▪ Pure languages form underlying basis of query languages that people
use.

Relational Algebra
▪ Procedural language
▪ Six basic operators

▪ select: σ
▪ project: Π
▪ union: ∪
▪ set difference: –
▪ Cartesian product: x
▪ rename: ρ
▪ The operators take one or two relations as inputs and produce a new
relation as a result.

Select Operation
▪ Notation: σ_p(r)
▪ p is called the selection predicate
▪ Defined as:

σ_p(r) = {t | t ∈ r and p(t)}

where p is a formula in propositional calculus consisting of terms
connected by: ∧ (and), ∨ (or), ¬ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: =, ≠, >, ≥, <, ≤

▪ Example of selection:

σ_{branch_name = “Perryridge”}(account)

Select Operation – Example
▪ Relation r
A   B   C    D
α   α   1    7
α   β   5    7
β   β   12   3
β   β   23   10

▪ σ_{A=B ∧ D>5}(r)
A   B   C    D
α   α   1    7
β   β   23   10

Project Operation
▪ Notation:
Π_{A1, A2, …, Ak}(r)
where A1, A2, …, Ak are attribute names and r is a relation name.
▪ The result is defined as the relation of k columns obtained by erasing
the columns that are not listed
▪ Duplicate rows removed from result, since relations are sets
▪ Example: To eliminate the branch_name attribute of account

Π_{account_number, balance}(account)

Project Operation – Example
▪ Relation r:
A   B    C
α   10   1
α   20   1
β   30   1
β   40   2

▪ Π_{A,C}(r)
A   C       A   C
α   1       α   1
α   1   =   β   1
β   1       β   2
β   2

Union Operation
▪ Notation: r ∪ s
▪ Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
▪ For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column
of r deals with the same type of values as does the 2nd
column of s)
▪ Example: to find all customers with either an account or a loan
Π_{customer_name}(depositor) ∪ Π_{customer_name}(borrower)

Union Operation – Example
▪ Relations r, s:
r:  A   B       s:  A   B
    α   1           α   2
    α   2           β   3
    β   1

▪ r ∪ s:
A   B
α   1
α   2
β   1
β   3

Set Difference Operation
▪ Notation: r – s
▪ Defined as:
r – s = {t | t ∈ r and t ∉ s}
▪ Set differences must be taken between compatible relations.
▪ r and s must have the same arity
▪ attribute domains of r and s must be compatible

Set Difference Operation – Example
▪ Relations r, s:
r:  A   B       s:  A   B
    α   1           α   2
    α   2           β   3
    β   1

▪ r – s:
A   B
α   1
β   1

Cartesian-Product Operation
▪ Notation: r x s
▪ Defined as:
r x s = {t q | t ∈ r and q ∈ s}

▪ Assume that attributes of r(R) and s(S) are disjoint. (That is, R ∩ S = ∅.)
▪ If attributes of r and s are not disjoint, then renaming must be used.

Cartesian-Product Operation – Example
▪ Relations r, s:
r:  A   B       s:  C   D    E
    α   1           α   10   a
    β   2           β   10   a
                    β   20   b
                    γ   10   b

▪ r x s:
A   B   C   D    E
α   1   α   10   a
α   1   β   10   a
α   1   β   20   b
α   1   γ   10   b
β   2   α   10   a
β   2   β   10   a
β   2   β   20   b
β   2   γ   10   b

The borrower and loan relations

loan

borrower

Result of borrower X loan

Composition of Operations
▪ Can build expressions using multiple operations
▪ Example: σ_{A=C}(r x s)
▪ r x s
A   B   C   D    E
α   1   α   10   a
α   1   β   10   a
α   1   β   20   b
α   1   γ   10   b
β   2   α   10   a
β   2   β   10   a
β   2   β   20   b
β   2   γ   10   b

▪ σ_{A=C}(r x s)
A   B   C   D    E
α   1   α   10   a
β   2   β   10   a
β   2   β   20   b

Rename Operation
▪ Allows us to name, and therefore to refer to, the results of relational-
algebra expressions.
▪ Allows us to refer to a relation by more than one name.
▪ Example:
ρ_X(E)
returns the expression E under the name X
▪ If a relational-algebra expression E has arity n, then
ρ_{X(A1, A2, …, An)}(E)
returns the result of expression E under the name X, and with the
attributes renamed to A1, A2, …, An.
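
One use of rename is to pair a relation with itself. The sketch below assumes a relation is a triple (name, attribute names, set of tuples); the helper `rename`, the new attribute names and the account tuples are all hypothetical.

```python
from itertools import product

def rename(name, new_attrs, relation):
    """rho_{name(new_attrs)}(relation): same tuples, new name and attribute names."""
    _, attrs, rows = relation
    if len(new_attrs) != len(attrs):
        raise ValueError("need exactly one new name per attribute")
    return (name, tuple(new_attrs), rows)

# Hypothetical account(account_number, balance) instance.
account = ("account", ("account_number", "balance"), {("A-101", 500), ("A-215", 700)})

# A second copy of account under the name d, with fresh attribute names.
d = rename("d", ("d_number", "d_balance"), account)

# The attribute names are now disjoint, so account x d is well defined.
pairs = {t + q for t, q in product(account[2], d[2])}

# Balances that are smaller than the balance of some other account.
smaller = {t[1] for t in pairs if t[1] < t[3]}
print(d[1])        # ('d_number', 'd_balance')
print(len(pairs))  # 4
print(smaller)     # {500}
```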

Banking Enterprise Schema Diagram

branch (branch_name, branch_city, assets)
customer (customer_name, customer_street, customer_city)
account (account_number, branch_name, balance)
loan (loan_number, branch_name, amount)
depositor (customer_name, account_number)
borrower (customer_name, loan_number)

Example Queries

▪ Find all loans of over $1200

σ_{amount > 1200}(loan)

▪ Find the loan number for each loan of an amount greater than $1200

Π_{loan_number}(σ_{amount > 1200}(loan))

▪ Find the names of all customers who have a loan, an account, or both,
from the bank

Π_{customer_name}(borrower) ∪ Π_{customer_name}(depositor)
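
The three queries can be run against a toy instance. Everything in this sketch is an assumption made for illustration: the tuples are hypothetical, and `select` and `project` are minimal helpers rather than a full algebra implementation.

```python
# Hypothetical instances of loan, borrower and depositor.
loan = [
    {"loan_number": "L-11", "branch_name": "Round Hill", "amount": 900},
    {"loan_number": "L-15", "branch_name": "Perryridge", "amount": 1500},
    {"loan_number": "L-16", "branch_name": "Perryridge", "amount": 1300},
]
borrower = [
    {"customer_name": "Smith", "loan_number": "L-11"},
    {"customer_name": "Hayes", "loan_number": "L-15"},
]
depositor = [
    {"customer_name": "Hayes", "account_number": "A-102"},
    {"customer_name": "Jones", "account_number": "A-217"},
]

def select(predicate, rows):
    return [t for t in rows if predicate(t)]

def project(attrs, rows):
    # A set of tuples, so duplicates disappear automatically.
    return {tuple(t[a] for a in attrs) for t in rows}

# All loans of over $1200
big_loans = select(lambda t: t["amount"] > 1200, loan)

# Loan numbers of those loans
big_loan_numbers = project(["loan_number"], big_loans)

# Customers with a loan, an account, or both
names = project(["customer_name"], borrower) | project(["customer_name"], depositor)

print(sorted(big_loan_numbers))  # [('L-15',), ('L-16',)]
print(sorted(names))             # [('Hayes',), ('Jones',), ('Smith',)]
```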

[Figure: the borrower and loan relations, and the result of borrower x loan]

Example Queries
▪ Find the names of all customers who have a loan at the Perryridge
branch.
Π_{customer_name}(σ_{branch_name = “Perryridge”}
(σ_{borrower.loan_number = loan.loan_number}(borrower x loan)))

▪ Find the names of all customers who have a loan at the Perryridge
branch but do not have an account at any branch of the bank.

Π_{customer_name}(σ_{branch_name = “Perryridge”}
(σ_{borrower.loan_number = loan.loan_number}(borrower x loan))) –
Π_{customer_name}(depositor)

Example Queries
▪ Find the names of all customers who have a loan at the Perryridge branch.

▪ Π_{customer_name}(σ_{branch_name = “Perryridge”}(
σ_{borrower.loan_number = loan.loan_number}(borrower x loan)))

▪ Π_{customer_name}(σ_{loan.loan_number = borrower.loan_number}(
(σ_{branch_name = “Perryridge”}(loan)) x borrower))
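
The two expressions return the same customers but build intermediate products of different sizes. The sketch below checks this on hypothetical tuples; for brevity the first form applies both selection conditions in a single pass over the product.

```python
from itertools import product

# Hypothetical instances.
# loan(loan_number, branch_name, amount)
loan = [
    ("L-15", "Perryridge", 1500),
    ("L-17", "Downtown", 1000),
    ("L-16", "Perryridge", 1300),
]
# borrower(customer_name, loan_number)
borrower = [("Hayes", "L-15"), ("Jones", "L-17"), ("Adams", "L-16")]

# Form 1: take borrower x loan first, then apply both selections.
full = list(product(borrower, loan))
form1 = {b[0] for b, l in full if b[1] == l[0] and l[1] == "Perryridge"}

# Form 2: select the Perryridge loans first, then take the product.
perryridge = [l for l in loan if l[1] == "Perryridge"]
small = list(product(perryridge, borrower))
form2 = {b[0] for l, b in small if l[0] == b[1]}

print(form1 == form2)         # True
print(sorted(form1))          # ['Adams', 'Hayes']
print(len(full), len(small))  # 9 6
```

Selecting before taking the product keeps the intermediate relation smaller, which is why the second form is usually preferable.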
