
DBMS Notes

The document provides an overview of Database Management Systems (DBMS), detailing their structure, applications, and the concepts of data abstraction, schemas, and data manipulation languages. It discusses various data models, including the Entity-Relationship model and the relational model, along with the use of SQL for data manipulation. Additionally, it covers database architecture, keys, and the relational algebra, emphasizing the importance of logical and physical schemas in database design.

Uploaded by

kv022588

Unit 1

Database Management Systems


• DBMS contains information about a particular enterprise
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
• Database Applications:
• Banking: transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
University Database Example
• In this text we will be using a university database to illustrate all the concepts
• Data consists of information about:
• Students
• Instructors
• Classes
• Application program examples:
• Add new students, instructors, and courses
• Register students for courses, and generate class rosters
• Assign grades to students, compute grade point averages (GPA) and generate transcripts
View of Data – Levels of Abstraction
The need for efficiency has led designers to use complex data structures to represent data in the database. Since many database users are not computer trained, developers hide this complexity from users through several levels of abstraction, to simplify users’ interactions with the system:

1. Physical Level: How the data is actually stored, using complex data structures.
2. Logical Level: What data exists and what relationships exist among those data. Database administrators decide this.
3. View Level: Complexity remains at the logical level because of the variety of information stored in a large database, so the view level describes only part of the entire database.
Instances and Schemas
• Logical Schema – the overall logical structure of the database (table names, attributes and their data types)
• Example: The database consists of information about a set of customers and accounts in a bank and the relationship between them
• Analogous to variable declarations (with data types) in a program.

Customer Schema

Account Schema

• Physical schema – the overall physical structure of the database


• Instance – the actual content of the database at a particular point in time
• Analogous to the value of a variable
Physical Data Independence
• Physical Data Independence – the ability to modify the physical schema without
changing the logical schema
• Applications depend on the logical schema
• In general, the interfaces between the various levels and components should
be well defined so that changes in some parts do not seriously influence
others.

Application programs are said to exhibit physical data independence if they do not
depend on the physical schema, and thus need not be rewritten if the physical
schema changes.
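A minimal sketch of physical data independence, using Python's built-in sqlite3 module. The table contents and the index name idx_instructor_dept are illustrative assumptions, not part of the notes. The application query is written against the logical schema only, so adding an index (a physical-schema change) does not require rewriting it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [("10101", "Srinivasan", "Comp. Sci.", 65000),   # illustrative sample rows
     ("22222", "Einstein", "Physics", 95000),
     ("33456", "Gold", "Physics", 87000)],
)

# The application depends only on the logical schema (table and column names).
QUERY = "SELECT name FROM instructor WHERE dept_name = ? ORDER BY name"
before = conn.execute(QUERY, ("Physics",)).fetchall()

# Physical-schema change: add an index on dept_name (hypothetical index name).
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")

# The same, unmodified query still runs and returns the same rows.
after = conn.execute(QUERY, ("Physics",)).fetchall()
print(before == after)
```

How the rows are found may change after the index is created, but the query text and its result do not.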
Data Models
Collection of conceptual tools used for describing:
• Data (tables, attributes)
• Data relationships (one to one, many to many etc.)
• Data semantics
• Consistency constraints (e.g. primary key)
Entity-Relationship (E-R) Model
In DBMS (Database Management System), an entity refers to a real-world object or concept that can be
uniquely identified and about which data can be stored in the database.

In other words, an entity is anything that exists independently and can have attributes (properties) associated with it.

Entity Type | Real-world Example | Possible Attributes
Student | A university student | Student_ID, Name, Age, Course
Book | A library book | ISBN, Title, Author, Year
Employee | A company employee | Emp_ID, Name, Department, Salary
Entity-Relationship (E-R) Model
An entity set is a collection of similar types of entities that share the same
attributes but may have different values.

Let's say you're designing a database for a university:


• An individual student like John (ID: 101, Age: 20, Course: CS) is an entity.
• The group of all students (John, Priya, Arjun, etc.) is the entity set called Student.
Entity-Relationship (E-R) Model
• A relationship is an association among several entities. For example, a
depositor relationship associates a customer with each account that
she has.
• The set of all relationships of the same type is termed a relationship set.
E-R Diagram
The overall logical structure (schema) of a database can be expressed graphically by an
E-R diagram, which is built up from the following components:
• Rectangles, which represent entity sets
• Ellipses, which represent attributes
• Diamonds, which represent relationships among entity sets
• Lines, which link attributes to entity sets and entity sets to relationships.
Relational Model
• All the data is stored in various tables.
• Example of tabular data in the relational
model.
• Each table has multiple columns, and
each column has a unique name.
Relational Model
Practical Tip:
The relational model is at a lower level of abstraction than
the E-R model. Database designs are often carried out in
the E-R model, and then translated to the relational
model.

• The tables customer and account correspond to the entity sets of the same name, while the table depositor corresponds to the relationship set depositor.
Other Models
• Object-based data models (Object-oriented and Object-relational)
• Semi-structured data model (XML)
• Older models:
• Network model
• Hierarchical model
Data Definition Language (DDL)
• Specification notation for defining the database schema
Example: create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
• DDL compiler generates a set of table templates stored in a data dictionary
• Data dictionary contains metadata (i.e., data about data)
• Database schema
• Integrity constraints
• Primary key (ID uniquely identifies instructors)
• Authorization
• Who can access what
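The DDL ideas above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the PRIMARY KEY clause is added to the slide's create table statement, the sample row is illustrative, and sqlite_master is SQLite's own catalog table, standing in here for the data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, including an integrity constraint (primary key on ID).
conn.execute("""
    CREATE TABLE instructor (
        ID        CHAR(5),
        name      VARCHAR(20),
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2),
        PRIMARY KEY (ID)
    )
""")
conn.execute("INSERT INTO instructor VALUES ('10101', 'Srinivasan', 'Comp. Sci.', 65000)")

# Metadata: SQLite stores the schema definition in its catalog table sqlite_master.
stored_ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'instructor'"
).fetchone()[0]

# A second row with the same ID violates the primary-key constraint.
try:
    conn.execute("INSERT INTO instructor VALUES ('10101', 'Duplicate', 'Physics', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```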
Data Manipulation Language (DML)
• Data manipulation is
• The retrieval of information stored in the database
• The insertion of new information into the database
• The deletion of information from the database
• The modification of information stored in the database
• There are basically two types of data-manipulation language:
• Procedural DML -- requires a user to specify what data are needed and how to get those data.
• Declarative DML -- requires a user to specify what data are needed without specifying how to get those data.
• Declarative DMLs are usually easier to learn and use than procedural DMLs.
• Declarative DMLs are also referred to as non-procedural DMLs.
• The portion of a DML that involves information retrieval is called a query language.
Example: DML (SQL)
This query in the SQL language finds the name of the customer whose
customer_id is 192-83-7465:
select customer.customer_name
from customer
where customer.customer_id = '192-83-7465'
Real-World Use Case
• Use Case: Apply Bonuses to Instructors
• If salary is below 60,000 → bonus = 5000
• If salary is between 60,000 and 80,000 → bonus = 3000
• Else → bonus = 1000

FOR rec IN (SELECT ID, salary FROM instructor) LOOP
  IF rec.salary < 60000 THEN
    bonus := 5000;
  ELSIF rec.salary <= 80000 THEN
    bonus := 3000;
  ELSE
    bonus := 1000;
  END IF;

  UPDATE instructor SET salary = salary + bonus WHERE ID = rec.ID;
END LOOP;
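For comparison, the same bonus rule can be sketched with Python's sqlite3 module in two styles: a procedural computation in application code, and a single declarative UPDATE with a CASE expression. The four sample salaries and the helper name bonus_for are assumptions chosen to hit each branch of the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT PRIMARY KEY, salary REAL)")
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [("1", 50000), ("2", 60000), ("3", 80000), ("4", 90000)])

def bonus_for(salary):
    """The bonus rule from the use case, written procedurally."""
    if salary < 60000:
        return 5000
    elif salary <= 80000:
        return 3000
    return 1000

# Procedural style: read every row and decide the new salary in application code.
expected = {id_: salary + bonus_for(salary)
            for id_, salary in conn.execute("SELECT ID, salary FROM instructor")}

# Declarative style: one UPDATE states what the new salary is for every row.
conn.execute("""
    UPDATE instructor
    SET salary = salary + CASE
        WHEN salary < 60000 THEN 5000
        WHEN salary <= 80000 THEN 3000
        ELSE 1000
    END
""")
actual = dict(conn.execute("SELECT ID, salary FROM instructor"))
print(actual == expected)
```

The declarative form leaves the row-by-row iteration to the database system, which is the distinction the DML slides draw.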
SQL Query Language
• SQL query language is nonprocedural. A query takes as input several tables (possibly only one)
and always returns a single table.
• Example to find all instructors in Comp. Sci. dept
select name
from instructor
where dept_name = 'Comp. Sci.'
• To be able to compute complex functions SQL is usually embedded in some higher-level language
• Application programs generally access databases through one of
• Language extensions to allow embedded SQL
• Application program interfaces (e.g., ODBC/JDBC), which allow SQL queries to be sent to a database
JDBC IN JAVA
import java.sql.*;

public class DBExample {
    public static void main(String[] args) throws Exception {
        // 1. Load JDBC driver (class name assumed here: MySQL Connector/J 8+;
        //    JDBC 4.0+ drivers on the classpath are also loaded automatically)
        Class.forName("com.mysql.cj.jdbc.Driver");
        // 2. Connect to DB
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/university", "root", "password");
        // 3. Create a Statement object to send SQL queries
        Statement stmt = conn.createStatement();
        // 4. Execute a SELECT query (DML)
        ResultSet rs = stmt.executeQuery("SELECT name, salary FROM instructor");
        // 5. Process the results
        while (rs.next()) {
            System.out.println(rs.getString("name") + ": " + rs.getDouble("salary"));
        }
        // 6. Release resources
        rs.close();
        stmt.close();
        conn.close();
    }
}
Embedded SQL in C
#include <stdio.h>
EXEC SQL BEGIN DECLARE SECTION;
char name[21];   /* varchar(20) plus the terminating '\0' */
float salary;
EXEC SQL END DECLARE SECTION;
int main() {
EXEC SQL CONNECT TO university USER 'root' IDENTIFIED BY 'password';
EXEC SQL SELECT name, salary
INTO :name, :salary
FROM instructor
WHERE ID = 'A101';
printf("Name: %s, Salary: %.2f\n", name, salary);
EXEC SQL COMMIT;
return 0;
}
Difference
Aspect | Using API Directly (e.g., JDBC) | Using Embedded SQL (with Precompiler)
How you write SQL | As strings passed to function calls | As native-looking SQL inside the code (EXEC SQL)
Variable binding | Manual (e.g., [Link](1, id)) | Automatic using :variable syntax
Code readability | Less readable for complex queries | More readable, SQL looks natural
Who converts SQL to API calls | You (the programmer) | Precompiler (automatically)
Compile-time SQL checks | No | Yes (by precompiler)
Final Execution | DB API calls like executeQuery, executeUpdate | DB API calls generated by precompiler
Database Users
Database System Structure
Database Applications
• Two-tier architecture -- the application resides at the client machine, where it invokes
database system functionality at the server machine
• Three-tier architecture -- the client machine acts as a front end and does not contain any
direct database calls.
• The client end communicates with an application server, usually through a forms
interface.
• The application server in turn communicates with a database system to access data.
Two-tier and three-tier architectures
Chapter 2: Intro to Relational Model

Database System Concepts, 7th Ed.


©Silberschatz, Korth and Sudarshan

See [Link] for conditions on re-use


Outline

▪ Structure of Relational Databases


▪ Database Schema
▪ Keys
▪ Schema Diagrams
▪ Relational Query Languages
▪ The Relational Algebra

Relational database - Example of an Instructor Relation

A relational database consists of a collection of tables, each of which is assigned a unique name.

Fig. The instructor relation, with its attributes (or columns) and tuples (or rows).

Relational database

Fig. The prerequisite relation.

• Thus, in the relational model the term relation is used to refer to a table, while the
term tuple is used to refer to a row. Similarly, the term attribute refers to a column
of a table.

• We use the term relation instance to refer to a specific instance of a relation, i.e., containing a specific set of rows. The instance of instructor shown in the figure has 12 tuples, corresponding to 12 instructors.
Relation Schema and Instance

▪ A1, A2, …, An are attributes
▪ R = (A1, A2, …, An) is a relation schema, which is the logical design of the relation
Example:
instructor = (ID, name, dept_name, salary)
▪ A relation instance r defined over schema R is denoted by r(R).
▪ The current values of a relation are specified by a table
▪ An element t of relation r is called a tuple and is represented by a row in a table

Attributes

▪ The set of allowed values for each attribute is called the domain of the
attribute
▪ Attribute values are (normally) required to be atomic; that is, indivisible
▪ The special value null is a member of every domain. It indicates that the value is “unknown”
▪ The null value causes complications in the definition of many operations

Relations are Unordered

▪ Order of tuples is irrelevant (tuples may be stored in an arbitrary order)


▪ Example: instructor relation with unordered tuples
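One way to make the "relations are unordered" point concrete is to model a relation instance as a Python set of tuples. This is a sketch with illustrative rows; the names INSTRUCTOR_SCHEMA, instance_a and instance_b are assumptions for the example.

```python
# Schema: the attribute names (fixed). Instance: the current set of tuples.
INSTRUCTOR_SCHEMA = ("ID", "name", "dept_name", "salary")

instance_a = {
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
}

# The same tuples listed in a different order, with one tuple repeated.
instance_b = set([
    ("22222", "Einstein", "Physics", 95000),
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
])

# Neither the listing order nor the repetition changes the relation.
same_relation = instance_a == instance_b
print(same_relation, len(instance_b))
```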

Database Schema

▪ Database schema -- is the logical structure of the database.


▪ Database instance -- is a snapshot of the data in the database at a given
instant in time.
▪ Example:
• schema: instructor (ID, name, dept_name, salary)
• Instance:

Keys

▪ A superkey is a set of one or more attributes that, taken collectively, allow us to identify uniquely a tuple in the relation.
▪ Let R denote the set of attributes in the schema of relation r.
▪ Let K ⊆ R
▪ K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R)
• Example: {ID} and {ID,name} are both superkeys of instructor.
▪ Superkey K is a candidate key if K is minimal i.e. no proper subset is a
superkey.
Example: {ID} is a candidate key for Instructor
▪ One of the candidate keys is selected to be the primary key.
• Which one?
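The superkey and candidate-key definitions can be checked mechanically against one relation instance, as in the sketch below. The rows and the function names is_superkey and is_candidate_key are assumptions. A key is a property of every possible instance, so a check on a single instance can only refute a proposed key; it cannot prove one.

```python
from itertools import combinations

SCHEMA = ("ID", "name", "dept_name", "salary")
ROWS = [  # illustrative instance of instructor
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("12121", "Wu", "Finance", 90000),
    ("22222", "Einstein", "Physics", 95000),
    ("33456", "Gold", "Physics", 87000),
]

def is_superkey(attrs, schema=SCHEMA, rows=ROWS):
    """True if no two rows of this instance agree on all attributes in attrs."""
    idx = [schema.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(rows)

def is_candidate_key(attrs, schema=SCHEMA, rows=ROWS):
    """A superkey that is minimal: no proper (non-empty) subset is a superkey."""
    if not is_superkey(attrs, schema, rows):
        return False
    return not any(
        is_superkey(subset, schema, rows)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_superkey(("ID",)), is_superkey(("ID", "name")))
print(is_candidate_key(("ID",)), is_candidate_key(("ID", "name")))
```

In this small instance name also happens to be unique, which is why choosing the primary key ("Which one?") needs knowledge of the enterprise and not just of the current data.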

Keys

▪ A relation, say r1, may include among its attributes the primary key of another relation, say r2. This attribute is called a foreign key from r1, referencing r2. Equivalently, a foreign key is an attribute (or set of attributes) in one table that refers to the primary key of another table.

▪ The relation r1 is also called the referencing relation of the foreign key dependency, and r2 is called the referenced relation of the foreign key
• Example: dept_name in instructor is a foreign key from instructor
referencing department
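A sketch of the foreign-key example using Python's sqlite3 module. The department and instructor rows are illustrative assumptions. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Referenced relation (r2) and referencing relation (r1).
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT)")
conn.execute("""
    CREATE TABLE instructor (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        dept_name TEXT REFERENCES department (dept_name),
        salary    REAL
    )
""")
conn.execute("INSERT INTO department VALUES ('Physics', 'Watson')")
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics', 95000)")

# An instructor whose dept_name matches no department row is rejected.
try:
    conn.execute("INSERT INTO instructor VALUES ('99999', 'Nobody', 'Alchemy', 1)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

print(dangling_rejected)
```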

Schema Diagram for University Database

Class Activity

Observe the tables in the previous schema diagram, especially their primary keys.
Check whether you are convinced by the choice of primary keys.

Relational Algebra

▪ A procedural language consisting of a set of operations that take one or two relations as input and produce a new relation as their result.
▪ Six basic operators
• select: σ
• project: Π
• Cartesian product: ×
• union: ∪
• set difference: –
• rename: ρ
▪ Join (⋈) is not a basic operator: it can be expressed as a select applied to a Cartesian product

Select Operation

▪ The select operation selects tuples that satisfy a given predicate.
▪ Notation: σ p (r)
▪ p is called the selection predicate
▪ Example: select those tuples of the instructor relation where the instructor is in the “Physics” department.
• Query
σ dept_name=“Physics” (instructor)
• Result

Select Operation (Cont.)

▪ We allow comparisons using
=, ≠, >, ≥, <, ≤
in the selection predicate.
▪ We can combine several predicates into a larger predicate by using the connectives:
∧ (and), ∨ (or), ¬ (not)
▪ Example: to find the instructors in Physics with a salary greater than $90,000, we write:
σ dept_name=“Physics” ∧ salary > 90,000 (instructor)
▪ The select predicate may include comparisons between two attributes.
• Example: find all departments whose name is the same as their building name:
• σ dept_name=building (department)
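The select operation can be sketched in a few lines of Python by treating a relation as a list of dictionaries and the predicate as a function. The sample rows and the function name select are assumptions made for illustration.

```python
INSTRUCTOR = [  # illustrative tuples
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": "33456", "name": "Gold", "dept_name": "Physics", "salary": 87000},
]

def select(predicate, relation):
    """σ predicate (relation): keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

# σ dept_name="Physics" (instructor)
physics = select(lambda t: t["dept_name"] == "Physics", INSTRUCTOR)

# σ dept_name="Physics" ∧ salary > 90,000 (instructor)
physics_high = select(
    lambda t: t["dept_name"] == "Physics" and t["salary"] > 90000, INSTRUCTOR
)

print([t["name"] for t in physics], [t["name"] for t in physics_high])
```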

Practice

Consider the database:


• Student(sid, sname, major, gpa)
• Course(cid, cname, credits)
• Enrolled(sid, cid, grade)
• Professor(pid, pname, dept)
• Teaches(pid, cid)

1. Students with GPA > 8.0 and major 'CSE'

2. Enrollments where grade = 'A' OR grade = 'A+' AND course id = 'CS101'

3. Professors in 'CSE' department with pname not equal to 'Dr. Sharma'

4. Students with GPA < 6.0 OR major = 'ME' but not both (XOR condition)

Project Operation

▪ A unary operation that returns its argument relation, with certain attributes
left out.
▪ Notation:
Π A1, A2, A3, …, Ak (r)
where A1, A2, …, Ak are attribute names and r is a relation name.
▪ The result is defined as the relation of k columns obtained by erasing the columns that are not listed
▪ Duplicate rows are removed from the result, since relations are sets
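A small Python sketch of projection, including the duplicate removal that follows from relations being sets. The rows and the function name project are illustrative assumptions.

```python
INSTRUCTOR = [  # illustrative tuples
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": "33456", "name": "Gold", "dept_name": "Physics", "salary": 87000},
]

def project(attrs, relation):
    """Π attrs (relation): keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:          # relations are sets, so each row appears once
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result

# Π dept_name (instructor): two Physics instructors collapse into one row.
departments = project(("dept_name",), INSTRUCTOR)

# Π ID, name, salary (instructor): the dept_name column is erased.
without_dept = project(("ID", "name", "salary"), INSTRUCTOR)

print(departments)
```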

Project Operation Example

▪ Example: eliminate the dept_name attribute of instructor


▪ Query:
Π ID, name, salary (instructor)
▪ Result:

Composition of Relational Operations

▪ The result of a relational-algebra operation is a relation, and therefore relational-algebra operations can be composed together into a relational-algebra expression.
▪ Consider the query -- Find the names of all instructors in the Physics department.

Π name (σ dept_name=“Physics” (instructor))

▪ Instead of giving the name of a relation as the argument of the projection operation, we give an expression that evaluates to a relation.

Cartesian-Product Operation

▪ The Cartesian-product operation (denoted by ×) allows us to combine information from any two relations.
▪ Example: the Cartesian product of the relations instructor and teaches is written as:
instructor × teaches
▪ The Cartesian product of r1 and r2 results in a relation containing each possible pair of tuples: one from the r1 relation and one from the r2 relation (see next slide)
▪ Since the instructor ID appears in both relations, we distinguish between these attributes by attaching to the attribute the name of the relation from which the attribute originally came.
• instructor.ID
• teaches.ID

Cartesian-Product Operation

Fig. Instructor and Teaches relations.

The instructor X teaches table

Join Operation

▪ The Cartesian-Product
instructor × teaches
associates every tuple of instructor with every tuple of teaches.
• Most of the resulting rows have information about instructors who did NOT teach a particular course.
▪ To get only those tuples of “instructor × teaches” that pertain to instructors and the courses that they taught, we write:
σ instructor.ID = teaches.ID (instructor × teaches)
▪ The result of this expression is shown in the next slide

Join Operation (Cont.)

▪ The table corresponding to:
σ instructor.ID = teaches.ID (instructor × teaches)

Join Operation (Cont.)

▪ The join operation allows us to combine a select operation and a Cartesian-Product operation into a single operation.
▪ Consider relations r(R) and s(S)
▪ Let θ (“theta”) be a predicate on attributes in the schema R ∪ S. The join operation r ⋈θ s is defined as follows:
r ⋈θ s = σθ (r × s)

▪ Thus
σ instructor.ID = teaches.ID (instructor × teaches)

▪ can equivalently be written as
instructor ⋈ instructor.ID = teaches.ID teaches
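The definition r ⋈θ s = σθ (r × s) translates almost directly into Python. The sketch below qualifies each attribute with its relation name, as the Cartesian-product slide describes. The sample tuples and the function names cartesian_product and theta_join are assumptions.

```python
from itertools import product

INSTRUCTOR = [  # illustrative tuples
    {"ID": "10101", "name": "Srinivasan"},
    {"ID": "22222", "name": "Einstein"},
]
TEACHES = [
    {"ID": "10101", "course_id": "CS-101"},
    {"ID": "10101", "course_id": "CS-315"},
    {"ID": "22222", "course_id": "PHY-101"},
]

def cartesian_product(r, r_name, s, s_name):
    """r × s, with every attribute prefixed by its relation name to avoid clashes."""
    return [
        {**{f"{r_name}.{k}": v for k, v in t.items()},
         **{f"{s_name}.{k}": v for k, v in u.items()}}
        for t, u in product(r, s)
    ]

def theta_join(r, r_name, s, s_name, theta):
    """r ⋈θ s, computed literally as σθ (r × s)."""
    return [t for t in cartesian_product(r, r_name, s, s_name) if theta(t)]

pairs = cartesian_product(INSTRUCTOR, "instructor", TEACHES, "teaches")
joined = theta_join(INSTRUCTOR, "instructor", TEACHES, "teaches",
                    lambda t: t["instructor.ID"] == t["teaches.ID"])

print(len(pairs), len(joined))
```

Here the product has 2 × 3 = 6 rows, and the join keeps only the 3 rows in which an instructor is paired with a course that instructor actually teaches.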

Question

▪ What is the result of first performing the cross product of student and advisor, and then performing a selection operation on the result with the predicate s_id = ID? (Using the symbolic notation of relational algebra, this query can be written as σ s_id=ID (student × advisor).)
▪ Result(ID, name, dept_name, tot_cred, s_id, i_id)

Natural Join

▪ Output pairs of rows from the two input relations that have the same value on all attributes that have the same name. In other words, a natural join automatically finds attributes with the same name in both relations and performs an equi-join on them. Denoted by ⋈.
Example:
If both R and S have attribute C, R ⋈ S will join on C.

Fig. Result of natural join of the instructor and department relations.

Natural Join

Points to note:

• Natural join → common attribute shown once.

• Theta and equi-join → both attributes shown separately, even if same name.

• If there is no attribute name match, natural join degenerates into Cartesian product.
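The points above can be illustrated with a short Python sketch of natural join over lists of dictionaries. The sample relations and the function name natural_join are assumptions, and the sketch takes the attribute names from the first tuple of each relation.

```python
INSTRUCTOR = [  # illustrative tuples
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci."},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics"},
]
DEPARTMENT = [
    {"dept_name": "Comp. Sci.", "building": "Taylor"},
    {"dept_name": "Physics", "building": "Watson"},
]
COURSE = [{"course_id": "CS-101"}]  # shares no attribute name with DEPARTMENT

def natural_join(r, s):
    """r ⋈ s: equi-join on all attributes with the same name, each shown once."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**t, **u}                       # a shared attribute appears only once
        for t in r
        for u in s
        if all(t[a] == u[a] for a in common)
    ]

joined = natural_join(INSTRUCTOR, DEPARTMENT)     # joins on dept_name
degenerate = natural_join(DEPARTMENT, COURSE)     # no common attribute

print(joined)
print(len(degenerate))
```

With no common attribute the condition is trivially true for every pair, so the result has as many rows as the Cartesian product.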

Summary (Joins)

• Theta Join (θ-join):
Joins with a general condition.
Example:
R⋈R.A>[Link]
• Equi-Join:
Special case of θ-join where the condition is equality.
Example:
R⋈R.C=[Link]
• Natural Join (⋈):
Joins automatically on all attributes with the same name.
Example:
If both R and S have attribute C, R ⋈ S will join on C.

Practice Question

Union Operation

▪ The union operation allows us to combine two relations
▪ Notation: r ∪ s
▪ For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: the 2nd column of r deals with the same type of values as does the 2nd column of s)
▪ Example: to find all courses taught in the Fall 2017 semester, or in the Spring 2018 semester, or in both
Π course_id (σ semester=“Fall” ∧ year=2017 (section)) ∪
Π course_id (σ semester=“Spring” ∧ year=2018 (section))

Union Operation (Cont.)

▪ Result of:
Π course_id (σ semester=“Fall” ∧ year=2017 (section)) ∪
Π course_id (σ semester=“Spring” ∧ year=2018 (section))

Set-Intersection Operation

▪ The set-intersection operation allows us to find tuples that are in both the input relations.
▪ Notation: r ∩ s
▪ Assume:
• r, s have the same arity
• attributes of r and s are compatible
▪ Example: Find the set of all courses taught in both the Fall 2017 and the Spring 2018 semesters.
Π course_id (σ semester=“Fall” ∧ year=2017 (section)) ∩
Π course_id (σ semester=“Spring” ∧ year=2018 (section))

• Result

Set Difference Operation

▪ The set-difference operation allows us to find tuples that are in one relation but are not in another.
▪ Notation: r – s
▪ Set differences must be taken between compatible relations.
• r and s must have the same arity
• attribute domains of r and s must be compatible
▪ Example: to find all courses taught in the Fall 2017 semester, but not in the Spring 2018 semester
Π course_id (σ semester=“Fall” ∧ year=2017 (section)) −
Π course_id (σ semester=“Spring” ∧ year=2018 (section))
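Union, set intersection and set difference map directly onto Python's set operators. The sketch below uses a made-up section relation (the tuples are assumptions, not the textbook's data) and a helper course_ids that plays the role of Π course_id (σ semester ∧ year (section)).

```python
# (course_id, semester, year) tuples standing in for the section relation.
SECTION = {
    ("CS-101", "Fall", 2017),
    ("CS-347", "Fall", 2017),
    ("PHY-101", "Fall", 2017),
    ("CS-101", "Spring", 2018),
    ("CS-315", "Spring", 2018),
}

def course_ids(semester, year):
    """Π course_id (σ semester=... ∧ year=... (section))."""
    return {cid for (cid, sem, yr) in SECTION if sem == semester and yr == year}

fall_2017 = course_ids("Fall", 2017)
spring_2018 = course_ids("Spring", 2018)

either = fall_2017 | spring_2018       # union: Fall 2017, Spring 2018, or both
both = fall_2017 & spring_2018         # set intersection: taught in both
only_fall = fall_2017 - spring_2018    # set difference: Fall 2017 but not Spring 2018

print(sorted(either), sorted(both), sorted(only_fall))
```

Both operands have the same arity (one attribute, course_id) and compatible domains, which is the validity condition the slides state.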
The Assignment Operation

▪ It is convenient at times to write a relational-algebra expression by assigning parts of it to temporary relation variables.
▪ The assignment operation is denoted by ← and works like assignment in a programming language.
▪ Example: Find all instructors in the “Physics” and “Music” departments.

Physics ← σ dept_name=“Physics” (instructor)
Music ← σ dept_name=“Music” (instructor)
Physics ∪ Music

▪ With the assignment operation, a query can be written as a sequential program consisting of a series of assignments followed by an expression whose value is displayed as the result of the query.

Group by (γ)

▪ Consider the following relation. Suppose you are asked to find the department-wise average salary of the instructors.
▪ In Relational Algebra, grouping is done using the aggregation operator (γ, called "gamma").
▪ It groups tuples based on certain attributes and allows applying aggregate functions like COUNT, SUM, AVG, MIN, MAX.

instructor_id | name | dept_id | gender | salary
1 | Alice | CS | F | 100000
2 | Bob | CS | M | 95000
3 | Carol | Math | F | 85000
4 | Dave | CS | M | 92000
5 | Eve | Math | F | 93000
6 | Frank | Physics | M | 91000
Group by (γ) cont..

γ dept_id, AVG(salary) (instructor)

Results in the following:

dept_id | AVG(salary)
CS | 95666.67
Math | 89000.00
Physics | 91000.00

SELECT dept_id, AVG(salary)
FROM instructor
GROUP BY dept_id;
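What γ does can be spelled out step by step in Python: partition the tuples by the grouping attribute, then apply the aggregate to each group. The sketch uses the six instructor rows from the slide; the function name group_avg_salary is an assumption.

```python
from collections import defaultdict

INSTRUCTOR = [  # (instructor_id, name, dept_id, gender, salary), as on the slide
    (1, "Alice", "CS", "F", 100000),
    (2, "Bob", "CS", "M", 95000),
    (3, "Carol", "Math", "F", 85000),
    (4, "Dave", "CS", "M", 92000),
    (5, "Eve", "Math", "F", 93000),
    (6, "Frank", "Physics", "M", 91000),
]

def group_avg_salary(rows):
    """γ dept_id, AVG(salary): group rows by dept_id, then average each group."""
    groups = defaultdict(list)
    for _id, _name, dept_id, _gender, salary in rows:
        groups[dept_id].append(salary)          # step 1: form the groups
    return {dept: round(sum(s) / len(s), 2)     # step 2: aggregate per group
            for dept, s in groups.items()}

avg_by_dept = group_avg_salary(INSTRUCTOR)
print(avg_by_dept)
```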

Group by (γ) cont..

γ dept_id, gender, COUNT(*) (instructor)

dept_id | gender | COUNT(*)
CS | F | 1
CS | M | 2
Math | F | 2
Physics | M | 1

SELECT dept_id, gender, COUNT(*)
FROM instructor
GROUP BY dept_id, gender;

Group by (general form)

Group By Having

▪ In Relational Algebra, HAVING is represented by γ (grouping + aggregation) followed by σ (selection on the aggregated result).

Example:

σ AVG(salary) > 90000 ( γ dept_id, AVG(salary) (instructor) )

dept_id | AVG(salary)
CS | 95666.67
Physics | 91000.00

SELECT dept_id, AVG(salary)
FROM instructor
GROUP BY dept_id
HAVING AVG(salary) > 90000;
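The HAVING query can be run with Python's sqlite3 module and compared against the "γ first, then σ" reading. This is a sketch: the table is loaded with the six rows from the earlier slide, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE instructor (
        instructor_id INTEGER, name TEXT, dept_id TEXT, gender TEXT, salary REAL
    )
""")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?, ?)", [
    (1, "Alice", "CS", "F", 100000), (2, "Bob", "CS", "M", 95000),
    (3, "Carol", "Math", "F", 85000), (4, "Dave", "CS", "M", 92000),
    (5, "Eve", "Math", "F", 93000), (6, "Frank", "Physics", "M", 91000),
])

# GROUP BY ... HAVING in one statement.
having_rows = conn.execute("""
    SELECT dept_id, AVG(salary)
    FROM instructor
    GROUP BY dept_id
    HAVING AVG(salary) > 90000
    ORDER BY dept_id
""").fetchall()

# The relational-algebra reading: aggregate first (γ), then filter the result (σ).
grouped = conn.execute(
    "SELECT dept_id, AVG(salary) FROM instructor GROUP BY dept_id"
).fetchall()
filtered = sorted(row for row in grouped if row[1] > 90000)

print(having_rows == filtered)
```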

Practice

Consider the relation Student(id, name, dept_id). Find the total number of
students in each department.

Practice

Find the departments that have more than 50 students.

Practice

Departments with High Average Salary


Relation: Instructor(iid, iname, dept, salary)
Find the departments where the average salary is greater than 90,000

Practice

Find the department-wise maximum salary of instructors.

Equivalent Queries

▪ There is more than one way to write a query in relational algebra.
▪ Example: Find information about instructors in the Physics department with salary greater than 90,000
▪ Query 1
σ dept_name=“Physics” ∧ salary > 90,000 (instructor)

▪ Query 2
σ dept_name=“Physics” (σ salary > 90,000 (instructor))

▪ The two queries are not identical; they are, however, equivalent -- they give the same result on any database.

Equivalent Queries

▪ There is more than one way to write a query in relational algebra.
▪ Example: Find information about courses taught by instructors in the Physics department
▪ Query 1
σ dept_name=“Physics” (instructor ⋈ instructor.ID = teaches.ID teaches)

▪ Query 2
(σ dept_name=“Physics” (instructor)) ⋈ instructor.ID = teaches.ID teaches

▪ The two queries are not identical; they are, however, equivalent -- they give the same result on any database.

The Rename Operation

▪ The results of relational-algebra expressions do not have a name that we can use to refer to them. The rename operator, ρ, is provided for that purpose
▪ The expression:
ρ x (E)
returns the result of expression E under the name x
▪ Another form of the rename operation:
ρ x(A1, A2, …, An) (E)
▪ It returns the result of expression E under the name x, and with the attributes renamed to A1, A2, …, An.

Use Case of Rename

Self-Join (Avoid Ambiguity)


EMPLOYEE(emp_id, emp_name, manager_id)

• Each employee has a manager, who is also an employee.


• Find employee names with their manager names.
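One possible way to write this query is to rename EMPLOYEE twice and join the two copies, for example Π e.emp_name, m.emp_name (ρ e (EMPLOYEE) ⋈ e.manager_id = m.emp_id ρ m (EMPLOYEE)). In SQL the table aliases play the role of ρ. The sketch below uses Python's sqlite3 module; the four employee rows and the aliases e and m are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, manager_id INTEGER)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", None),   # top of the hierarchy: no manager
    (2, "Ravi", 1),
    (3, "Meera", 1),
    (4, "John", 2),
])

# The aliases e (employee) and m (manager) name two copies of the same relation,
# so e.emp_name and m.emp_name are no longer ambiguous.
pairs = conn.execute("""
    SELECT e.emp_name, m.emp_name
    FROM employee AS e
    JOIN employee AS m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()

print(pairs)
```

An employee with no manager (a NULL manager_id) matches no row of m and so does not appear in the result of this inner join.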

Thoughtful Question

▪ What is the benefit of learning RA at all if I have to use SQL in practice?

• SQL was built on top of RA and relational calculus. Every SQL query you write is internally translated into RA (or something very close).
• If you know RA, you understand why some queries are faster and how to write efficient SQL.
• Even if SQL syntax changes between systems (MySQL vs PostgreSQL vs Oracle), RA concepts don’t change.
• Companies sometimes ask RA/logic questions to test database fundamentals.
• Learning RA builds a base for studying query optimization, execution plans, distributed databases, and even Big Data systems (which often implement RA-like operators).

End of Chapter 2

Database Design Using the E-R Model



Design Phases

▪ Initial phase -- characterize fully the data needs of the prospective database users.
▪ Second phase -- choosing a data model
• Applying the concepts of the chosen data model
• Translating these requirements into a conceptual schema of the
database.
• A fully developed conceptual schema indicates the functional
requirements of the enterprise.
▪ Describe the kinds of operations (or transactions) that will be
performed on the data.

Design Phases (Cont.)

▪ Final Phase -- Moving from an abstract data model to the implementation of the database
• Logical Design – Deciding on the database schema.
▪ Database design requires that we find a “good” collection of
relation schemas.
▪ Business decision – What attributes should we record in the
database?
▪ Computer Science decision – What relation schemas should we
have and how should the attributes be distributed among the
various relation schemas?
• Physical Design – Deciding on the physical layout of the database

Design Alternatives

▪ In designing a database schema, we must ensure that we avoid two major pitfalls:
• Redundancy: a bad design may result in repeated information.
▪ Redundant representation of information may lead to data
inconsistency among the various copies of information
• Incompleteness: a bad design may make certain aspects of the
enterprise difficult or impossible to model.
▪ Avoiding bad designs is not enough. There may be a large number of
good designs from which we must choose.

Design Approaches

▪ Entity Relationship Model (covered in this chapter)
• Models an enterprise as a collection of entities and relationships
▪ Entity: a “thing” or “object” in the enterprise that is distinguishable
from other objects
• Described by a set of attributes
▪ Relationship: an association among several entities
• Represented diagrammatically by an entity-relationship diagram:
▪ Normalization Theory (Chapter 7)
• Formalize what designs are bad, and test for them

Outline of the ER Model

Entity Sets

▪ An entity is an object that exists and is distinguishable from other
objects.
• Example: specific person, company, event, plant
▪ An entity set is a set of entities of the same type that share the same
properties.
• Example: set of all persons, companies, trees, holidays
▪ An entity is represented by a set of attributes; i.e., descriptive properties
possessed by all members of an entity set.
• Example:
instructor = (ID, name, salary)
course = (course_id, title, credits)
▪ A subset of the attributes forms a primary key of the entity set; i.e., it
uniquely identifies each member of the set.
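
A minimal Python sketch of the ideas above, assuming an in-memory list stands in for an entity set. The `Instructor` class mirrors the instructor = (ID, name, salary) example; the helper `is_key` and the salary figures are illustrative assumptions. The check only shows that a set of attributes is unique within this one sample, not that it is a key for every possible instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instructor:
    ID: str
    name: str
    salary: float


def is_key(entities, key_attrs):
    """Return True if no two entities agree on all attributes in key_attrs."""
    seen = set()
    for e in entities:
        key = tuple(getattr(e, a) for a in key_attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


# Sample entity set; the salaries are made up for illustration.
instructors = [
    Instructor("22222", "Einstein", 95000.0),
    Instructor("10101", "Srinivasan", 65000.0),
    Instructor("45565", "Katz", 65000.0),
]

print(is_key(instructors, ["ID"]))      # ID distinguishes every instructor
print(is_key(instructors, ["salary"]))  # two instructors share a salary
```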

Representing Entity sets in ER Diagram

▪ Entity sets can be represented graphically as follows:
• Rectangles represent entity sets.
• Attributes listed inside entity rectangle
• Underline indicates primary key attributes

Relationship Sets

▪ A relationship is an association among several entities
Example:
44553 (Peltier)    advisor    22222 (Einstein)
student entity    relationship set    instructor entity
▪ A relationship set is a mathematical relation among n ≥ 2 entities, each
taken from entity sets
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship
• Example:
(44553, 22222) ∈ advisor
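
The set-theoretic definition can be sketched in Python, assuming entities are represented just by their ID strings. Only the pair (44553, 22222) comes from the slide; the other IDs, the second pair, and the helper `is_relationship_set` are illustrative assumptions.

```python
from itertools import product

student = {"44553", "00128", "12345"}      # student IDs
instructor = {"22222", "10101", "45565"}   # instructor IDs

# A relationship set is a subset of the Cartesian product of the entity sets.
advisor = {("44553", "22222"), ("00128", "10101")}


def is_relationship_set(rel, *entity_sets):
    """Check that every tuple in rel takes its i-th component from the i-th entity set."""
    return rel <= set(product(*entity_sets))


print(is_relationship_set(advisor, student, instructor))
print(("44553", "22222") in advisor)
```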

Relationship Sets (Cont.)

▪ Example: we define the relationship set advisor to denote the
associations between students and the instructors who act as their
advisors.
▪ Pictorially, we draw a line between related entities.

Representing Relationship Sets via ER Diagrams

▪ Diamonds represent relationship sets.

Relationship Sets (Cont.)

▪ An attribute can also be associated with a relationship set.
▪ For instance, the advisor relationship set between entity sets instructor
and student may have the attribute date, which tracks when the student
started being associated with the advisor.

[Figure: instructor entities (76766 Crick, 45565 Katz, 10101 Srinivasan, 98345 Kim, 76543 Singh, 22222 Einstein) linked to student entities (98988 Tanaka, 12345 Shankar, 00128 Zhang, 76543 Brown, 76653 Aoi, 23121 Chavez, 44553 Peltier) by advisor relationships, each labeled with a date]

Relationship Sets with Attributes

Roles

▪ Entity sets of a relationship need not be distinct
• Each occurrence of an entity set plays a “role” in the relationship
▪ The labels “course_id” and “prereq_id” are called roles.

Degree of a Relationship Set

▪ Binary relationship
• involves two entity sets (or degree two).
• most relationship sets in a database system are binary.
▪ Relationships between more than two entity sets are rare. (More on this later.)
• Example: students work on research projects under the guidance of
an instructor.
• relationship proj_guide is a ternary relationship between instructor,
student, and project

Non-binary Relationship Sets

▪ Most relationship sets are binary
▪ There are occasions when it is more convenient to represent
relationships as non-binary.
▪ E-R Diagram with a Ternary Relationship

Complex Attributes

▪ Attribute types:
• Simple and composite attributes.
• Single-valued and multivalued attributes
▪ Example: multivalued attribute: phone_numbers
• Derived attributes
▪ Can be computed from other attributes
▪ Example: age, given date_of_birth
▪ Domain – the set of permitted values for each attribute
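
As a small illustration of a derived attribute, age need not be stored because it can be computed from date_of_birth. The function below is a sketch; its name and the choice to pass the current date explicitly (so the result is reproducible) are assumptions, not part of the slides.

```python
from datetime import date


def age(date_of_birth, today):
    """Derive age in whole years from the stored date_of_birth attribute."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(age(date(2000, 6, 15), date(2024, 6, 14)))  # day before the birthday
print(age(date(2000, 6, 15), date(2024, 6, 15)))  # on the birthday
```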

Composite Attributes

▪ Composite attributes allow us to divide attributes into subparts (other
attributes).
• Composite attribute name, with component attributes first_name,
middle_initial, last_name
• Composite attribute address, with component attributes street, city,
state, postal_code
▪ street is itself composite, with component attributes street_number,
street_name, apartment_number

Representing Complex Attributes in ER Diagram

Mapping Cardinality Constraints

▪ Express the number of entities to which another entity can be associated
via a relationship set.
▪ Most useful in describing binary relationship sets.
▪ For a binary relationship set the mapping cardinality must be one of the
following types:
• One to one
• One to many
• Many to one
• Many to many

Mapping Cardinalities

[Figure panels: One to one; One to many]

One-to-one. An entity in A is associated with at most one entity in B, and an
entity in B is associated with at most one entity in A.

One-to-many. An entity in A is associated with any number (zero or more) of entities in B.
An entity in B, however, can be associated with at most one entity in A.

Note: Some elements in A and B may not be mapped to any elements in the other set.

Mapping Cardinalities

[Figure panels: Many to one; Many to many]

Many-to-one. An entity in A is associated with at most one entity in B. An entity in B,
however, can be associated with any number (zero or more) of entities in A.

Many-to-many. An entity in A is associated with any number (zero or more) of entities in B,
and an entity in B is associated with any number (zero or more) of entities in A.

Note: Some elements in A and B may not be mapped to any elements in the other set.
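
The four mapping cardinalities can be told apart by counting, for one instance of a binary relationship set, how many partners each entity has. The sketch below assumes the relationship set is given as a Python set of (a, b) pairs; the function name is hypothetical. It describes what holds in a given instance, whereas a cardinality constraint in a design restricts every instance.

```python
from collections import Counter


def mapping_cardinality(rel):
    """Classify a binary relationship set (a set of (a, b) pairs) from A to B."""
    a_counts = Counter(a for a, _ in rel)  # number of B entities linked to each a
    b_counts = Counter(b for _, b in rel)  # number of A entities linked to each b
    a_to_one = all(c <= 1 for c in a_counts.values())  # each a has at most one b
    b_to_one = all(c <= 1 for c in b_counts.values())  # each b has at most one a
    if a_to_one and b_to_one:
        return "one-to-one"
    if b_to_one:
        return "one-to-many"   # an a may have many b's; each b has at most one a
    if a_to_one:
        return "many-to-one"   # each a has at most one b; a b may have many a's
    return "many-to-many"


print(mapping_cardinality({(1, "x"), (2, "y")}))
print(mapping_cardinality({(1, "x"), (1, "y")}))
print(mapping_cardinality({(1, "x"), (2, "x")}))
print(mapping_cardinality({(1, "x"), (1, "y"), (2, "x")}))
```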

Representing Cardinality Constraints in ER Diagram

▪ We express cardinality constraints by drawing either a directed line (→),
signifying “one,” or an undirected line (—), signifying “many,” between the
relationship set and the entity set.

▪ One-to-one relationship between an instructor and a student:
• A student is associated with at most one instructor via the relationship
advisor
• An instructor is associated with at most one student via advisor
• A student is associated with at most one department via stud_dept

One-to-Many Relationship

▪ one-to-many relationship between an instructor and a student
• an instructor is associated with several (including 0) students via
advisor
• a student is associated with at most one instructor via advisor

Many-to-One Relationships

▪ In a many-to-one relationship between an instructor and a student,
• an instructor is associated with at most one student via advisor,
• and a student is associated with several (including 0) instructors via
advisor

Many-to-Many Relationship

▪ An instructor is associated with several (possibly 0) students via advisor
▪ A student is associated with several (possibly 0) instructors via advisor

Total and Partial Participation

▪ Total participation (indicated by double line): every entity in the entity
set participates in at least one relationship in the relationship set
• Example: participation of student in advisor relation is total
▪ every student must have an associated instructor
▪ Partial participation: some entities may not participate in any
relationship in the relationship set
• Example: participation of instructor in advisor is partial

Notation for Expressing More Complex Constraints

▪ A line may have an associated minimum and maximum cardinality, shown
in the form l..h, where l is the minimum and h the maximum cardinality
• A minimum value of 1 indicates total participation.
• A maximum value of 1 indicates that the entity participates in at most
one relationship
• A maximum value of * indicates no limit.
▪ Example
• Instructor can advise 0 or more students. A student must have 1
advisor; cannot have multiple advisors
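
An l..h constraint can be checked against a sample instance by counting how often each entity appears in the relationship set. This is a sketch under assumptions: the helper `within_bounds`, the sample IDs, and the convention that `high=None` stands for * are all illustrative. A minimum of 1 corresponds to total participation.

```python
def within_bounds(entity_set, rel, position, low, high=None):
    """Check that each entity occurs between low and high times in rel.

    position is the index of the entity's component in each tuple;
    high=None plays the role of '*' (no upper limit).
    """
    counts = {e: 0 for e in entity_set}
    for t in rel:
        if t[position] in counts:
            counts[t[position]] += 1
    return all(c >= low and (high is None or c <= high) for c in counts.values())


students = {"44553", "00128"}
instructors = {"22222", "10101", "45565"}
advisor = {("44553", "22222"), ("00128", "10101")}  # (s_id, i_id) pairs

print(within_bounds(students, advisor, 0, 1, 1))   # student side: 1..1
print(within_bounds(instructors, advisor, 1, 0))   # instructor side: 0..*
```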

Cardinality Constraints on Ternary Relationship

▪ We allow at most one arrow out of a ternary (or greater degree)
relationship to indicate a cardinality constraint
▪ For example, an arrow from proj_guide to instructor indicates each
student has at most one guide for a project
▪ If there is more than one arrow, there are two ways of defining the
meaning.
• For example, a ternary relationship R between A, B and C with
arrows to B and C could mean
1. Each A entity is associated with a unique entity from B
and C or
2. Each pair of entities from (A, B) is associated with a
unique C entity, and each pair (A, C) is associated
with a unique B
• Each alternative has been used in different formalisms
• To avoid confusion we outlaw more than one arrow

Primary Key

▪ Primary keys provide a way to specify how entities and relations are
distinguished. We will consider:
• Entity sets
• Relationship sets.
• Weak entity sets

Primary key for Entity Sets

▪ By definition, individual entities are distinct.
▪ From a database perspective, the differences among them must be
expressed in terms of their attributes.
▪ The attribute values of an entity must be such that they can
uniquely identify the entity.
• No two entities in an entity set are allowed to have exactly the same
value for all attributes.
▪ A key for an entity is a set of attributes that suffice to distinguish entities
from each other

Primary Key for Relationship Sets

▪ To distinguish among the various relationships of a relationship set we use
the individual primary keys of the entities in the relationship set.
• Let R be a relationship set involving entity sets E1, E2, …, En
• The primary key for R consists of the union of the primary keys of
entity sets E1, E2, …, En
• If the relationship set R has attributes a1, a2, .., am associated with it,
then the primary key of R also includes the attributes a1, a2, .., am
▪ Example: relationship set “advisor”.
• The primary key consists of [Link] and [Link]
▪ The choice of the primary key for a relationship set depends on the
mapping cardinality of the relationship set.

Choice of Primary key for Binary Relationship

▪ Many-to-Many relationships. The preceding union of the primary keys is a
minimal superkey and is chosen as the primary key.
▪ One-to-Many relationships . The primary key of the “Many” side is a
minimal superkey and is used as the primary key.
▪ Many-to-one relationships. The primary key of the “Many” side is a minimal
superkey and is used as the primary key.
▪ One-to-one relationships. The primary key of either one of the participating
entity sets forms a minimal superkey, and either one can be chosen as the
primary key.
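
The four rules above amount to a small lookup. The sketch below assumes a binary relationship set from A to B whose participating primary keys are given as tuples of attribute names; the function name is hypothetical, and for the one-to-one case it arbitrarily picks A's key, since either choice is allowed.

```python
def relationship_primary_key(cardinality, pk_a, pk_b):
    """Choose a primary key for a binary relationship set from A to B."""
    if cardinality == "many-to-many":
        return pk_a + pk_b   # union of both primary keys
    if cardinality == "one-to-many":
        return pk_b          # B is the "many" side
    if cardinality == "many-to-one":
        return pk_a          # A is the "many" side
    if cardinality == "one-to-one":
        return pk_a          # either key works; A's is chosen arbitrarily here
    raise ValueError(f"unknown cardinality: {cardinality}")


# advisor viewed from instructor (A) to student (B)
print(relationship_primary_key("many-to-many", ("i_id",), ("s_id",)))
print(relationship_primary_key("one-to-many", ("i_id",), ("s_id",)))
```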

Weak Entity Sets

▪ Consider a section entity, which is uniquely identified by a course_id,
semester, year, and sec_id.
▪ Clearly, section entities are related to course entities. Suppose we create
a relationship set sec_course between entity sets section and course.
▪ Note that the information in sec_course is redundant, since section
already has an attribute course_id, which identifies the course with which
the section is related.
▪ One option to deal with this redundancy is to get rid of the relationship
sec_course; however, by doing so the relationship between section and
course becomes implicit in an attribute, which is not desirable.

Weak Entity Sets (Cont.)

▪ An alternative way to deal with this redundancy is to not store the attribute
course_id in the section entity and to only store the remaining attributes
sec_id, year, and semester.
• However, the entity set section then does not have enough attributes
to identify a particular section entity uniquely
▪ To deal with this problem, we treat the relationship sec_course as a
special relationship that provides extra information, in this case, the
course_id, required to identify section entities uniquely.
▪ A weak entity set is one whose existence is dependent on another entity,
called its identifying entity
▪ Instead of associating a primary key with a weak entity, we use the
identifying entity, along with extra attributes called the discriminator, to
uniquely identify a weak entity.

Weak Entity Sets (Cont.)

▪ An entity set that is not a weak entity set is termed a strong entity set.
▪ Every weak entity must be associated with an identifying entity; that is,
the weak entity set is said to be existence dependent on the identifying
entity set.
▪ The identifying entity set is said to own the weak entity set that it
identifies.
▪ The relationship associating the weak entity set with the identifying entity
set is called the identifying relationship.
▪ Note that the relational schema we eventually create from the entity set
section does have the attribute course_id, for reasons that will become
clear later, even though we have dropped the attribute course_id from
the entity set section.

Expressing Weak Entity Sets

▪ In E-R diagrams, a weak entity set is depicted via a double rectangle.
▪ We underline the discriminator of a weak entity set with a dashed line.
▪ The relationship set connecting the weak entity set to the identifying
strong entity set is depicted by a double diamond.
▪ Primary key for section – (course_id, sec_id, semester, year)

Redundant Attributes

▪ Suppose we have entity sets:
• student, with attributes: ID, name, tot_cred, dept_name
• department, with attributes: dept_name, building, budget
▪ We model the fact that each student has an associated department using
a relationship set stud_dept
▪ The attribute dept_name in student below replicates information present
in the relationship and is therefore redundant
• and needs to be removed.
▪ BUT: when converting back to tables, in some cases the attribute gets
reintroduced, as we will see later.

E-R Diagram for a University Enterprise

Reduction to Relation Schemas


▪ Entity sets and relationship sets can be expressed uniformly as relation
schemas that represent the contents of the database.
▪ A database which conforms to an E-R diagram can be represented by a
collection of schemas.
▪ For each entity set and relationship set there is a unique schema that is
assigned the name of the corresponding entity set or relationship set.
▪ Each schema has a number of columns (generally corresponding to
attributes), which have unique names.

Representing Entity Sets

▪ A strong entity set reduces to a schema with the same attributes

student(ID, name, tot_cred)

▪ A weak entity set becomes a table that includes a column for the primary
key of the identifying strong entity set
section ( course_id, sec_id, sem, year )
▪ Example
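
One possible rendering of these two rules as tables, run through Python's built-in sqlite3 module. The column types, the sample course row, and the helper `try_insert` are assumptions made for illustration. The section table carries course_id (the primary key of the identifying strong entity set) together with the discriminator attributes, and all of them form its primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE course (
    course_id TEXT PRIMARY KEY,
    title     TEXT,
    credits   INTEGER
);
-- Weak entity set: identifying key (course_id) plus discriminator.
CREATE TABLE section (
    course_id TEXT NOT NULL REFERENCES course(course_id),
    sec_id    TEXT NOT NULL,
    sem       TEXT NOT NULL,
    year      INTEGER NOT NULL,
    PRIMARY KEY (course_id, sec_id, sem, year)
);
""")
conn.execute("INSERT INTO course VALUES ('CS-101', 'Intro. to Computer Science', 4)")
conn.execute("INSERT INTO section VALUES ('CS-101', '1', 'Fall', 2017)")


def try_insert(row):
    """Return True if the section row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO section VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

A second section of the same course is accepted, while a repeated key or a section of a course that does not exist is rejected, which reflects the existence dependence of section on course.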

Representation of Entity Sets with Composite Attributes

▪ Composite attributes are flattened out by creating a
separate attribute for each component attribute
• Example: given entity set instructor with composite
attribute name with component attributes first_name
and last_name the schema corresponding to the
entity set has two attributes name_first_name and
name_last_name
▪ Prefix omitted if there is no ambiguity
(name_first_name could be first_name)
▪ Ignoring multivalued attributes, extended instructor
schema is
• instructor(ID,
first_name, middle_initial, last_name,
street_number, street_name,
apt_number, city, state, zip_code,
date_of_birth)
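
The flattening rule can be sketched as a short recursive function. The nested-dictionary representation (None for a simple attribute, a dict for a composite one) and the function name are assumptions for illustration; the sketch always keeps the prefix and does not apply the "prefix omitted if there is no ambiguity" shortcut.

```python
def flatten(attrs, prefix=""):
    """Flatten composite attributes into prefixed column names.

    attrs maps an attribute name to None (simple) or to a nested dict (composite).
    """
    columns = []
    for name, parts in attrs.items():
        full = f"{prefix}{name}"
        if parts is None:
            columns.append(full)
        else:
            columns.extend(flatten(parts, prefix=f"{full}_"))
    return columns


instructor = {
    "ID": None,
    "name": {"first_name": None, "middle_initial": None, "last_name": None},
    "date_of_birth": None,
}

print(flatten(instructor))
```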

Representation of Entity Sets with Multivalued Attributes

▪ A multivalued attribute M of an entity E is represented by a separate
schema E_M
▪ Schema E_M has attributes corresponding to the primary key of E and
an attribute corresponding to multivalued attribute M
▪ Example: Multivalued attribute phone_number of instructor is
represented by a schema:
inst_phone = (ID, phone_number)
▪ Each value of the multivalued attribute maps to a separate tuple of
the relation on schema E_M
• For example, an instructor entity with primary key 22222 and phone
numbers 456-7890 and 123-4567 maps to two tuples:
(22222, 456-7890) and (22222, 123-4567)
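
The mapping from one entity with a multivalued attribute to several tuples can be sketched as follows, assuming entities are plain dictionaries and the multivalued attribute is held as a list. The function name is hypothetical; the data is the example from the slide.

```python
def multivalued_to_tuples(entities, pk, attr):
    """Emit one (primary key, value) tuple per value of a multivalued attribute."""
    return [(e[pk], v) for e in entities for v in e[attr]]


instructors = [{"ID": "22222", "phone_number": ["456-7890", "123-4567"]}]

inst_phone = multivalued_to_tuples(instructors, "ID", "phone_number")
print(inst_phone)
```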

Representation of Entity Sets with Multivalued Attributes (Cont.)

• We create a primary key of the relation schema consisting of all attributes of the
schema. In the above example, the primary key consists of both attributes of the relation
inst_phone.
• In addition, we create a foreign-key constraint on the relation schema created from the
multivalued attribute, with the attribute generated from the primary key of the entity set
referencing the relation generated from the entity set.

• In the case that an entity set consists of only two attributes—a single primary key
attribute B and a single multivalued attribute M—the relation schema for the
entity set would contain only one attribute, namely the primary-key attribute B.
We can drop this relation, while retaining the relation schema with the attribute
B and attribute A that corresponds to M.

Representing Relationship Sets

▪ A many-to-many relationship set is represented as a schema with
attributes for the primary keys of the two participating entity sets, and
any descriptive attributes of the relationship set.
▪ Example: schema for relationship set advisor

advisor = (s_id, i_id)

Redundancy of Schemas

▪ Many-to-one and one-to-many relationship sets that are total on the
many-side can be represented by adding an extra attribute to the “many” side,
containing the primary key of the “one” side.

• Example: Instead of creating a schema for relationship set inst_dept, add
an attribute dept_name to the schema arising from entity set instructor. The
resulting instructor schema consists of the attributes {ID, name, dept_name,
salary}.

Redundancy of Schemas (Cont.)

▪ For one-to-one relationship sets, either side can be chosen to act as the
“many” side
• That is, an extra attribute can be added to either of the tables
corresponding to the two entity sets.

▪ We can combine schemas even if the participation is partial by using null
values. In the above example, if inst_dept were partial, then we would
store null values for the dept_name attribute for those instructors who
have no associated department.
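
A sketch of the combined schema using Python's sqlite3 module. The column types and all row values are illustrative assumptions, including the instructor stored without a department. The inst_dept relationship set has no table of its own: it is folded into instructor as the nullable foreign key dept_name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    dept_name TEXT PRIMARY KEY,
    building  TEXT,
    budget    NUMERIC
);
CREATE TABLE instructor (
    ID        TEXT PRIMARY KEY,
    name      TEXT,
    dept_name TEXT REFERENCES department(dept_name),  -- NULL when there is no department
    salary    NUMERIC
);
INSERT INTO department VALUES ('Physics', 'Watson', 70000);
INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics', 95000);
INSERT INTO instructor VALUES ('10101', 'Srinivasan', NULL, 65000);
""")

rows = conn.execute("SELECT ID, dept_name FROM instructor ORDER BY ID").fetchall()
print(rows)
```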

Redundancy of Schemas (Cont.)

▪ The schema corresponding to a relationship set linking a weak entity set
to its identifying strong entity set is redundant and does not need to be
present in the relational database design.
▪ Example: The section schema already contains the attributes that would
appear in the sec_course schema

Extended E-R Features

Specialization

▪ Top-down design process; we designate sub-groupings within an entity set
that are distinctive from other entities in the set.
▪ These sub-groupings become lower-level entity sets that have attributes or
participate in relationships that do not apply to the higher-level entity set.
▪ Depicted by a triangle component labeled ISA (e.g., instructor “is a”
person).
▪ Attribute inheritance – a lower-level entity set inherits all the attributes
and relationship participation of the higher-level entity set to which it is
linked.

Specialization Example
▪ Overlapping – employee and student
▪ Disjoint – instructor and secretary
▪ Total and partial

Representing Specialization via Schemas

▪ Method 1:
• Form a schema for the higher-level entity
• Form a schema for each lower-level entity set, include primary key
of higher-level entity set and local attributes

• Drawback: getting information about an employee requires
accessing two relations, the one corresponding to the low-level
schema and the one corresponding to the high-level schema
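
The drawback of Method 1 shows up as a join. The sketch below uses Python's sqlite3 module and assumes a higher-level person table with ID, name, street and city and a lower-level employee table with a salary attribute; the attribute list for employee and the sample row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Higher-level entity set.
CREATE TABLE person (
    ID     TEXT PRIMARY KEY,
    name   TEXT,
    street TEXT,
    city   TEXT
);
-- Lower-level entity set: primary key of person plus local attributes.
CREATE TABLE employee (
    ID     TEXT PRIMARY KEY REFERENCES person(ID),
    salary NUMERIC
);
INSERT INTO person VALUES ('p1', 'Kim', 'Main', 'Springfield');
INSERT INTO employee VALUES ('p1', 80000);
""")

# Full information about an employee needs both relations.
row = conn.execute("""
    SELECT p.name, p.city, e.salary
    FROM employee AS e JOIN person AS p ON p.ID = e.ID
""").fetchone()
print(row)
```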

Representing Specialization as Schemas (Cont.)

▪ Method 2:
• Form a schema for each entity set with all local and inherited
attributes

• Drawback: name, street and city may be stored redundantly for
people who are both students and employees

Generalization

▪ A bottom-up design process – combine a number of entity sets that
share the same features into a higher-level entity set.
▪ Specialization and generalization are simple inversions of each other;
they are represented in an E-R diagram in the same way.
▪ The terms specialization and generalization are used interchangeably.

Completeness constraint

▪ Completeness constraint -- specifies whether or not an entity in the
higher-level entity set must belong to at least one of the lower-level
entity sets within a generalization.
• Total specialization or generalization: an entity must belong to
one of the lower-level entity sets
• Partial specialization or generalization : an entity need not
belong to one of the lower-level entity sets

Completeness constraint (Cont.)

▪ Partial generalization is the default.
▪ We can specify total generalization in an ER diagram by adding the
keyword total in the diagram and drawing a dashed line from the
keyword to the corresponding hollow arrow-head to which it applies (for
a total generalization), or to the set of hollow arrow-heads to which it
applies (for an overlapping generalization).
▪ The student generalization is total: All student entities must be either
graduate or undergraduate. Because the higher-level entity set arrived
at through generalization is generally composed of only those entities
in the lower-level entity sets, the completeness constraint for a
generalized higher-level entity set is usually total

Aggregation

▪ Consider the ternary relationship proj_guide, which we saw earlier
▪ Suppose we want to record evaluations of a student by a guide on a
project

Aggregation (Cont.)

▪ Relationship sets eval_for and proj_guide represent overlapping
information
• Every eval_for relationship corresponds to a proj_guide relationship
• However, some proj_guide relationships may not correspond to any
eval_for relationships
▪ So we can’t discard the proj_guide relationship
▪ Eliminate this redundancy via aggregation
• Treat relationship as an abstract entity
• Allows relationships between relationships
• Abstraction of relationship into new entity

Aggregation (Cont.)

▪ Using aggregation, without introducing redundancy, the following diagram
represents:
• A student is guided by a particular instructor on a particular project
• A student, instructor, project combination may have an associated
evaluation

Reduction to Relational Schemas

▪ To represent aggregation, create a schema containing
• Primary key of the aggregated relationship,
• The primary key of the associated entity set
• Any descriptive attributes
▪ In our example:
• The schema eval_for is:
eval_for (s_ID, project_id, i_ID, evaluation_id)
• The schema proj_guide is redundant.

Design Issues

Common Mistakes in E-R Diagrams

▪ Example of erroneous E-R diagrams

Common Mistakes in E-R Diagrams (Cont.)

Entities vs. Attributes

▪ Use of entity sets vs. attributes

▪ Use of phone as an entity allows extra information about phone numbers
(plus multiple phone numbers)

Entities vs. Relationship sets

▪ Use of entity sets vs. relationship sets
• Possible guideline is to designate a relationship set to describe
an action that occurs between entities
▪ Placement of relationship attributes
• For example, attribute date as attribute of advisor or as attribute
of student

Binary Vs. Non-Binary Relationships

▪ Although it is possible to replace any non-binary (n-ary, for n > 2)
relationship set by a number of distinct binary relationship sets, an n-ary
relationship set shows more clearly that several entities participate in a
single relationship.
▪ Some relationships that appear to be non-binary may be better
represented using binary relationships
• For example, a ternary relationship parents, relating a child to
his/her father and mother, is best replaced by two binary
relationships, father and mother
▪ Using two binary relationships allows partial information (e.g.,
only mother being known)
• But there are some relationships that are naturally non-binary
▪ Example: proj_guide

Converting Non-Binary Relationships to Binary Form

▪ In general, any non-binary relationship can be represented using binary
relationships by creating an artificial entity set.
• Replace R between entity sets A, B and C by an entity set E, and three
relationship sets:
1. RA, relating E and A
2. RB, relating E and B
3. RC, relating E and C
• Create an identifying attribute for E and add any attributes of R to E
• For each relationship (ai, bi, ci) in R, create
1. a new entity ei in the entity set E
2. add (ei, ai) to RA
3. add (ei, bi) to RB
4. add (ei, ci) to RC
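
The conversion steps can be sketched in a few lines of Python, assuming the ternary relationship set R is given as a list of (a, b, c) tuples. The function name and the generated identifiers e1, e2, … (standing in for the artificial identifying attribute of E) are illustrative assumptions, as are the sample entity names.

```python
def to_binary(R):
    """Replace a ternary relationship set R by entity set E and binary sets RA, RB, RC."""
    E, RA, RB, RC = [], [], [], []
    for i, (a, b, c) in enumerate(R, start=1):
        e = f"e{i}"        # new entity with an artificial identifying value
        E.append(e)
        RA.append((e, a))
        RB.append((e, b))
        RC.append((e, c))
    return E, RA, RB, RC


# proj_guide-style tuples: (instructor, student, project)
R = [("i1", "s1", "p1"), ("i1", "s2", "p1")]
E, RA, RB, RC = to_binary(R)
print(E, RA, RB, RC)
```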

Converting Non-Binary Relationships (Cont.)

▪ Also need to translate constraints
• Translating all constraints may not be possible
• There may be instances in the translated schema that
cannot correspond to any instance of R
▪ Exercise: add constraints to the relationships RA, RB and RC to
ensure that a newly created entity corresponds to exactly one
entity in each of entity sets A, B and C
• We can avoid creating an identifying attribute by making E a weak
entity set (described shortly) identified by the three relationship sets

E-R Design Decisions

▪ The use of an attribute or entity set to represent an object.
▪ Whether a real-world concept is best expressed by an entity set or a
relationship set.
▪ The use of a ternary relationship versus a pair of binary relationships.
▪ The use of a strong or weak entity set.
▪ The use of specialization/generalization – contributes to modularity in the
design.
▪ The use of aggregation – can treat the aggregate entity set as a single
unit without concern for the details of its internal structure.

Summary of Symbols Used in E-R Notation

Symbols Used in E-R Notation (Cont.)

Alternative ER Notations

▪ Chen, IDEF1X, …

Alternative ER Notations

[Figure panels: Chen; IDEF1X (crow's feet notation)]

UML

▪ UML: Unified Modeling Language
▪ UML has many components to graphically model different aspects of an
entire software system
▪ UML class diagrams correspond to E-R diagrams, but with several
differences.

ER vs. UML Class Diagrams

* Note reversal of position in cardinality constraint depiction

ER vs. UML Class Diagrams
ER Diagram Notation Equivalent in UML

* Generalization can use merged or separate arrows independent
of disjoint/overlapping

UML Class Diagrams (Cont.)

▪ Binary relationship sets are represented in UML by just drawing a line
connecting the entity sets. The relationship set name is written adjacent
to the line.
▪ The role played by an entity set in a relationship set may also be
specified by writing the role name on the line, adjacent to the entity set.
▪ The relationship set name may alternatively be written in a box, along
with attributes of the relationship set, and the box is connected, using a
dotted line, to the line depicting the relationship set.

ER vs. UML Class Diagrams

Other Aspects of Database Design

▪ Functional Requirements
▪ Data Flow, Workflow
▪ Schema Evolution

End of Chapter 6
