DBMS_Unit_1

The document outlines the syllabus for a Database Management System course, detailing five units covering topics such as data models, relational database design, SQL, transaction processing, and normalization. It includes course learning outcomes, assessment methods, and references to textbooks and online resources. Additionally, it discusses the advantages of using a DBMS, historical development, and types of data models.

DATABASE MANAGEMENT SYSTEM

22ISG62, 6th Semester


Department of Information Science and Engineering
NMIT, Bangalore

Priyadarshini R,
Assistant Professor, NMIT.
Syllabus
UNIT -1- (8 Hrs.) Advantages of using DBMS approach; Data models, schemas and instances;
DBMS component modules; Three-schema architecture and data independence; Database
languages; Classification of Database Management systems; Using High-Level Conceptual Data
Models for Database Design; An Example Database Application; Entity Types, Entity Sets,
Attributes and Keys; Relationship types, Relationship Sets, Roles and Structural Constraints; Weak
Entity Types; ER Diagrams, Naming Conventions and Design Issues; Relationship types of degree
higher than two.
UNIT -2- (9 Hrs.) Relational database design using ER to relational mapping; Relational Model
Concepts; Relational Model Constraints and Relational Database Schemas; Update Operations,
Transactions and dealing with constraint violations; Unary Relational Operations: SELECT and
PROJECT; Relational Algebra Operations from Set Theory; Binary Relational Operations: JOIN
and DIVISION; Additional Relational Operations; Examples of Queries in Relational Algebra;

Syllabus
UNIT -3- (7 Hrs.) Informal Design Guidelines for Relation Schemas; Functional Dependencies:
Definition of FD, Inference rules for FD; Normal Forms Based on Primary Keys; General Definitions of
Second and Third Normal Forms; Boyce-Codd Normal Form; Multi-valued Dependencies and Fourth
Normal Form; Join Dependencies.
UNIT -4- (8 Hrs.) SQL Data Definition and Data Types; Specifying basic constraints in SQL; Schema
change statements in SQL; Basic queries in SQL; More complex SQL Queries; Insert, Delete and Update
statements in SQL; Specifying constraints as Assertion and Trigger; Views (Virtual Tables) in SQL.
UNIT -5- (7 Hrs.) Introduction to transaction processing: Transaction and system concepts; Desirable
Properties of transactions; Transactions and Schedules; Characterizing schedules based on recoverability;
Characterizing schedules based on Serializability; Concurrency Control Techniques: 2PL techniques for
concurrency control;

COURSE LEARNING OUTCOMES

1. Describe the fundamentals of relational database concepts.


2. Design an ER diagram for the given requirement specification.
3. Apply normalization concepts to eliminate anomalies and achieve a consistent database.
4. Design relational algebra and SQL queries for the given schema.
5. Describe the properties of database transactions and concurrency control techniques.
6. Develop a relational database application.

Text Book and Reference Books

1. Fundamentals of Database Systems, Elmasri and Navathe, McGraw-Hill, 7th Edition, 2017 (Units: All).
2. Database, Raghu, McGraw-Hill, 3rd Edition, 2003 (Units: All).
3. Database System Concepts, Silberschatz, Korth and Sudarshan, McGraw-Hill, 7th Edition, 2019.

ONLINE RESOURCES (Links to MOOCS, NPTEL, MIT COURSEWARE etc): Database Management System - Course (nptel.ac.in)

COURSE ASSESSMENT METHOD:

● LA1 - Quiz (10 Marks)


● LA2 - Infosys Springboard course (10 Marks)
● Two MSEs for 50 Marks
● A final examination of 100 Marks will be conducted and scaled down to 50 Marks.

Unit 1
Advantages of using DBMS approach; Data models, schemas and instances;
DBMS component modules; Three-schema architecture and data
independence; Database languages; Classification of Database Management
systems; Using High-Level Conceptual Data Models for Database Design; An
Example Database Application; Entity Types, Entity Sets, Attributes and
Keys; Relationship types, Relationship Sets, Roles and Structural
Constraints; Weak Entity Types; ER Diagrams, Naming Conventions and
Design Issues; Relationship types of degree higher than two.

BASIC DEFINITIONS
● Data:
○ Known facts that can be recorded and have an implicit meaning.
● Database:
○ A collection of related data.
● Mini-world:
○ Some part of the real world about which data is stored in a database. For
example, student grades and transcripts at a university.
● Database Management System (DBMS):
○ A software package/ system to facilitate the creation and maintenance of a
computerized database.
● Database System:
○ The DBMS software together with the data itself. Sometimes, the applications
are also included.
SIMPLIFIED DATABASE SYSTEM ENVIRONMENT
TYPICAL DBMS FUNCTIONALITY
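● Defining a database: specifying its data types, structures, and constraints
● Constructing the database: loading the initial contents onto a storage medium
● Manipulating the database:
○ Retrieval: querying and generating reports
○ Modification: insertions, deletions, and updates
○ Accessing the database through Web applications
● Processing and sharing by a set of concurrent users and application programs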


TYPICAL DBMS FUNCTIONALITY
Other features:

● Protection or Security measures to prevent unauthorized access


● “Active” processing to take internal actions on data
● Presentation and Visualization of data
● Maintaining the database and associated programs over the lifetime of the
database application (called database, software, and system maintenance)
EXAMPLE OF A DATABASE
(WITH A CONCEPTUAL DATA MODEL)
● Mini-world for the example:
○ Part of a UNIVERSITY environment.
● Some mini-world entities:
○ STUDENTs
○ COURSEs
○ SECTIONs (of COURSEs)
○ (academic) DEPARTMENTs
○ INSTRUCTORs
EXAMPLE OF A DATABASE
(WITH A CONCEPTUAL DATA MODEL)
● Some mini-world relationships:

○ SECTIONs are of specific COURSEs

○ STUDENTs take SECTIONs

○ COURSEs have prerequisite COURSEs

○ INSTRUCTORs teach SECTIONs

○ COURSEs are offered by DEPARTMENTs

○ STUDENTs major in DEPARTMENTs


EXAMPLE OF A SIMPLE DATABASE
MAIN CHARACTERISTICS OF THE DATABASE
APPROACH
● Self-describing nature of a database system:
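○ A DBMS catalog stores the description of a particular database (its data structures,
types, and constraints).
○ This description is called meta-data.
○ This allows the DBMS software to work with different database applications.
● Insulation between programs and data:
○ Called program-data independence.
○ Allows data structures and storage organization to change without changing the
programs that access the data.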
EXAMPLE OF A SIMPLIFIED DATABASE CATALOG
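As a rough illustration of a self-describing database, the sketch below uses Python's standard-library sqlite3 module. SQLite keeps its catalog in a built-in table called sqlite_master, so the description of the data can be queried like ordinary data. The student table and its columns are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, used only to have something in the catalog.
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# The catalog is itself queryable: sqlite_master holds one row per schema object.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns the column meta-data recorded for one table;
# index 1 of each returned row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

print(catalog_rows)
print(columns)
```

Other DBMSs expose their catalogs differently, so the table and pragma names here apply to SQLite only.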
MAIN CHARACTERISTICS OF THE DATABASE
APPROACH (CONTINUED)
● Data Abstraction:
○ A data model is used to hide storage details and present the users with
a conceptual view of the database.
○ Programs refer to the data model constructs rather than data storage
details.
● Support of multiple views of the data:
○ Each user may see a different view of the database, which describes
only the data of interest to that user.
MAIN CHARACTERISTICS OF THE DATABASE
APPROACH (CONTINUED)
Sharing of data and multi-user transaction processing:

● Allowing a set of concurrent users to retrieve from and to update the database.
● Concurrency control
● Recovery
● OLTP (Online Transaction Processing)
DATABASE USERS
Users may be divided into
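● Actors on the Scene: those who use and control the database content, e.g. database
administrators and end users
● Workers Behind the Scene: those who design, develop, and maintain the DBMS software and
system environment, e.g. system designers and developers, maintenance personnel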
CATEGORIES OF END-USERS
ADVANTAGES OF USING THE DATABASE
APPROACH
● Data Redundancy Control - Centralizing data storage
● Data Integrity - Ensuring that data is accurate, consistent, and reliable
● Data Security - Access control and user authentication
● Concurrent Access - Access and manipulate data simultaneously
● Backup and Recovery
● Efficient Data Access - Optimized query processing techniques
● Data Independence - Changes in the storage structure do not affect the application
● Centralized Data Management
ADVANTAGES OF USING THE DATABASE
APPROACH (CONTINUED)

● Reduced Application Development Time


● Improved Decision Making
● Data Sharing
● Scalability
ADDITIONAL IMPLICATIONS OF USING THE
DATABASE APPROACH (CONTINUED)
● Flexibility to change data structures:
○ Database structure may evolve as new requirements are defined.
● Availability of current information:
○ Extremely important for online transaction systems such as airline, hotel, car
reservations.
● Economies of scale:
○ Wasteful overlap of resources and personnel can be avoided by consolidating data
and applications across departments.
● Potential for enforcing standards
HISTORICAL DEVELOPMENT OF DATABASE
TECHNOLOGY
● Early Database Applications:
○ The Hierarchical and Network Models were introduced in the mid-1960s and dominated during
the seventies.
○ The bulk of the worldwide database processing still occurs using these models, particularly
the hierarchical model.
● Relational Model based Systems:
○ The relational model was originally introduced in 1970 and was heavily researched and
experimented with at IBM Research and several universities.
○ Relational DBMS products emerged in the early 1980s.
HISTORICAL DEVELOPMENT OF DATABASE
TECHNOLOGY (CONTINUED)
● Object-oriented and emerging applications:
○ Object-Oriented Database Management Systems (OODBMSs) were introduced in
late 1980s and early 1990s to cater to the need of complex data processing in CAD
and other applications.
■ Their use has not taken off much.
○ Many relational DBMSs have incorporated object database concepts, leading to a
new category called object-relational DBMSs (ORDBMSs)
○ Extended relational systems add further capabilities (e.g. for multimedia data, XML,
and other data types)
HISTORICAL DEVELOPMENT OF DATABASE
TECHNOLOGY (CONTINUED)
● Data on the Web and E-commerce Applications:

○ Web contains data in HTML (Hypertext markup language) with links among pages.

○ This has given rise to a new set of applications, and E-commerce is using new
standards like XML (eXtensible Markup Language). (see Ch. 27).

○ Script programming languages such as PHP and JavaScript allow generation of
dynamic Web pages that are partially generated from a database (see Ch. 26).
They also allow database updates through Web pages.
EXTENDING DATABASE CAPABILITIES
New functionality is being added to DBMSs in the following areas:
● Scientific Applications
● XML (eXtensible Markup Language)
● Image Storage and Management
● Audio and Video Data Management
● Data Warehousing and Data Mining
● Spatial Data Management
● Time Series and Historical Data Management
● The above gives rise to new research and development in incorporating new data types,
complex data structures, new operations and storage and indexing schemes in database
systems.
WHEN NOT TO USE A DBMS
Main inhibitors (costs) of using a DBMS:
● High initial investment and possible need for additional hardware.
● Overhead for providing generality, security, concurrency control, recovery, and integrity
functions.
When a DBMS may be unnecessary:
● If the database and applications are simple, well defined, and not expected to change.
● If there are stringent real-time requirements that may not be met because of DBMS
overhead.
● If access to data by multiple users is not required.
When no DBMS may suffice:
● If the database system is not able to handle the complexity of data because of modeling
limitations
● If the database users need special operations not supported by the DBMS.
SUMMARY
● Types of Databases and Database Applications
● Basic Definitions
● Typical DBMS Functionality
● Example of a Database (UNIVERSITY)
● Main Characteristics of the Database Approach
● Database Users
● Advantages of Using the Database Approach
● When Not to Use Databases
Data Models in DBMS
● Definition: A data model is a conceptual framework used to describe the structure,
relationships, and constraints of data within a database.
● Purpose: Helps to define how data is stored, accessed, and manipulated within the
database.

Types of Data Models
Key Types:

● Hierarchical Model
● Network Model
● Relational Model
● Object-Oriented Model
● Entity-Relationship Model (ER Model)

Hierarchical Data Model
Data is organized in a tree-like structure, with a single root and child nodes.
○ Parent-child relationship
○ Each child can have only one parent
○ Example: XML data representation
● Pros:
○ Simple to understand and use
○ Fast access for hierarchical queries
● Cons:
○ Limited flexibility for complex relationships
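The single-parent rule can be sketched in a few lines of Python. The node names below are hypothetical, and a plain dictionary stands in for the stored hierarchy. Because each node maps to exactly one parent, there is only one path from any node up to the root.

```python
# Hypothetical hierarchy: each node records exactly one parent (the root has None).
parent_of = {
    "University": None,
    "Engineering": "University",
    "Science": "University",
    "ISE": "Engineering",
}

def path_to_root(node):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path

ise_path = path_to_root("ISE")
print(ise_path)
```

A node that needed two parents could not be stored in this mapping, which is the limited flexibility listed under the cons.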
Network Data Model
Data is represented using a graph, where entities are nodes and relationships are edges

○ Supports more complex relationships (many-to-many)


○ Entities can have multiple parent entities
● Pros:
○ Flexible and allows complex relationships
● Cons:
○ Complex to design and maintain

Relational Data Model
Data is organized in tables (relations), where each table has rows (tuples) and
columns (attributes).
● Key Points:
○ Each table represents an entity type
○ Uses Primary Keys to uniquely identify rows
○ Foreign Keys represent relationships between tables
● Pros:
○ Simple and intuitive
○ Easily scalable and flexible
○ Supports SQL for querying
● Cons:
○ Can become inefficient with very large datasets
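The key points above can be tried out with Python's standard-library sqlite3 module. This is a minimal sketch: the department and student tables and their rows are assumptions for illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,                      -- primary key: identifies a row
    name       TEXT NOT NULL,
    dept_id    INTEGER REFERENCES department(dept_id)    -- foreign key: the relationship
);
INSERT INTO department VALUES (1, 'Computer Science');
INSERT INTO student VALUES (101, 'John', 1);
""")

# The foreign key lets the two tables be combined with a join.
joined = conn.execute(
    "SELECT s.name, d.dept_name FROM student s "
    "JOIN department d ON s.dept_id = d.dept_id"
).fetchall()

# A student row pointing at a department that does not exist is refused.
try:
    conn.execute("INSERT INTO student VALUES (102, 'Jane', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(joined, fk_rejected)
```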

Object-Oriented Data Model
Data is represented as objects, similar to object-oriented programming.

○ Supports classes, inheritance, and polymorphism


○ Objects encapsulate both data and behavior
● Pros:
○ Better suited for complex data types and relationships
○ Allows for data reusability
● Cons:
○ Complex to implement and manage
○ Performance can be impacted with large datasets

Entity-Relationship (ER) Model
The ER model is a high-level conceptual framework for database design, focusing
on entities and their relationships.
○ Entities represent objects or concepts (e.g., Student, Course)
○ Relationships represent how entities are related (e.g., enrolls in)
○ Attributes define properties of entities (e.g., name, age)
● Pros:
○ Easy to understand and visually represent
○ Ideal for database design
● Cons:
○ Does not directly implement database functionality

ERD:

SCHEMAS VERSUS INSTANCES
A schema is the logical structure or design of a database.
It defines how the data is organized, including tables, relationships, and constraints,
without holding any actual data.
○ Represents the blueprint or framework for the database.
○ Describes tables, fields, relationships, and constraints (primary keys, foreign
keys).
○ Typically does not change frequently.

● Example: The schema might define a "Student" table with fields like Student_ID,
Name, DOB, and Grade.
SCHEMAS VERSUS INSTANCES
Types of Schemas

Physical Schema:

○ Describes the physical storage of data in the database, including how data is stored on the
hardware (e.g., storage devices, indexing, file structures).
○ Concerned with performance optimization and storage allocation.

Logical Schema:

○ Describes the logical view of the entire database, focusing on what data is stored and how it
is organized (tables, relationships, constraints).
○ It does not consider physical storage.

View Schema:

○ Represents the user-specific view of the database.


○ Virtual schemas that define how data is presented or accessed, often based on queries.
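A view schema can be sketched with an SQL view, here run through Python's standard-library sqlite3 module. The student table, its sample row, and the view name student_directory are hypothetical. The view stores no data of its own; it is a named query that exposes only the columns one group of users needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, dob TEXT, grade TEXT);
INSERT INTO student VALUES (101, 'John', '2004-01-15', 'A');

-- User-specific view: hides dob and grade from users who only need a directory.
CREATE VIEW student_directory AS SELECT student_id, name FROM student;
""")

view_rows = conn.execute("SELECT * FROM student_directory").fetchall()
print(view_rows)
```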
Instance

An instance refers to the actual data stored in the database at any given moment.
It is a snapshot of the data that adheres to the schema structure.

○ Represents the current state of data within the database.


○ Changes dynamically as data is inserted, updated, or deleted.
○ Data values (actual rows in tables) make up the database instance.

Relationship Between Schema and
Instance

● Schema defines the structure and organization of the database.


● Instance is the current data stored in the database, conforming to the
schema.
● Physical Schema: Focuses on how data is stored physically.
● Logical Schema: Describes how data is organized logically.
● View Schema: Provides user-specific views of the data.
● Schema is also called intension.
● State is also called extension.

Schema vs Instance – Example

Schema Example:

● Student Table:
○ Columns: Student_ID, Name, Age, Department
○ Constraints: Student_ID is a primary key.

Instance Example:

● Student Table Instance:


○ Row 1: Student_ID = 101, Name = John, Age = 20, Department = Computer Science
○ Row 2: Student_ID = 102, Name = Jane, Age = 22, Department = Electrical Engineering
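The same example can be reproduced as a short sketch with Python's standard-library sqlite3 module. The column types are assumptions, since the slide gives only column names. The CREATE TABLE statement is the schema; the rows returned by SELECT at a given moment are the instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: structure and constraints only, no data yet.
conn.execute("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT,
    age        INTEGER,
    department TEXT
)""")
empty_instance = conn.execute("SELECT * FROM student").fetchall()

# Instance: the rows present at this moment; it changes with every insert.
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    (101, "John", 20, "Computer Science"),
    (102, "Jane", 22, "Electrical Engineering"),
])
current_instance = conn.execute(
    "SELECT * FROM student ORDER BY student_id"
).fetchall()

print(empty_instance)
print(current_instance)
```

The schema is identical before and after the inserts; only the instance differs.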

DATABASE SCHEMA VS. DATABASE STATE
Database State:
● Refers to the content of a database at a moment in time.
Initial Database State:
● Refers to the database state when it is initially loaded into the system.
Valid State:
● A state that satisfies the structure and constraints of the database.
Distinction:
● The database schema does not change frequently.
● The database state changes every time the database is updated.
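One way to see what a valid state means is to attempt an update that would break a constraint. In the sketch below (Python's standard-library sqlite3 module, with a hypothetical student table), the DBMS refuses a second row with the same primary key, so the database stays in its previous valid state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO student VALUES (101, 'John')")

# This insert would produce an invalid state (duplicate key), so it is rejected.
try:
    conn.execute("INSERT INTO student VALUES (101, 'Jane')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The state is unchanged by the failed update.
state = conn.execute("SELECT * FROM student").fetchall()
print(rejected, state)
```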
EXAMPLE OF A DATABASE SCHEMA
EXAMPLE OF A DATABASE STATE
THREE-SCHEMA ARCHITECTURE
● Proposed to support DBMS characteristics of:
○ Program-data independence.
○ Support of multiple views of the data.
● Not explicitly used in commercial DBMS products, but has been useful in
explaining database system organization
THREE-SCHEMA ARCHITECTURE
Defines DBMS schemas at three levels:
● Internal schema at the internal level to describe physical storage structures and access
paths (e.g., indexes).
○ Typically uses a physical data model.
● Conceptual schema at the conceptual level to describe the structure and constraints for
the whole database for a community of users.
○ Uses a conceptual or an implementation data model.
● External schemas at the external level to describe the various user views.
○ Usually uses the same data model as the conceptual schema.
THE THREE-SCHEMA ARCHITECTURE
THREE-SCHEMA ARCHITECTURE
● Mappings among schema levels are needed to transform requests and
data.
○ Programs refer to an external schema, and are mapped by the
DBMS to the internal schema for execution.
○ Data extracted from the internal DBMS level is reformatted to
match the user’s external view (e.g. formatting the results of an
SQL query for display in a Web page)
DATA INDEPENDENCE
● Logical Data Independence:
○ The capacity to change the conceptual schema without having to change the
external schemas and their associated application programs.
● Physical Data Independence:
○ The capacity to change the internal schema without having to change the
conceptual schema.
○ For example, the internal schema may be changed when certain file
structures are reorganized or new indexes are created to improve database
performance
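Physical data independence can be observed on a small scale. In this sketch (Python's standard-library sqlite3 module; the table, rows, and index name are assumptions), an index is added after the query has been written. The query text and its result stay the same, and only the access path available to the DBMS changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    (101, "John", "Computer Science"),
    (102, "Jane", "Electrical Engineering"),
])

# The application's query, written against the conceptual schema.
query = "SELECT name FROM student WHERE department = 'Computer Science'"
before = conn.execute(query).fetchall()

# Internal-level change: a new index. The query above is not edited.
conn.execute("CREATE INDEX idx_student_department ON student (department)")
after = conn.execute(query).fetchall()

print(before, after)
```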
DATA INDEPENDENCE (CONTINUED)
● When a schema at a lower level is changed, only the mappings between this schema
and higher-level schemas need to be changed in a DBMS that fully supports data
independence.
● The higher-level schemas themselves are unchanged.
● Hence, the application programs need not be changed since they refer to the external
schemas.
DBMS LANGUAGES
● Data Definition Language (DDL): CREATE, ALTER, DROP, TRUNCATE, RENAME
● Data Manipulation Language (DML): INSERT, UPDATE, DELETE, SELECT


○ High-Level or Non-procedural Languages: These include the relational
language SQL
■ May be used in a standalone way or may be embedded in a programming
language
○ Low Level or Procedural Languages:
■ These must be embedded in a programming language
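The split between DDL and DML can be seen in a short session. The sketch below uses Python's standard-library sqlite3 module with a hypothetical course table. Note that SQLite has no TRUNCATE statement, so that command is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that define or change the schema.
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")

# DML: statements that change or read the stored data.
conn.execute("INSERT INTO course VALUES (1, 'DBMS', 3)")
conn.execute("INSERT INTO course VALUES (2, 'Networks', 3)")
conn.execute("UPDATE course SET credits = 4 WHERE course_id = 1")
conn.execute("DELETE FROM course WHERE course_id = 2")
rows = conn.execute("SELECT course_id, title, credits FROM course").fetchall()

print(rows)
```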
DBMS LANGUAGES
● Data Definition Language (DDL):
○ Used by the DBA and database designers to specify the conceptual
schema of a database.
○ In many DBMSs, the DDL is also used to define internal and external
schemas (views).
○ In some DBMSs, separate storage definition language (SDL) and view
definition language (VDL) are used to define internal and external
schemas.
■ SDL is typically realized via DBMS commands provided to the DBA
and database designers
DBMS LANGUAGES
● Data Manipulation Language (DML):
○ Used to specify database retrievals and updates
○ DML commands (data sublanguage) can be embedded in a general-purpose
programming language (host language), such as COBOL, C, C++, or Java.
■ A library of functions can also be provided to access the DBMS from a
programming language
○ Alternatively, stand-alone DML commands can be applied directly (called a
query language).
TYPES OF DML
● High Level or Non-procedural Language:
○ For example, the SQL relational language
○ Are “set”-oriented and specify what data to retrieve rather than how to retrieve
it.
○ Also called declarative languages.
● Low Level or Procedural Language:
○ Retrieve data one record-at-a-time;
○ Constructs such as looping are needed to retrieve multiple records, along with
positioning pointers.
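The two styles can be compared side by side. In this sketch (Python's standard-library sqlite3 module, hypothetical student rows), the first version states what is wanted in one set-oriented SQL query. The second imitates a record-at-a-time language by fetching one record per step and testing it in a loop. SQLite itself is a declarative SQL system, so the loop is only an imitation of the procedural style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "John", 20), (102, "Jane", 22), (103, "Ravi", 19)],
)

# Declarative: say WHAT to retrieve; the DBMS decides how.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE age >= 20 ORDER BY student_id"
)]

# Record-at-a-time: the program loops, advances the cursor, and filters itself.
procedural = []
cursor = conn.execute("SELECT name, age FROM student ORDER BY student_id")
while True:
    record = cursor.fetchone()   # move to the next record
    if record is None:           # no more records
        break
    if record[1] >= 20:
        procedural.append(record[0])

print(declarative, procedural)
```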
DBMS INTERFACES
● Stand-alone query language interfaces
○ Example: Entering SQL queries at the DBMS interactive SQL interface (e.g.
SQL*Plus in ORACLE)
● Programmer interfaces for embedding DML in programming languages
● User-friendly interfaces
○ Menu-based, forms-based, graphics-based, etc.
DBMS PROGRAMMING LANGUAGE
INTERFACES
● Programmer interfaces for embedding DML in programming
languages:
○ Embedded Approach: e.g. embedded SQL (for C, C++, etc.),
SQLJ (for Java)
○ Procedure Call Approach: e.g. JDBC for Java, ODBC for other
programming languages
○ Database Programming Language Approach: e.g. ORACLE has
PL/SQL, a programming language based on SQL; language
incorporates SQL and its data types as integral components
USER-FRIENDLY DBMS INTERFACES
● Menu-based, popular for browsing on the web
● Forms-based, designed for naïve users
● Graphics-based
○ (Point and Click, Drag and Drop, etc.)
● Natural language: requests in written English
● Combinations of the above:
○ For example, both menus and forms used extensively in Web
database interfaces
OTHER DBMS INTERFACES
● Speech as Input and Output
● Web Browser as an interface
● Parametric interfaces, e.g., bank tellers using function keys.
● Interfaces for the DBA:
○ Creating user accounts, granting authorizations
○ Setting system parameters
○ Changing schemas or access paths
TYPICAL DBMS COMPONENT MODULES
DATABASE SYSTEM UTILITIES
● To perform certain functions such as:
○ Loading data stored in files into a database, including tools that reformat the data.
○ Backing up the database periodically so that it can be restored after a failure.
○ Reorganizing database file structures to improve performance.
○ Report generation utilities.
○ Performance monitoring utilities that collect usage statistics.
○ Other functions, such as sorting, user monitoring, data
compression, etc.
OTHER TOOLS
● Data dictionary / repository:
○ Used to store schema descriptions and other information such as design decisions,
application program descriptions, user information, usage standards, etc.
○ Active data dictionary is accessed by DBMS software and users/DBA.
○ Passive data dictionary is accessed by users/DBA only.
● Application Development Environments and CASE (computer-aided software engineering)
tools: used for database design.
● Examples:
○ PowerBuilder (Sybase)
○ JBuilder (Borland)
○ JDeveloper 10G (Oracle)
CLIENTS
● Provide appropriate interfaces through a client software module to
access and utilize the various server resources.
● Clients may be diskless machines or PCs or Workstations with disks
with only the client software installed.
● Connected to the servers via some form of a network.
○ (LAN: local area network, wireless network, etc.)
DBMS SERVER
● Provides database query and transaction services to the clients
● Relational DBMS servers are often called SQL servers, query servers, or transaction
servers
● Applications running on clients utilize an Application Program Interface (API) to access
server databases via standard interface such as:
○ ODBC: Open Database Connectivity standard
○ JDBC: for Java programming access
● Client and server must install appropriate client module and server module software for
ODBC or JDBC
CENTRALIZED AND
CLIENT-SERVER DBMS ARCHITECTURES

● Centralized DBMS:
○ Combines everything into a single system, including DBMS
software, hardware, application programs, and user interface
processing software.
○ User can still connect through a remote terminal – however, all
processing is done at centralized site.
A PHYSICAL CENTRALIZED ARCHITECTURE
BASIC 2-TIER CLIENT-SERVER ARCHITECTURES
● Specialized Servers with Specialized functions
○ Print server
○ File server
○ DBMS server
○ Web server
○ Email server
● Clients can access the specialized servers as needed
LOGICAL TWO-TIER CLIENT SERVER
ARCHITECTURE
TWO TIER CLIENT-SERVER ARCHITECTURE
● A client program may connect to several DBMSs, sometimes called the
data sources.
● In general, data sources can be files or other non-DBMS software that
manages data.
● Other variations of clients are possible: e.g., in some object DBMSs,
more functionality is transferred to clients including data dictionary
functions, optimization and recovery across multiple servers, etc.
THREE TIER CLIENT-SERVER ARCHITECTURE
● Common for Web applications
● Intermediate Layer called Application Server or Web Server:
○ Stores the web connectivity software and the business logic part of
the application used to access the corresponding data from the
database server
○ Acts like a conduit for sending partially processed data between the
database server and the client.
● Three-tier Architecture Can Enhance Security:
○ Database server only accessible via middle tier
○ Clients cannot directly access database server
THREE-TIER CLIENT-SERVER ARCHITECTURE
CLASSIFICATION OF DBMSS
● Based on the data model used
○ Traditional: Relational, Network, Hierarchical.
○ Emerging: Object-oriented, Object-relational.
● Other classifications
○ Single-user (typically used with personal computers)
vs. multi-user (most DBMSs).
○ Centralized (uses a single computer with one database)
vs. distributed (uses multiple computers, multiple databases)
VARIATIONS OF DISTRIBUTED DBMS (DDBMS)

● Homogeneous DDBMS: identical DBMS software at all sites.
● Heterogeneous DDBMS: different DBMS software at different sites.
● Federated or Multidatabase Systems
● Distributed database systems are now commonly known as
client-server based database systems
COST CONSIDERATIONS FOR DBMSS
● Cost Range: from free open-source systems to configurations costing millions of
dollars
● Examples of free relational DBMSs: MySQL, PostgreSQL, others
● Commercial DBMS offer additional specialized modules, e.g. time-series module,
spatial data module, document module, XML module
○ These offer additional specialized functionality when purchased separately
○ Sometimes called cartridges (e.g., in Oracle) or blades
● Different licensing options: site license, maximum number of concurrent users (seat
license), single user, etc.
OVERVIEW OF DATABASE DESIGN PROCESS
● Two main activities:
○ Database design
○ Applications design
● Database design
○ To design the conceptual schema for a database application
● Applications design focuses on the programs and interfaces that access the database
○ Generally considered part of software engineering
OVERVIEW OF DATABASE DESIGN
PROCESS
EXAMPLE COMPANY DATABASE
● We need to create a database schema design based on the following (simplified)
requirements of the COMPANY Database:
○ The company is organized into DEPARTMENTs.
○ Each department has a name, number and an employee who manages the
department.
○ We keep track of the start date of the department manager. A department may
have several locations.
○ Each department controls a number of PROJECTs. Each project has a unique
name, unique number and is located at a single location.
EXAMPLE COMPANY DATABASE (CONTD.)
● We store each EMPLOYEE’s social security number, address, salary, gender, and
birthdate.
○ Each employee works for one department but may work on several projects.
○ We keep track of the number of hours per week that an employee currently
works on each project.
○ We also keep track of the direct supervisor of each employee.
● Each employee may have a number of DEPENDENTs.
○ For each dependent, we keep track of their name, gender, birthdate, and
relationship to the employee.
ER MODEL CONCEPTS
● Entities and Attributes
○ Entities are specific objects or things in the mini-world that are represented in the database.
■ For example the EMPLOYEE John Smith, the Research DEPARTMENT, the ProductX
PROJECT
○ Attributes are properties used to describe an entity.
■ For example an EMPLOYEE entity may have the attributes Name, SSN, Address,
gender, BirthDate
○ A specific entity will have a value for each of its attributes.
■ For example a specific employee entity may have Name='John Smith',
SSN='123456789', Address ='731, Fondren, Houston, TX', gender='M',
BirthDate='09-JAN-55'
○ Each attribute has a value set (or data type) associated with it – e.g. integer, string, subrange,
enumerated type.
Types of Entity
1. Strong
2. Weak
3. Associative
4. Generalized
5. Specialized

Attributes

Types of attribute:
1. Simple Attribute: A simple attribute is an attribute that cannot be divided into smaller sub-parts.
It's atomic and represents a single value.
Example: Age, Gender, Date of Birth, Student_ID (these cannot be divided further
into meaningful subparts).
2. Composite Attribute: A composite attribute can be divided into smaller sub-attributes,
each of which has its own meaning.
Example: Name is a composite attribute because it can be divided into:
○ First Name
○ Last Name
○ Middle Name (if applicable)

Contd.,

3. Derived Attribute: Can be derived or calculated from other attributes.
Example: Age (from Date of Birth)
4. Multivalued Attribute: Can have multiple values.
Example: Phone Numbers (can have more than one)
5. Key Attribute: Uniquely identifies an entity.
Example: Student_ID, Employee_ID
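The five attribute types can be sketched on one entity using Python dataclasses. This is an illustration only: the class names, field names, and sample values are assumptions, and an ER model does not prescribe any programming-language representation.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Name:                      # composite attribute: has meaningful sub-parts
    first: str
    last: str
    middle: str = ""

@dataclass
class Student:
    student_id: int              # key attribute: unique per student
    name: Name                   # composite attribute
    date_of_birth: date          # simple attribute: treated as one atomic value
    phone_numbers: list[str] = field(default_factory=list)  # multivalued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

s = Student(101, Name("John", "Smith"), date(2004, 6, 15), ["555-0100", "555-0101"])
age = s.age_on(date(2025, 1, 1))
print(age, len(s.phone_numbers))
```

Passing the reference date to age_on keeps the derived value reproducible; a real application would more likely use the current date.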

TYPES OF ATTRIBUTES (1)
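● Simple
○ Each entity has a single atomic value for the attribute. For example, SSN or gender.
● Derived
○ The value can be calculated from other attributes. For example, Age from Date of Birth.
● Key
○ The value uniquely identifies an entity. For example, Student_ID.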
● Composite
○ The attribute may be composed of several components. For example:
■ Address(Apt#, House#, Street, City, State, ZipCode, Country), or
■ Name(FirstName, MiddleName, LastName).
■ Composition may form a hierarchy where some components are themselves composite.
● Multi-valued
○ An entity may have multiple values for that attribute. For example, Color of a CAR or
PreviousDegrees of a STUDENT.
■ Denoted as {Color} or {PreviousDegrees}
TYPES OF ATTRIBUTES (2)
● In general, composite and multi-valued attributes may be nested arbitrarily to any
number of levels, although this is rare.
○ For example, PreviousDegrees of a STUDENT is a composite multi-valued
attribute denoted by {PreviousDegrees (College, Year, Degree, Field)}
○ Multiple PreviousDegrees values can exist
○ Each has four subcomponent attributes:
■ College, Year, Degree, Field
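The notation {PreviousDegrees (College, Year, Degree, Field)} reads as "a set of values, each of which is a composite". A hedged Python sketch of that nesting is shown below; the class names and sample values are hypothetical, and field_of_study stands in for the Field component.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class PreviousDegree:            # the composite part: four sub-attributes
    college: str
    year: int
    degree: str
    field_of_study: str          # "Field" in the slide's notation

@dataclass
class Student:
    name: str
    # The multi-valued part: zero or more composite values per student.
    previous_degrees: list[PreviousDegree] = field(default_factory=list)

student = Student("Jane", [
    PreviousDegree("College A", 2018, "BSc", "Physics"),
    PreviousDegree("College B", 2020, "MSc", "Computer Science"),
])
degree_fields = [d.field_of_study for d in student.previous_degrees]
print(degree_fields)
```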
EXAMPLE OF A COMPOSITE
ATTRIBUTE
ENTITY TYPES AND KEY ATTRIBUTES (1)
● Entities with the same basic attributes are grouped or typed into an entity type.
○ For example, the entity type EMPLOYEE and PROJECT.
● An attribute of an entity type for which each entity must have a unique value is called a key
attribute of the entity type.
○ For example, SSN of EMPLOYEE.
● A key attribute may be composite.
○ VehicleTagNumber is a key of the CAR entity type with components (Number, State).
● An entity type may have more than one key.
○ The CAR entity type may have two keys:
■ VehicleIdentificationNumber (popularly called VIN)
■ VehicleTagNumber (Number, State), aka license plate number.
● Each key is underlined
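The CAR example with two keys can be approximated in SQL, run here through Python's standard-library sqlite3 module. Column names and sample values are assumptions. The VIN is declared as the primary key, and the composite key (Number, State) is declared with a UNIQUE constraint over both columns together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE car (
    vin        TEXT PRIMARY KEY,          -- first key: VehicleIdentificationNumber
    tag_number TEXT NOT NULL,
    tag_state  TEXT NOT NULL,
    UNIQUE (tag_number, tag_state)        -- second key: composite VehicleTagNumber
)""")
conn.execute("INSERT INTO car VALUES ('VIN001', 'ABC 123', 'TX')")
# Same number in a different state is allowed: only the pair must be unique.
conn.execute("INSERT INTO car VALUES ('VIN002', 'ABC 123', 'NY')")

# Repeating an existing (number, state) pair violates the composite key.
try:
    conn.execute("INSERT INTO car VALUES ('VIN003', 'ABC 123', 'TX')")
    duplicate_tag_rejected = False
except sqlite3.IntegrityError:
    duplicate_tag_rejected = True

car_count = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
print(duplicate_tag_rejected, car_count)
```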
DISPLAYING AN ENTITY TYPE
● In ER diagrams, an entity type is displayed in a rectangular box
● Attributes are displayed in ovals
○ Each attribute is connected to its entity type
○ Components of a composite attribute are connected to the oval representing the
composite attribute
○ Each key attribute is underlined
○ Multivalued attributes are displayed in double ovals
ENTITY TYPE CAR WITH TWO KEYS AND A CORRESPONDING ENTITY SET
ENTITY SET
● Each entity type will have a collection of entities stored in the database
○ This collection is called the entity set
● Previous slide shows three CAR entity instances in the entity set for CAR
● Same name (CAR) used to refer to both the entity type and the entity set
● Entity set is the current state of the entities of that type that are stored in the
database
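As a rough sketch of the distinction, an entity type can be modelled as a class and an entity set as the collection of its current instances. The helper violates_keys and the sample cars are hypothetical; the two keys follow the CAR example (VIN, and the composite tag of Number and State).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Car:
    vin: str          # key 1: VehicleIdentificationNumber
    tag_number: str   # key 2, component Number
    tag_state: str    # key 2, component State
    make: str


def violates_keys(entity_set):
    """Return True if two entities share a VIN or a (Number, State) tag."""
    vins = [c.vin for c in entity_set]
    tags = [(c.tag_number, c.tag_state) for c in entity_set]
    return len(set(vins)) != len(vins) or len(set(tags)) != len(tags)


# Illustrative entity set: the current state of the CAR entity type.
car_set = [
    Car("VIN-001", "ABC 123", "TX", "Ford"),
    Car("VIN-002", "ABC 123", "NY", "Nissan"),  # same Number, different State: allowed
]
```

Adding a third car that reuses "VIN-001" would make violates_keys return True, because every key must be unique across the entity set.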
INITIAL DESIGN OF ENTITY TYPES FOR THE COMPANY
DATABASE SCHEMA
● Based on the requirements, we can identify four initial entity types in the COMPANY
database:
○ DEPARTMENT
○ PROJECT
○ EMPLOYEE
○ DEPENDENT
● Their initial design is shown on the following slide
● The initial attributes shown are derived from the requirements description
INITIAL DESIGN OF ENTITY TYPES:
EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT
REFINING THE INITIAL DESIGN BY INTRODUCING
RELATIONSHIPS
● The initial design is typically not complete
● Some aspects in the requirements will be represented as relationships
● ER model has three main concepts:
○ Entities (and their entity types and entity sets)
○ Attributes (simple, composite, multivalued)
○ Relationships (and their relationship types and relationship sets)
RELATIONSHIPS AND RELATIONSHIP TYPES (1)
✔ A relationship relates two or more distinct entities with a specific meaning.
✔ For example, EMPLOYEE John Smith works on the ProductX PROJECT, or
EMPLOYEE Franklin Wong manages the Research DEPARTMENT.
✔ Relationships of the same type are grouped or typed into a relationship type.
✔ For example, the WORKS_ON relationship type in which EMPLOYEEs and
PROJECTs participate, or the MANAGES relationship type in which EMPLOYEEs and
DEPARTMENTs participate.
✔ The degree of a relationship type is the number of participating entity types.
✔ Both MANAGES and WORKS_ON are binary relationships.
RELATIONSHIP INSTANCES OF THE WORKS_FOR N:1 RELATIONSHIP BETWEEN
EMPLOYEE AND DEPARTMENT
RELATIONSHIP INSTANCES OF THE M:N WORKS_ON
RELATIONSHIP BETWEEN EMPLOYEE AND PROJECT
RELATIONSHIP TYPE VS. RELATIONSHIP SET (1)
● Relationship Type:
○ Is the schema description of a relationship
○ Identifies the relationship name and the participating entity types
○ Also identifies certain relationship constraints
● Relationship Set:
○ The current set of relationship instances represented in the database
○ The current state of a relationship type
● Previous figures displayed the relationship sets
● Each instance in the set relates individual participating entities – one from each participating entity
type
● In ER diagrams, we represent the relationship type as follows:
○ Diamond-shaped box is used to display a relationship type
○ Connected to the participating entity types via straight lines
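A minimal sketch of the type/set distinction, assuming a hypothetical RelationshipType class: the type holds the schema description (name and participating entity types), while the set holds the current instances. Only the John Smith / ProductX pair comes from the earlier example; the other pairs are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipType:
    name: str
    participants: tuple  # names of the participating entity types

    @property
    def degree(self):
        # Degree = number of participating entity types.
        return len(self.participants)


WORKS_ON = RelationshipType("WORKS_ON", ("EMPLOYEE", "PROJECT"))

# Relationship set: each instance takes one entity from each participating type.
works_on_set = {
    ("John Smith", "ProductX"),
    ("John Smith", "ProductY"),
    ("Franklin Wong", "ProductY"),
}
```

The relationship type stays fixed while the relationship set changes as instances are inserted or deleted.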
Using High-Level Conceptual Data Models for Database Design
● In the early stages of database design, these models capture the information requirements of a
system without getting into the technical details of implementation.
● They provide a more abstract representation of how data is structured and related.
● A high-level conceptual data model can be used to design a database that will later be
converted into logical and physical models.
● Examples: Entity-Relationship (ER) models, Unified Modeling Language (UML),
and object-oriented models.

Steps in Using High-Level Conceptual Data Models for Database Design
● Requirement Analysis
● Identify Entities
● Define Relationships
● Define Primary Keys
● Add Constraints

An Example Database Application: Online Bookstore
Identify Entities:
● Book: Represents a book in the store.
○ Attributes: BookID (Primary Key), Title, ISBN, Price, PublicationYear
● Author: Represents an author who has written books.
○ Attributes: AuthorID (Primary Key), FirstName, LastName, BirthYear
● Customer: Represents a customer who makes orders.
○ Attributes: CustomerID (Primary Key), FirstName, LastName, Email, Phone
● Order: Represents an order made by a customer.
○ Attributes: OrderID (Primary Key), OrderDate, ShippingAddress
● OrderItem: Represents a specific book in an order.
○ Attributes: OrderItemID (Primary Key), Quantity, PriceAtPurchase
Define Relationships:
● Author to Book: An Author can write many Books, and each Book can have one or more
Authors (many-to-many relationship).

We create a relationship called "Writes" between Author and Book.

● Customer to Order: A Customer can place many Orders, but each Order is placed by one
Customer (one-to-many relationship).

Relationship: Places between Customer and Order.

● Order to OrderItem: Each Order can contain multiple OrderItems, but each OrderItem refers
to one specific Order (one-to-many relationship).

Relationship: Contains between Order and OrderItem.

● Book to Order: Each Order can include many Books, and a Book can be part of many Orders
(many-to-many relationship).

Relationship: Includes between Book and Order


Define Primary and Foreign Keys
Primary Keys:
○ Book: BookID
○ Author: AuthorID
○ Customer: CustomerID
○ Order: OrderID
○ OrderItem: OrderItemID
Foreign Keys:
○ Order has a foreign key CustomerID (to represent the customer placing the order).
○ OrderItem has foreign keys OrderID (to represent which order the item
belongs to) and BookID (to represent which book is being ordered).
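The keys above can be tried out with a small in-memory SQLite schema. This is a sketch under assumptions: only a few attributes are included, the sample rows are invented, and BookID is placed on OrderItem so that one order can hold several books. The table name "Order" is quoted because ORDER is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer  (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Book      (BookID INTEGER PRIMARY KEY, Title TEXT, Price REAL);
CREATE TABLE "Order"   (OrderID INTEGER PRIMARY KEY, OrderDate TEXT,
                        CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID));
CREATE TABLE OrderItem (OrderItemID INTEGER PRIMARY KEY, Quantity INTEGER,
                        OrderID INTEGER NOT NULL REFERENCES "Order"(OrderID),
                        BookID  INTEGER NOT NULL REFERENCES Book(BookID));
""")

# Illustrative rows: one customer and one valid order.
conn.execute("INSERT INTO Customer VALUES (?, ?, ?)", (1, "Asha", "Rao"))
conn.execute('INSERT INTO "Order" VALUES (?, ?, ?)', (10, "2024-01-15", 1))


def insert_order_for_unknown_customer():
    """Try to insert an order whose CustomerID matches no Customer row."""
    try:
        conn.execute('INSERT INTO "Order" VALUES (?, ?, ?)', (11, "2024-01-16", 999))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
```

The foreign key on CustomerID is what causes the second insert to be refused, since customer 999 does not exist.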

Add Constraints
● Cardinality:

○ An Author can write many Books, and each Book can have many Authors
(many-to-many).
○ A Customer can place many Orders, but each Order is associated with exactly
one Customer (one-to-many).
○ An Order contains one or more OrderItems, but each OrderItem belongs to
exactly one Order (one-to-many).
○ A Book can appear in many OrderItems, but each OrderItem refers to exactly one Book
(one-to-many). Through OrderItem, Book and Order are related many-to-many.
● Participation Constraints:

○ A Customer must place at least one Order (total participation of Customer in
Places).
○ An Order must have at least one OrderItem (total participation of Order in
Contains).
Roles and Structural Constraints in ER Diagrams
Roles define the function an entity plays in a relationship.

The relationship between Book and Author is a "Writes" relationship. Here, the roles would be:

● Author plays the role of writer.


● Book plays the role of written by.

In the relationship between Customer and Order, we can define the roles as:

● Customer plays the role of placer (the person who places an order).
● Order plays the role of order placed by.

Structural Constraints in ER Diagrams

● Cardinality Constraints define how many instances of an entity can
participate in a relationship with another entity.

● Participation Constraints define whether participation is mandatory or
optional for an entity in a relationship.

Weak Entity Types in ER Diagrams

Weak Entity Type: OrderItem

The OrderItem entity represents an item in a particular Order.

It cannot be uniquely identified without the Order to which it belongs and the Book
that is part of the order.

In this case, OrderItem depends on the Order and Book entities for its identity.

● Weak Entity: OrderItem
● Owner Entities: Order and Book

Naming Conventions in ER Diagrams
● Names should be clearly identifiable and easy to understand

Entities:

● Book: Use singular nouns for entity names. The entity represents a single book, not
a collection of books. So, name it "Book" instead of "Books."
● Customer: Similarly, name the entity as Customer, not Customers, because it refers
to a single customer at a time.
● Order: The entity representing a customer's order should be called Order to
represent a singular order.
● Author: Use singular form for the entity. For example, Author instead of Authors.

Attributes:

● BookID: The attribute BookID should be used as a unique identifier (primary key) for
each book.
● OrderID: The primary key for Order would be OrderID.
● FirstName and LastName: These attributes should be clear and represent individual
pieces of information about Customer and Author entities.

Relationships:

● Writes
● Places
● Contains
● Includes

Design Issues in ER Diagrams
● Cardinality and Participation Constraints
● Avoid Redundancy
● Weak Entities
● Complex Relationships: the chain from a Customer placing an Order, to the Order
containing multiple OrderItems, to each OrderItem referring to a Book becomes more
intricate when the same book is ordered many times by different customers.

Relationship Types of Degree Higher than Two
● In an ER diagram, relationships can connect more than two entities.
● When relationships involve three or more entities, they are referred to as n-ary
relationships, where n refers to the number of entities involved.

Ternary Relationships (Degree 3):

A ternary relationship involves three entities.


In the Online Bookstore example, a ternary relationship might exist between Order, Customer,
and Book in the following case:
● A Customer places an Order, and that Order contains a Book. This relationship is a
ternary relationship because it involves three entities: Customer, Order, and Book.
Example of a Ternary Relationship:
● Order: Represents the order placed by a Customer.
● Customer: Represents the person who places the Order.
● Book: Represents the Book being ordered.

Challenges of Ternary Relationships:

● A ternary relationship can be tricky because it involves more than two entities.
● Where the meaning is preserved, a ternary relationship can be decomposed into binary
relationships.
● For example, instead of using a ternary relationship to represent the connection
between Customer, Order, and Book, you could break it down into two separate
relationships:
○ Customer to Order (one-to-many)
○ Order to Book (many-to-many)

Decomposing Higher-Degree Relationships:

If the ternary relationship becomes too complex to represent, we can break it down into multiple
binary relationships.

For example:

● Customer → Order (1:M)


● Order → Book (M:N)

This approach simplifies the structure and makes it easier to enforce referential integrity and
ensure the database's relationships are manageable.

Why Decompose Ternary to Binary Relationships?

1. Simplicity: Binary relationships are easier to manage and query.


2. Referential Integrity: Easier to maintain with foreign keys.
3. Flexibility: More adaptable to schema changes or future additions.
4. Performance: Binary relationships tend to perform better due to fewer
joins and indexes.
5. Consistency: Adheres to traditional relational database design practices.
When NOT to decompose a ternary relationship: when each relationship instance represents a
unique transaction that involves all three entities together.

REFINING THE COMPANY DATABASE SCHEMA BY INTRODUCING RELATIONSHIPS
✔ By examining the requirements, six relationship types are identified
✔ All are binary relationships (degree 2)
✔ Listed below with their participating entity types:
✔ WORKS_FOR (between EMPLOYEE, DEPARTMENT)
✔ MANAGES (also between EMPLOYEE, DEPARTMENT)
✔ CONTROLS (between DEPARTMENT, PROJECT)
✔ WORKS_ON (between EMPLOYEE, PROJECT)
✔ SUPERVISION (between EMPLOYEE as supervisor, EMPLOYEE as supervisee)
✔ DEPENDENTS_OF (between EMPLOYEE, DEPENDENT)
ER DIAGRAM – RELATIONSHIP TYPES ARE:
WORKS_FOR, MANAGES, WORKS_ON, CONTROLS,
SUPERVISION, DEPENDENTS_OF
DISCUSSION ON RELATIONSHIP TYPES

✔ In the refined design, some attributes from the initial entity types are refined into
relationships:
✔ Manager of DEPARTMENT -> MANAGES
✔ Works_on of EMPLOYEE -> WORKS_ON
✔ Department of EMPLOYEE -> WORKS_FOR
✔ etc
✔ In general, more than one relationship type
can exist between the same participating
entity types
✔ MANAGES and WORKS_FOR are distinct
relationship types between EMPLOYEE and
DEPARTMENT
RECURSIVE RELATIONSHIP TYPE

✔ A relationship type in which the same entity type participates more than once, in
distinct roles
✔ Example: the SUPERVISION relationship
✔ EMPLOYEE participates twice in two distinct
roles:
✔ supervisor (or boss) role
✔ supervisee (or subordinate) role
✔ Each relationship instance relates two
distinct EMPLOYEE entities:
✔ One employee in supervisor role
✔ One employee in supervisee role
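A recursive relationship set can be sketched as a set of ordered pairs, where the position in the pair carries the role. The pairs and the helper names below are illustrative assumptions.

```python
# Each instance is an ordered pair: (supervisor, supervisee).
# Both members are EMPLOYEE entities; only the role differs.
supervision = {
    ("James Borg", "Franklin Wong"),
    ("Franklin Wong", "John Smith"),
}


def supervisor_of(employee):
    """Look the employee up in the supervisee role; return the supervisor, if any."""
    for boss, subordinate in supervision:
        if subordinate == employee:
            return boss
    return None


def supervisees_of(employee):
    """Look the employee up in the supervisor role; return direct subordinates."""
    return sorted(sub for boss, sub in supervision if boss == employee)
```

Here Franklin Wong appears in both roles: as supervisee of James Borg and as supervisor of John Smith.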
WEAK ENTITY TYPES

✔ A weak entity type is an entity type that does not have a key attribute of its own
✔ A weak entity must participate in an identifying
relationship type with an owner or identifying entity
type
✔ Entities are identified by the combination of:
✔ A partial key of the weak entity type
✔ The particular entity they are related to in the
identifying entity type
✔ Example:
✔ A DEPENDENT entity is identified by the
dependent’s first name, and the specific
EMPLOYEE with whom the dependent is related
✔ Name of DEPENDENT is the partial key
✔ DEPENDENT is a weak entity type
✔ EMPLOYEE is its identifying entity type via the
identifying relationship type DEPENDENTS_OF
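The identification rule for a weak entity can be sketched as follows. The class, the SSN values, and the names are hypothetical; the point is that the full key combines the owner's key with the partial key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependent:
    employee_ssn: str   # key of the owner EMPLOYEE (via the identifying relationship)
    name: str           # partial key: unique only among one employee's dependents
    relationship: str

    def full_key(self):
        """Owner key plus partial key identifies the dependent."""
        return (self.employee_ssn, self.name)


# Illustrative data: two dependents share a first name but have different owners.
dependents = [
    Dependent("111", "Alice", "Daughter"),
    Dependent("222", "Alice", "Spouse"),
]
keys = [d.full_key() for d in dependents]
```

The name alone does not distinguish the two dependents, but the (employee_ssn, name) pairs are distinct.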
CONSTRAINTS ON RELATIONSHIPS

✔ Constraints on Relationship Types
✔ (Also known as ratio constraints)
✔ Cardinality Ratio (specifies maximum
participation)
✔ One-to-one (1:1)
✔ One-to-many (1:N) or Many-to-one (N:1)
✔ Many-to-many (M:N)

✔ Existence Dependency Constraint (specifies minimum participation) (also called
participation constraint)
✔ zero (optional participation, not existence-dependent)
✔ one or more (mandatory participation, existence-dependent)
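To make the ratios concrete, the sketch below classifies the ratio that a given binary relationship set currently exhibits. The function name and sample data are assumptions. Note that it inspects only the current state; the constraint declared in the schema may be stricter than what the data happens to show.

```python
from collections import Counter


def observed_ratio(instances):
    """Classify a set of distinct (left, right) pairs as 1:1, 1:N, N:1, or M:N."""
    left_counts = Counter(left for left, _ in instances)
    right_counts = Counter(right for _, right in instances)
    left_many = any(n > 1 for n in left_counts.values())    # a left entity has several partners
    right_many = any(n > 1 for n in right_counts.values())  # a right entity has several partners
    if left_many and right_many:
        return "M:N"
    if right_many:
        return "N:1"  # many left entities per right entity
    if left_many:
        return "1:N"
    return "1:1"


# Illustrative relationship sets.
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}   # EMPLOYEE, DEPARTMENT
works_on = {("e1", "p1"), ("e1", "p2"), ("e2", "p2")}    # EMPLOYEE, PROJECT
manages = {("e1", "d1"), ("e3", "d2")}                   # EMPLOYEE, DEPARTMENT
```

With this data, works_for is classified as N:1, works_on as M:N, and manages as 1:1.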
MANY-TO-ONE (N:1) RELATIONSHIP
MANY-TO-MANY (M:N)
RELATIONSHIP
DISPLAYING A RECURSIVE RELATIONSHIP
✔ In a recursive relationship type, both
participations are by the same entity type in
different roles.
✔ For example, SUPERVISION relationships
between EMPLOYEE (in role of supervisor
or boss) and (another) EMPLOYEE (in role
of subordinate or worker).
✔ In the following figure, the first role participation
is labeled with 1 and the second role participation
is labeled with 2.
✔ In an ER diagram, role names need to be displayed
to distinguish the participations.
A RECURSIVE RELATIONSHIP SUPERVISION
RECURSIVE RELATIONSHIP TYPE IS: SUPERVISION
(PARTICIPATION ROLE NAMES ARE SHOWN)
ATTRIBUTES OF RELATIONSHIP TYPES

✔ A relationship type can have attributes:


✔ For example, HoursPerWeek of WORKS_ON
✔ Its value for each relationship instance describes the number of hours
per week that an EMPLOYEE works on a PROJECT.
✔ A value of HoursPerWeek depends on a particular (employee,
project) combination
✔ Most relationship attributes are used with M:N relationships
✔ In 1:N relationships, they can be transferred to the entity type on the
N-side of the relationship
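A relationship attribute can be sketched as a mapping keyed by the whole relationship instance. The hour values and the helper total_hours are illustrative assumptions.

```python
# HoursPerWeek belongs to the (employee, project) pair, not to either entity alone.
hours_per_week = {
    ("John Smith", "ProductX"): 32.5,
    ("John Smith", "ProductY"): 7.5,
    ("Franklin Wong", "ProductY"): 10.0,
}


def total_hours(employee):
    """Sum the hours of one employee over all projects they work on."""
    return sum(h for (emp, _), h in hours_per_week.items() if emp == employee)
```

Storing the value on EMPLOYEE or on PROJECT alone would not work for an M:N relationship, because the same employee has different hours on different projects.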
EXAMPLE ATTRIBUTE OF A RELATIONSHIP
TYPE:
HOURS OF WORKS_ON
NOTATION FOR CONSTRAINTS ON
RELATIONSHIPS
✔ Cardinality ratio (of a binary relationship): 1:1, 1:N, N:1, or M:N
✔ Shown by placing appropriate numbers on the relationship edges.
✔ Participation constraint (on each participating entity type): total (called
existence dependency) or partial.
✔ Total shown by double line, partial by single line.
✔ NOTE: These are easy to specify for Binary Relationship Types.
ALTERNATIVE (MIN, MAX) NOTATION FOR
RELATIONSHIP STRUCTURAL CONSTRAINTS:
✔ Specified on each participation of an entity type E in
a relationship type R
✔ Specifies that each entity e in E participates in at
least min and at most max relationship instances in
R
✔ Default (no constraint): min=0, max=n (signifying no
limit)
✔ Must have min≤max, min≥0, max ≥1
✔ Derived from the knowledge of mini-world
constraints
✔ Examples:
✔ A department has exactly one manager and an
employee can manage at most one department.
✔ Specify (0,1) for participation of EMPLOYEE in
MANAGES
✔ Specify (1,1) for participation of DEPARTMENT in
MANAGES
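✔ An employee can work for exactly one department
but a department can have any number of
employees.
✔ Specify (1,1) for participation of EMPLOYEE in
WORKS_FOR
✔ Specify (0,n) for participation of DEPARTMENT in
WORKS_FOR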
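A (min, max) constraint can be checked against a relationship set by counting how often each entity appears. The function below is a hypothetical sketch with invented sample data; max_count=None stands for n (no upper limit).

```python
from collections import Counter


def satisfies_min_max(entities, instances, position, min_count, max_count=None):
    """Check that every entity appears in at least min_count and at most
    max_count instances, looking at the given position of each instance."""
    counts = Counter(inst[position] for inst in instances)
    for entity in entities:
        k = counts[entity]  # Counter returns 0 for entities that never participate
        if k < min_count or (max_count is not None and k > max_count):
            return False
    return True


# Illustrative data for MANAGES (EMPLOYEE, DEPARTMENT).
employees = {"Smith", "Wong", "Zelaya"}
departments = {"Research", "Administration"}
manages = {("Wong", "Research"), ("Zelaya", "Administration")}
```

For example, satisfies_min_max(employees, manages, 0, 0, 1) checks (0,1) for EMPLOYEE, and satisfies_min_max(departments, manages, 1, 1, 1) checks (1,1) for DEPARTMENT.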
THE (MIN,MAX) NOTATION FOR
RELATIONSHIP CONSTRAINTS

Read the (min, max) numbers next to the entity type, looking away from the entity type
COMPANY ER SCHEMA DIAGRAM USING (MIN,
MAX) NOTATION
ALTERNATIVE DIAGRAMMATIC NOTATION

✔ ER diagrams are one popular notation for displaying database schemas

✔ Many other notations exist in the literature and in various database
design and modeling tools

✔ Appendix A illustrates some of the alternative notations that have been
used

✔ UML class diagrams are representative of another way of displaying ER
concepts that is used in several commercial design tools
SUMMARY OF NOTATION FOR ER DIAGRAMS
UML CLASS DIAGRAMS
✔ Represent classes (similar to entity types) as
large rounded boxes with three sections:
✔ Top section includes entity type (class) name
✔ Second section includes attributes
✔ Third section includes class operations
(operations are not in basic ER model)
✔ Relationships (called associations)
represented as lines connecting the classes
✔ Other UML terminology also differs from ER
terminology
✔ Used in database design and
object-oriented software design
UML CLASS DIAGRAM FOR COMPANY DATABASE
SCHEMA
OTHER ALTERNATIVE DIAGRAMMATIC NOTATIONS
RELATIONSHIPS OF HIGHER DEGREE

✔ Relationship types of degree 2 are called binary

✔ Relationship types of degree 3 are called ternary and of degree n are
called n-ary

✔ In general, an n-ary relationship is not equivalent to n binary
relationships

✔ Constraints are harder to specify for higher-degree relationships (n > 2)
than for binary relationships
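The non-equivalence can be seen with a small experiment. This sketch assumes a hypothetical ternary relationship over (supplier, part, project) with invented instances. It projects the ternary set onto three binary sets, then recombines them.

```python
# Ternary relationship set: (supplier, part, project).
supply = {
    ("s1", "bolt", "p1"),
    ("s1", "nut", "p2"),
    ("s2", "bolt", "p2"),
}

# Three binary relationship sets obtained by projection.
supplier_part = {(s, part) for s, part, _ in supply}
supplier_project = {(s, proj) for s, _, proj in supply}
part_project = {(part, proj) for _, part, proj in supply}

# Recombine: keep every triple that is consistent with all three binary sets.
rebuilt = {
    (s, part, proj)
    for s, part in supplier_part
    for part2, proj in part_project
    if part == part2 and (s, proj) in supplier_project
}

# Triples implied by the binary sets that were never in the ternary set.
spurious = rebuilt - supply
```

The binary sets allow the triple (s1, bolt, p2): s1 supplies bolts, s1 supplies p2, and p2 uses bolts. The ternary set never recorded that s1 supplies bolts to p2, so the three binary relationships carry different information.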
DISCUSSION OF N-ARY RELATIONSHIPS (N > 2)
✔ In general, 3 binary relationships can represent
different information than a single ternary
relationship (see Figure 3.17a and b on next slide)
✔ If needed, the binary and n-ary relationships can all
be included in the schema design (see Figure 3.17a
and b, where all relationships convey different
meanings)
✔ In some cases, a ternary relationship can be
represented as a weak entity if the data model allows
a weak entity type to have multiple identifying
relationships (and hence multiple owner entity types)
(see Figure 3.17c)
✔ If a particular binary relationship can be derived from
a higher-degree relationship at all times, then it is
redundant
EXAMPLE OF A TERNARY RELATIONSHIP
ANOTHER EXAMPLE OF A TERNARY RELATIONSHIP
DISPLAYING CONSTRAINTS ON
HIGHER-DEGREE RELATIONSHIPS
✔ The (min, max) constraints can be displayed
on the edges – however, they do not fully
describe the constraints
✔ Displaying a 1, M, or N indicates additional
constraints
✔ An M or N indicates no constraint
✔ A 1 indicates that an entity can participate in
at most one relationship instance that has a
particular combination of the other
participating entities
✔ In general, both (min, max) and 1, M, or N are
needed to describe fully the constraints
DATA MODELING TOOLS

✔ A number of popular tools cover
conceptual modeling and mapping into
relational schema design.
✔ Examples: ERWin, S-Designer (Enterprise
Application Suite), ER-Studio, etc.
✔ POSITIVES:
✔ Serves as documentation of application
requirements, easy user interface - mostly
graphics editor support
✔ NEGATIVES:
✔ Most tools lack a proper distinct notation for
relationships with relationship attributes
✔ Mostly represent a relational design in a
diagrammatic form rather than a conceptual
ER-based design
CHAPTER SUMMARY

✔ ER Model Concepts: Entities, attributes, relationships

✔ Constraints in the ER model

✔ Using ER in step-by-step conceptual schema design for the COMPANY
database

✔ ER Diagrams - Notation

✔ Alternative Notations – UML class diagrams, others
