07-Data Modeling

ITSS SOFTWARE DEVELOPMENT
7. DATA MODELING
Some slides extracted from IBM courseware

Data Modeling Overview
[Figure: Data Modeling in context, with Analysis Classes, Design Classes, Supplementary Specifications, Project Specific Guidelines, Data Model, and Design Model]

Content
1. Data models
2. Object model and Relational Data Model
3. Mapping class diagram to E-R diagram
4. Normalization

1. Data Models
◆ Data modeling:
• Abstracting and organizing the structure of real-world information, which is the object to be made into a database, and then expressing it

1. Data Models (2)
• 3 types of data models (conceptual, logical, physical)

1.1. Conceptual data model
• Natural expressions without constraints imposed by DBMS
• E-R model
• Expressed by E-R diagram

E-R Diagram
• Three elements
  • Entities
  • Relationships
  • Attributes

1.2. Logical Data Model
• 3 types
  • relational model,
  • network model,
  • and hierarchical model

1.3. Physical Data Model
• Logical data models, when they are implemented, become physical data models:
  • relational databases,
  • network databases,
  • or hierarchical databases

1.3.1. Hierarchical Database (Tree-Structure Database)
• Divides records into parents and children and shows the relationship with a hierarchical structure
• 1-to-many (1:n) correspondences between parent records and child records

1.3.2. Network Database
• Parent records and child records do not have 1-to-n (1:n) correspondences; rather, they are in many-to-many (m:n) correspondence
• Sometimes called CODASYL database

1.3.3. Relational database
• Data is expressed in a two-dimensional table.
• Each row of the table corresponds to a record, and each column is an item of the records.
• The underlined columns indicate the primary key
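The difference between the hierarchical (1:n) and network (m:n) structures described above can be sketched in a few lines of Python. This is only an illustration: the record names (departments, employees, suppliers, parts) are made up and do not come from the slides.

```python
from collections import defaultdict

# Hierarchical (tree) model: each child record points to exactly one parent (1:n).
# All record names here are hypothetical sample data.
parent_of = {
    "employee:An": "department:Sales",
    "employee:Binh": "department:Sales",
    "employee:Chi": "department:IT",
}

def children_by_parent(parent_of):
    """Invert child -> parent links into parent -> sorted list of children."""
    tree = defaultdict(list)
    for child, parent in parent_of.items():
        tree[parent].append(child)
    return {parent: sorted(kids) for parent, kids in tree.items()}

# Network model: a member record may have several owners (m:n),
# so the links are stored as (owner, member) pairs instead of one parent each.
links = {
    ("supplier:S1", "part:P1"),
    ("supplier:S1", "part:P2"),
    ("supplier:S2", "part:P1"),
}

def owners_of(member, links):
    """Return every owner record linked to the given member record."""
    return sorted(owner for owner, m in links if m == member)

tree = children_by_parent(parent_of)
```

In the tree a child can be reached from one parent only, while `owners_of("part:P1", links)` returns two owners, which a strict hierarchy cannot express.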

NOSQL DATABASES
Overview, Models, Concepts, Examples

What is NoSQL Database?
• NoSQL (cloud) databases
  • Use document-based model (non-relational)
  • Schema-free document storage
  • Still support indexing and querying
  • Still support CRUD operations (create, read, update, delete)
  • Still support concurrency and transactions
  • Highly optimized for append / retrieve
  • Great performance and scalability
• NoSQL == “No SQL” or “Not Only SQL”?

Relational vs. NoSQL Databases
• Relational databases
  • Data stored as table rows
  • Relationships between related rows
  • Single entity spans multiple tables
  • RDBMS systems are very mature, rock solid
• NoSQL databases
  • Data stored as documents
  • Single entity (document) is a single record
  • Documents do not have a fixed structure

Relational vs. NoSQL Models
Relational Model (person table, with address data held in separate Street, Post Code, Town, and Country tables linked by *:1 associations)
  Name       Svetlin Nakov
  Gender     male
  Phone      +359333777555
  Email      [email protected]
  Site       www.nakov.com
  Street     Al. Malinov 31
  Post Code  1729
  Town       Sofia
  Country    Bulgaria

Document Model
  Name: Svetlin Nakov
  Gender: male
  Phone: +359333777555
  Address:
  - Street: Al. Malinov 31
  - Post Code: 1729
  - Town: Sofia
  - Country: Bulgaria
  Email: [email protected]
  Site: www.nakov.com
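The contrast between the two models can be made concrete with a minimal sketch. It stores the sample person once as a single document (plain JSON) and once as rows in two related tables (SQLite from the Python standard library). The table and column names are assumptions chosen for illustration, and only part of the address is split out to keep the example short.

```python
import json
import sqlite3

# Document model: the whole entity, including its address, is one record.
person_doc = {
    "Name": "Svetlin Nakov",
    "Gender": "male",
    "Address": {
        "Street": "Al. Malinov 31",
        "Post Code": "1729",
        "Town": "Sofia",
        "Country": "Bulgaria",
    },
}
stored = json.dumps(person_doc)  # one document, written in one piece
town_from_doc = json.loads(stored)["Address"]["Town"]

# Relational model: the same entity spans two tables joined by a key.
# Table and column names are illustrative assumptions.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE town (town_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name TEXT, gender TEXT, street TEXT, post_code TEXT,
        town_id INTEGER REFERENCES town(town_id)
    );
    INSERT INTO town VALUES (1, 'Sofia', 'Bulgaria');
    INSERT INTO person VALUES
        (1, 'Svetlin Nakov', 'male', 'Al. Malinov 31', '1729', 1);
""")
town_from_join = db.execute(
    "SELECT t.name FROM person p JOIN town t ON t.town_id = p.town_id "
    "WHERE p.name = ?",
    ("Svetlin Nakov",),
).fetchone()[0]
```

Reading the town needs no join in the document form, but the town name is repeated in every document that uses it; the relational form stores it once and pays for that with a join.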

Normalization vs. Aggregation

Aggregates vs. Joins

Atomic Aggregates

1.3.4. Three-layer schema
• A schema is a description of the framework of a database
• Classified into 3 types:

2.1. Relational Databases and OO
• RDBMS and Object Orientation are not entirely compatible
• RDBMS
  • Focus is on data
  • Better suited for ad-hoc relationships and reporting applications
  • Expose data (column values)
• Object Oriented system
  • Focus is on behavior
  • Better suited to handle state-specific behavior where data is secondary
  • Hide data (encapsulation)

2.2. The Object Model
• The Object Model is composed of
  • Classes (attributes)
  • Relationships
    • Associations
    • Generalization
[Class diagram: Order (- number : Integer) 1 +order to 1..* +lineItems LineItem (- quantity : Integer, - number : Integer); Product (- number : Integer, - description : String, - unitPrice : Double) with subclasses Software Product (- version : Double) and Hardware Product (- assembly : String)]

2.3. The Relational Data Model
• Relational data model is composed of
  • Entities - Table
  • Relations - Relationship
→ Also called E-R model
[E-R diagram: entities ORDER (Order_Id), LINE ITEM (LineItem_Id, Description, Price, Quantity, Product_Id, Order_Id), and PRODUCT (Product_Id, Name, Price); relations labeled lineItems, order, products, and lineItem; the listed attributes are the columns]

2.3.1. Entities/Tables
• An entity is mapped to a table when designing the physical database
• Including
  • Columns: Attributes
  • Rows: Concrete values of attributes

  courseID     description   startDate    endDate      location
  2008.11.001  This course…  12 Nov 2008  30 Nov 2008  D3-405
  2008.11.002  This course…  22 Nov 2008  10 Dec 2008  T-403

2.3.2. Relations/Relationships
• Relations between entities or relationships between tables
• Multiplicity/Cardinality
  • One-to-one (1:1)
  • One-to-many (1:m)
  • Many-to-one (m:1)
  • Many-to-many (m:n)
(Normally, a many-to-many relation is divided into one-to-many and many-to-one relations)
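As a rough sketch of the entity-to-table mapping above, the Course entity can be created as a table whose columns are the attributes and whose rows are the concrete values. SQLite (via Python's standard `sqlite3` module) is used only for convenience; the column types and the ISO date format are assumptions, not part of the slides.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# One column per attribute of the Course entity; courseID is the primary key.
db.execute("""
    CREATE TABLE Course (
        courseID    TEXT PRIMARY KEY,
        description TEXT,
        startDate   TEXT,  -- ISO date strings: an assumption for illustration
        endDate     TEXT,
        location    TEXT
    )
""")

# One row per concrete set of attribute values.
db.executemany(
    "INSERT INTO Course VALUES (?, ?, ?, ?, ?)",
    [
        ("2008.11.001", "This course…", "2008-11-12", "2008-11-30", "D3-405"),
        ("2008.11.002", "This course…", "2008-11-22", "2008-12-10", "T-403"),
    ],
)

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in db.execute("PRAGMA table_info(Course)")]
row_count = db.execute("SELECT COUNT(*) FROM Course").fetchone()[0]
```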

Dependency relationships
• The child entity can exist only when the parent entity exists
• The child entity has a foreign key referencing the primary key of the parent entity
• This foreign key is included in the primary key of the child
• Solid line
Example (invoice detail depends on invoice and book):
  HoaDon (maHD, tenKhach, ngayLap)
  ChiTietHD (maHD, maSach, SL)
  Sach (maSach, tenSach, donGia)

Independency relationships
• The child entity can exist even if the parent entity does not exist
• The child entity has a foreign key referencing the primary key of the parent entity
• This foreign key is not included in the primary key of the child
• Dashed line
Example (invoice references customer):
  KhachHang (maKhach, tenKhach, diaChi, maSoThue)
  HoaDon (maHD, maKhach, ngayLap)
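The two kinds of relationship differ only in whether the foreign key is part of the child's primary key. The sketch below expresses that in SQLite DDL, reusing the entity names from the example (HoaDon = invoice, ChiTietHD = invoice detail, Sach = book, KhachHang = customer). Treating "the child can exist without the parent" as a nullable foreign key is one possible interpretation, and the column types are assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
db.executescript("""
    CREATE TABLE KhachHang (maKhach TEXT PRIMARY KEY, tenKhach TEXT);
    CREATE TABLE Sach (maSach TEXT PRIMARY KEY, tenSach TEXT, donGia REAL);

    -- Independency: the FK maKhach is NOT part of HoaDon's primary key
    -- and may be NULL, so an invoice row can exist without a customer row.
    CREATE TABLE HoaDon (
        maHD    TEXT PRIMARY KEY,
        maKhach TEXT REFERENCES KhachHang(maKhach),
        ngayLap TEXT
    );

    -- Dependency: both FKs are part of ChiTietHD's primary key,
    -- so a detail row cannot exist without its parent rows.
    CREATE TABLE ChiTietHD (
        maHD   TEXT NOT NULL REFERENCES HoaDon(maHD),
        maSach TEXT NOT NULL REFERENCES Sach(maSach),
        SL     INTEGER,
        PRIMARY KEY (maHD, maSach)
    );
""")

# Allowed: an invoice with no customer (independency relationship).
db.execute("INSERT INTO HoaDon VALUES ('HD1', NULL, '2008-11-12')")

def insert_orphan_detail():
    """Try to add a detail row whose parent invoice and book do not exist."""
    try:
        db.execute("INSERT INTO ChiTietHD VALUES ('HD404', 'S1', 1)")
        return True
    except sqlite3.IntegrityError:
        return False

orphan_accepted = insert_orphan_detail()
invoice_count = db.execute("SELECT COUNT(*) FROM HoaDon").fetchone()[0]
```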

3. Mapping class diagram to E-R diagram
• Map persistent design classes to Entities
• Map class relationships to Relations
[Figure: the class diagram (Order, LineItem, Product, Software Product, Hardware Product) mapped to the E-R diagram (ORDER, LINE ITEM, PRODUCT)]

3.1. Mapping Persistent Design Classes to Entities
• In a relational database
  • Every row is regarded as an object
  • A column in a table is equivalent to a persistent attribute of a class

Class SubjectInfo (- subjectID : String, - subjectName : String, - numberOfCredit : int)

  Attributes from object type:  subjectID  subjectName      numberOfCredit
  Object instance:              IT0001     CS Introduction  4

3.2. Mapping Associations Between Persistent Objects
• Associations between two persistent objects are realized as foreign keys to the associated objects.
• A foreign key (not in the primary key) is a column in one table that contains the primary key value of the associated object
• → Independency relationship

Class CourseInfo (-courseID: String, -description: String, -startDate: DateTime, -endDate: DateTime, -location: String) 1 to 3..30 class StudyHistory (- historyNo, -result, -…)

Course entity (primary key: courseID)
  courseID   description   startDate    endDate      location
  IT3598002  This course…  12 Nov 2008  30 Nov 2008  D3-405

StudyHistory entity (foreign key: courseID)
  historyNo  studentID   Result  courseID   …
  5          2005.03229  A       IT3598002
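The idea that every row is an object and every column a persistent attribute can be sketched with a Python dataclass standing in for the SubjectInfo design class. The helper `create_table_sql` is hypothetical (it is not part of any library), and it assumes that the first attribute is the primary key and that only `str` and `int` attributes occur.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields

@dataclass
class SubjectInfo:
    subjectID: str
    subjectName: str
    numberOfCredit: int

def create_table_sql(cls):
    """Build a CREATE TABLE statement with one column per attribute.

    Assumption for this sketch: the first attribute is the primary key.
    """
    sql_type = {str: "TEXT", int: "INTEGER"}
    cols = [f"{f.name} {sql_type[f.type]}" for f in fields(cls)]
    cols[0] += " PRIMARY KEY"
    return f"CREATE TABLE {cls.__name__} ({', '.join(cols)})"

db = sqlite3.connect(":memory:")
db.execute(create_table_sql(SubjectInfo))

# Object instance -> row
subject = SubjectInfo("IT0001", "CS Introduction", 4)
db.execute("INSERT INTO SubjectInfo VALUES (?, ?, ?)", astuple(subject))

# Row -> object instance
loaded = SubjectInfo(*db.execute("SELECT * FROM SubjectInfo").fetchone())
```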

3.3. Mapping Aggregation to the Data Model
• Aggregation is also modeled as a dependency relationship using foreign key relationships
• The use of composition implements a cascading delete constraint

Class CourseInfo (-courseID: String, -description: String, -startDate: DateTime, -endDate: DateTime, -location: String) 1 to 0..30 class CourseRegistrationInfo (- registeredDate)

Course entity (primary key: courseID)
  courseID   description   startDate    endDate      subjectID
  IT3598002  This course…  12 Nov 2008  30 Nov 2008  IT0001

CourseRegistration entity (foreign key: courseID)
  courseID   studentID   registeredDate
  IT3598002  2005.03229  10 Oct 2008

3.3. Mapping Aggregation to the Data Model (2)
• In some cases, we can map to an independency relationship to simplify the primary key.
• Example: CourseID is the primary key (according to the requirements)

Class SubjectInfo (- subjectID, -subjectName, -…) 1 to 0..* class CourseInfo (-courseID: String, -description: String, -startDate: DateTime, -endDate: DateTime, -location: String)

Subject entity (primary key: subjectID)
  subjectID  subjectName                          goal           …
  IT3598     Object-Oriented Language and Theory  After finish…

Course entity (foreign key: subjectID)
  courseID   description   startDate    endDate      location  subjectID
  IT3598002  This course…  12 Jan 2010  30 May 2009  D4-405    IT3598
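The cascading delete that composition implies can be sketched with SQLite's `ON DELETE CASCADE`. The schema below follows the Course and CourseRegistration entities from the example; the column types are assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # required for FK actions in SQLite
db.executescript("""
    CREATE TABLE Course (courseID TEXT PRIMARY KEY, description TEXT);

    -- Dependency relationship: the FK courseID is part of the primary key,
    -- and deleting the parent course removes its registrations as well.
    CREATE TABLE CourseRegistration (
        courseID       TEXT NOT NULL
                       REFERENCES Course(courseID) ON DELETE CASCADE,
        studentID      TEXT NOT NULL,
        registeredDate TEXT,
        PRIMARY KEY (courseID, studentID)
    );

    INSERT INTO Course VALUES ('IT3598002', 'This course…');
    INSERT INTO CourseRegistration
        VALUES ('IT3598002', '2005.03229', '2008-10-10');
""")

before = db.execute("SELECT COUNT(*) FROM CourseRegistration").fetchone()[0]
db.execute("DELETE FROM Course WHERE courseID = 'IT3598002'")
after = db.execute("SELECT COUNT(*) FROM CourseRegistration").fetchone()[0]
```

Without the `ON DELETE CASCADE` clause the same `DELETE` would be rejected while registrations still reference the course.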

More example in Course Registration
Class CourseInfo (-courseID: String, -description: String, -startDate: DateTime, -endDate: DateTime, -location: String) 1 to 0..* class Schedule (-scheduleID: int, -day: String, -teachingPeriod: int)

Course entity (primary key: courseID)
  courseID   description   startDate    endDate      subjectID
  IT3598002  This course…  12 Jan 2010  30 Nov 2008  IT3598

Schedule entity (foreign key: courseID)
  scheduleID  courseID   day      teachingPeriod
  1           IT3598002  Tuesday  2
  2           IT3598002  Tuesday  3
  1           IT3672001  Friday   8

3.4. Modeling Inheritance in the Data Model
• A Data Model does not support modeling inheritance in a direct way
• Two options:
  • Use separate tables (normalized data)
  • Duplicate all inherited associations and attributes (de-normalized data)
Class diagram (same for both options): UserInfo (- userID: String, - userName : String, - email : String, - phoneNumber : String, - address : String) with subclasses Student and Lecturer; Lecturer adds - eduBackground : String

Normalized data
User entity (primary key: userID)
  userID         userName        email                         phoneNumber   address
  2005.03229     Nguyễn Hoàng    student_200503229@hut.edu.vn  0988.394.394  No. 12, …
  002.005.00060  Trần Nam Khánh  [email protected]             0912.473.568  No. 157, …
Lecturer entity (primary key: lecturerID)
  lecturerID     eduBackground
  002.005.00060  Master from HUT…

De-normalized data (replication)
Student entity
  studentID   studentName   email                         phoneNumber   address
  2005.03229  Nguyễn Hoàng  student_200503229@hut.edu.vn  0988.394.394  No. 12, …
Lecturer entity
  lecturerID     lecturerName    email               phoneNumber   address     eduBackground
  002.005.00060  Trần Nam Khánh  khanhtn@hut.edu.vn  0912.473.568  No. 157, …  Master from HUT…
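Both inheritance options can be sketched side by side. In the normalized option the Lecturer table holds only its own attribute and shares its key with the parent table; in the de-normalized option the inherited attributes are copied into the subclass table. The table names `UserInfo` and `LecturerFlat` are assumptions made for this sketch, and only some of the attributes are included.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Option 1 (normalized): shared attributes are stored once in UserInfo;
    -- Lecturer keeps only its own attribute and reuses the same key value.
    CREATE TABLE UserInfo (userID TEXT PRIMARY KEY, userName TEXT);
    CREATE TABLE Lecturer (
        lecturerID    TEXT PRIMARY KEY REFERENCES UserInfo(userID),
        eduBackground TEXT
    );
    INSERT INTO UserInfo VALUES ('002.005.00060', 'Trần Nam Khánh');
    INSERT INTO Lecturer VALUES ('002.005.00060', 'Master from HUT…');

    -- Option 2 (de-normalized): inherited attributes are duplicated
    -- in the subclass table, so no join is needed to read a lecturer.
    CREATE TABLE LecturerFlat (
        lecturerID    TEXT PRIMARY KEY,
        lecturerName  TEXT,
        eduBackground TEXT
    );
    INSERT INTO LecturerFlat
        VALUES ('002.005.00060', 'Trần Nam Khánh', 'Master from HUT…');
""")

# Normalized: a join reassembles the full lecturer.
joined = db.execute(
    "SELECT u.userName, l.eduBackground FROM Lecturer l "
    "JOIN UserInfo u ON u.userID = l.lecturerID"
).fetchone()

# De-normalized: a single-table read returns the same data.
flat = db.execute(
    "SELECT lecturerName, eduBackground FROM LecturerFlat"
).fetchone()
```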

3.5. Mapping many-to-many cardinality
• Use an intermediate entity
• Example: The cardinality of A and B is many-to-many
  • Add an intermediate entity called “C”
  • Place 2 foreign keys in C, referencing the 2 primary keys of A and B
  • Add attributes to C if necessary.

Class diagram: A * to * B, with association class C (c_attr_1, c_attr_2)
Entities: A (A_ID), B (B_ID), C (C_ID, A_ID (FK), B_ID (FK), c_attr_1, c_attr_2)

Example in Course Registration CS
Class SubjectInfo (- subjectID: String, - subjectName: String, - goal: String, - numberOfCredits: int, - description: String) with a self-association “prerequisites” (0..* to 0..3)
Entities:
  SubjectInfo (subjectID, subjectName, numberOfCredit, description)
  Prerequisite (prerequisiteID, subjectID, prerequisiteSubjectID, level)
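The intermediate-entity technique can be sketched with the Prerequisite example: the many-to-many self-association on subjects becomes a table with two foreign keys. The prerequisite rows and the subject `IT9999` are made-up sample data, and the column types are assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.executescript("""
    CREATE TABLE Subject (subjectID TEXT PRIMARY KEY, subjectName TEXT);

    -- Intermediate entity: two foreign keys, both referencing Subject,
    -- plus an attribute of the association itself (level).
    CREATE TABLE Prerequisite (
        prerequisiteID        INTEGER PRIMARY KEY,
        subjectID             TEXT NOT NULL REFERENCES Subject(subjectID),
        prerequisiteSubjectID TEXT NOT NULL REFERENCES Subject(subjectID),
        level                 INTEGER
    );

    INSERT INTO Subject VALUES ('IT0001', 'CS Introduction');
    INSERT INTO Subject VALUES ('IT3598', 'Object-Oriented Language and Theory');
    INSERT INTO Subject VALUES ('IT9999', 'Hypothetical Advanced Subject');

    -- Hypothetical prerequisite links, for illustration only.
    INSERT INTO Prerequisite VALUES (1, 'IT3598', 'IT0001', 1);
    INSERT INTO Prerequisite VALUES (2, 'IT9999', 'IT0001', 1);
    INSERT INTO Prerequisite VALUES (3, 'IT9999', 'IT3598', 2);
""")

def prerequisites_of(subject_id):
    """Subjects that must be taken before the given subject."""
    rows = db.execute(
        "SELECT prerequisiteSubjectID FROM Prerequisite "
        "WHERE subjectID = ? ORDER BY prerequisiteSubjectID",
        (subject_id,),
    )
    return [r[0] for r in rows]

def required_by(subject_id):
    """Subjects that list the given subject as a prerequisite."""
    rows = db.execute(
        "SELECT subjectID FROM Prerequisite "
        "WHERE prerequisiteSubjectID = ? ORDER BY subjectID",
        (subject_id,),
    )
    return [r[0] for r in rows]
```

Each side of the many-to-many association is now an ordinary one-to-many lookup on the intermediate table.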

E-R diagram
[Diagram: dm E-R Modeling]
  Schedule: *PK scheduleID, *PK courseID, day, teachingPeriod; «PK» PK_Schedule(, )
  CourseRegistration: *PK courseID, *PK studentID, studentName, registeredDate; «PK» PK_CourseRegistration(, )
  Course: *PK courseID, subjectID, description, startDate, endDate, location, lecturer; «PK» PK_Course()
  StudyHistory: *PK historyNo, courseID, studentID, studentName, result, pass; «PK» PK_StudyHistory()
  Subject: *PK subjectID, subjectName, goal, numberOfCredit, description; «PK» PK_Subject()
  Prerequisite: *PK prerequisiteID, * subjectID, prerequisiteSubjectID, level, description; «PK» PK_Prerequisite()
  Two relationships are labeled Dependency (Schedule and CourseRegistration both carry courseID in their primary keys)

4.1. Overview of Normalization
• Normalization: a process of steps that identifies redundancies in a database design so that they can be eliminated.
• Purpose of Normalization: to improve
  • storage efficiency
  • data integrity
  • and scalability

4.1. Overview of Normalization (2)
• In the relational model, methods exist for quantifying how efficient a database is.
• These classifications are called normal forms (or NF), and there are algorithms for converting a given database between them.
• Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued

4.2. History
• Edgar F. Codd first proposed the process of normalization and what came to be known as the 1st normal form in his paper A Relational Model of Data for Large Shared Data Banks. Codd stated:
“There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition nonsimple domains are replaced by ‘domains whose elements are atomic (nondecomposable) values’.”

4.3. Normal Forms
• Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF.
• There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications.
• Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form).

Functionally determines
• In a table, a set of columns X functionally determines another column Y…
  X → Y
  … if and only if each X value is associated with at most one Y value in a table.
• i.e. if you know X then there is only one possibility for Y.

Normal forms so Far…
◆ First normal form
• All data values are atomic, and so everything fits into a mathematical relation.
◆ Second normal form
• As 1NF plus no non-primary-key attribute is partially dependent on the primary key
◆ Third normal form
• As 2NF plus no non-primary-key attribute depends transitively on the primary key
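The definition of X → Y can be checked mechanically against the rows of a table. The function below is a small sketch: it returns False as soon as one X value is seen with two different Y values. The sample rows are made up for illustration. Note that such a check can only refute a dependency on the given data; whether a dependency really holds is a statement about the business rules.

```python
def functionally_determines(rows, x_cols, y_col):
    """Return True if, in rows, each X value maps to at most one Y value."""
    seen = {}
    for row in rows:
        x_value = tuple(row[c] for c in x_cols)
        # setdefault stores the first Y seen for this X and returns it;
        # any later row with the same X must carry the same Y.
        if seen.setdefault(x_value, row[y_col]) != row[y_col]:
            return False
    return True

# Hypothetical sample rows (not taken from the slides).
orders = [
    {"Order": 1, "Product": "pen", "Customer": "An", "UnitPrice": 2},
    {"Order": 1, "Product": "ink", "Customer": "An", "UnitPrice": 5},
    {"Order": 2, "Product": "pen", "Customer": "Binh", "UnitPrice": 2},
]

order_to_customer = functionally_determines(orders, ["Order"], "Customer")
product_to_price = functionally_determines(orders, ["Product"], "UnitPrice")
customer_to_product = functionally_determines(orders, ["Customer"], "Product")
```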

Normalization Example
◆ Consider a table representing orders in an online store
◆ Each entry in the table represents an item on a particular order. (thinking in terms of records. Yuk.)
◆ Columns
• Order
• Product
• Customer
• Address
• Quantity
• UnitPrice
◆ Primary key is {Order, Product}

Functional Dependencies
• Each order is for a single customer: {Order} → {Customer}
• Each customer has a single address: {Customer} → {Address}
• Each product has a single price: {Product} → {UnitPrice}
• FD’s 1 and 2 are transitive: {Order} → {Address}

Example – FD Diagram
1NF
  R: Order, Product, Customer, Address, Quantity, UnitPrice

Normalization to 2NF
◆ Remember 2nd normal form means no partial dependencies on the key. But we have:
  {Order} → {Customer, Address}
  {Product} → {UnitPrice}
  And a primary key of: {Order, Product}
• So to get rid of the first FD we project over:
  {Order, Customer, Address}
  and
  {Order, Product, Quantity, UnitPrice}

Normalization to 2NF
1NF
  R: Order, Product, Customer, Address, Quantity, UnitPrice
is projected into
  R1: Order, Customer, Address
  R2: Order, Product, Quantity, UnitPrice

Normalization to 2NF
◆ R1 is now in 2NF, but there is still a partial FD in R2:
  {Product} → {UnitPrice}
  R2: Order, Product, Quantity, UnitPrice
• To remove this we project over:
  {Product, UnitPrice} and {Order, Product, Quantity}

Normalization to 2NF
1NF
  R2: Order, Product, Quantity, UnitPrice
  R1: Order, Customer, Address
2NF
  R3: Product, UnitPrice
  R4: Order, Product, Quantity

Now let’s go 3NF…
• R has now been split into 3 relations - R1, R3, and R4… but R1 has a transitive FD on its key…
  {Order} → {Customer} → {Address}
• To remove this problem we project R1 over:
  {Order, Customer} and {Customer, Address}

So more chopping…
2NF
  R1: Order, Customer, Address
3NF
  R5: Order, Customer
  R6: Customer, Address

Let’s summarize that:
• 1NF:
  {Order, Product, Customer, Address, Quantity, UnitPrice}
• 2NF:
  {Order, Customer, Address}
  {Product, UnitPrice}
  {Order, Product, Quantity}
• 3NF:
  {Product, UnitPrice}
  {Order, Product, Quantity}
  {Order, Customer}
  {Customer, Address}

So this…
0NF
  R: Order, Product, Customer, Address, Quantity, UnitPrice

has become this…
3NF
  Prices: Product, UnitPrice
  Amounts: Order, Product, Quantity
  Purchase: Order, Customer
  Details: Customer, Address
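The decomposition above can be replayed on sample data: project the original relation R onto the four 3NF relations, then join them back and confirm that nothing was lost. The rows are made up for illustration; the relation names follow the final slide. The re-join shown here relies on the functional dependencies {Product} → {UnitPrice}, {Order} → {Customer}, and {Customer} → {Address} holding in the data.

```python
def project(rows, cols):
    """Project rows onto cols, dropping duplicate tuples (set semantics)."""
    return {tuple(row[c] for c in cols) for row in rows}

ALL = ["Order", "Product", "Customer", "Address", "Quantity", "UnitPrice"]

# Hypothetical order lines; customer, address, and price values repeat.
R = [
    dict(zip(ALL, (1, "pen", "An", "Hanoi", 3, 2))),
    dict(zip(ALL, (1, "ink", "An", "Hanoi", 1, 5))),
    dict(zip(ALL, (2, "pen", "Binh", "Hue", 7, 2))),
    dict(zip(ALL, (3, "ink", "An", "Hanoi", 2, 5))),
]

# The four 3NF relations.
prices = project(R, ["Product", "UnitPrice"])
amounts = project(R, ["Order", "Product", "Quantity"])
purchase = project(R, ["Order", "Customer"])
details = project(R, ["Customer", "Address"])

def rejoin(prices, amounts, purchase, details):
    """Natural join of the four relations back into the shape of R."""
    # Each dict lookup is valid because the left column determines the right.
    price_of = dict(prices)
    customer_of = dict(purchase)
    address_of = dict(details)
    return {
        (order, product, customer_of[order],
         address_of[customer_of[order]], qty, price_of[product])
        for order, product, qty in amounts
    }

restored = rejoin(prices, amounts, purchase, details)
```

In R the address "Hanoi" is stored three times; in Details it is stored once, which is the redundancy that normalization removes.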
“Register for course” use case
• Make the E-R diagram from the previous step for “Register for course” use case to become:
  • The first normal form
  • The second normal form
  • The third normal form

[Diagram: dm E-R Modeling]
  Schedule: *PK scheduleID, *PK courseID, day, teachingPeriod; «PK» PK_Schedule(, )
  CourseRegistration: *PK courseID, *PK studentID, studentName, registeredDate; «PK» PK_CourseRegistration(, )
  Course: *PK courseID, subjectID, description, startDate, endDate, location, lecturer; «PK» PK_Course()
  StudyHistory: *PK historyNo, courseID, studentID, studentName, result, pass; «PK» PK_StudyHistory()
  Subject: *PK subjectID, subjectName, goal, numberOfCredit, description; «PK» PK_Subject()
  Prerequisite: *PK prerequisiteID, * subjectID, prerequisiteSubjectID, level, description; «PK» PK_Prerequisite()
1NF
[Diagram: dm Normalization 1]

2NF
[Diagram: dm Normalization 2]

Entities appearing across the two diagrams: CourseRegistration, Course (with lecturerID), Schedule, StudyHistory, Subject, Prerequisite, Lecturer (*PK lecturerID, lecturerName, educationalBackground, phone, email, address), and Student (*PK studentID, studentName, phone, address, email)
3NF
[Diagram: dm Normalization 3]
  CourseRegistration: *PK courseID, *PK studentID, registeredDate; «PK» PK_CourseRegistration(, )
  Course: *PK courseID, subjectID, description, startDate, endDate, location, lecturerID; «PK» PK_Course()
  Schedule: *PK scheduleID, *PK courseID, day, teachingPeriod; «PK» PK_Schedule(, )
  StudyHistory: *PK historyNo, courseID, studentID, result, pass; «PK» PK_StudyHistory()
  Student: *PK studentID, studentName, phone, address, email; «PK» PK_Student()
  Prerequisite: *PK prerequisiteID, * subjectID, prerequisiteSubjectID, level, description; «PK» PK_Prerequisite()
  Subject: *PK subjectID, subjectName, goal, numberOfCredit, description; «PK» PK_Subject()
  Lecturer: *PK lecturerID, lecturerName, educationalBackground, phone, email, address; «PK» PK_Lecturer()

Question?
