04 Data Normalization and Erd 4-4-21

The document discusses data normalization, which is the process of organizing data in a database to reduce data redundancy and improve data integrity. It describes the various normal forms including 1NF, 2NF, 3NF, BCNF, and 4NF. The document also includes a case study demonstrating how to normalize a sample student table from 1NF to 3NF through removing repeating groups and dependencies.

Notes on Data Normalization

REVIEW ON DATA NORMALIZATION

Data Normalization is the process of efficiently organizing data in a database by converting complex data into SIMPLE and STABLE data structures. It is a technique for producing a set of tables with desirable properties that support the requirements of a user or company.

Steps to Normalize the Data

1. FIRST NORMAL FORM (1NF) - Any repeating groups have been removed, so that there is a single value at the intersection of each row and column of the table:
• Eliminate duplicate columns from the same table.
• Create separate tables for each group of related data and identify each row with a unique column or set of columns (using the primary key).

2. SECOND NORMAL FORM (2NF) - Any partial functional dependencies have been removed:
• Meet all the requirements of the first normal form.
• Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
• Create relationships between these new tables and their predecessors through the use of foreign keys.

3. THIRD NORMAL FORM (3NF) - Any transitive dependencies have been removed:
• Meet all the requirements of the second normal form.
• Remove columns that do not depend directly on the primary key.

4. BOYCE-CODD NORMAL FORM (BCNF) OR 3.5 NORMAL FORM (3.5NF) - Is an extension to the 3NF for the special case where:
• Meet all the requirements of the third normal form.
• Every determinant must be a candidate key.
• There are at least two candidate keys in the table.
• All the candidate keys are composite keys.
• There is an overlapping column in the candidate keys.

5. FOURTH NORMAL FORM (4NF) - Any non-trivial multi-valued dependencies have been removed:
• Meet all the requirements of the third normal form (strictly, of BCNF).
• Keep at most one multi-valued dependency per table, so that the relation has no non-trivial multi-valued dependencies.

DEFINITION OF TERMS

Entity.
- Is a table or a file.

Record.
- Is a row in the table or a tuple.

Attribute.
- Is a column in the table or a property of an entity or a field.

Repeating Group.
- Is an attribute with multiple entries in a single record of the table.

Functional Dependency.
- Exists when the value of one attribute (the determinant) determines the value of one or more other attributes.
Example:
EMP_NUM -> LAST_NAME, FIRST_NAME

Multiple-valued Dependency.
- Occurs when there is a many-to-many (M:N) relationship between 2 columns in a table.

Relationship.
- Is the association between two or more entities.

Identifier.
- Is the key attribute that identifies the record in an entity.

Primary Key.
- Is an identifier that uniquely identifies each record in the table. The key may consist of a single attribute or multiple attributes in combination.

Foreign Key.
- Is a field in a relational table that matches the primary key column of another table. The foreign key can be used to cross-reference tables.

Candidate Key.
- Is a combination of attributes that can be uniquely used to identify a database record without any extraneous data. Each table may have one or more candidate keys. One of these candidate keys is selected as the table primary key.
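
The functional dependency defined above can be tested against sample rows. The following is a minimal Python sketch and is not part of the original notes: the helper name `holds` and the employee rows are hypothetical, chosen only to illustrate EMP_NUM -> LAST_NAME, FIRST_NAME.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data (not from the notes).
employees = [
    {"EMP_NUM": 101, "LAST_NAME": "Cruz", "FIRST_NAME": "Ana"},
    {"EMP_NUM": 102, "LAST_NAME": "Cruz", "FIRST_NAME": "Ben"},
    {"EMP_NUM": 101, "LAST_NAME": "Cruz", "FIRST_NAME": "Ana"},
]

fd_ok = holds(employees, ["EMP_NUM"], ["LAST_NAME", "FIRST_NAME"])
reverse_ok = holds(employees, ["LAST_NAME"], ["EMP_NUM"])
print(fd_ok, reverse_ok)  # True False
```

Sample rows can only refute a dependency, never prove it. Whether EMP_NUM really determines the name is a rule of the business, so a check like this is best read as a sanity test on existing data.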

Prepared by: May S. Cuaycong, MSCS Printed: 04/04/21

CASE STUDY 1:

Convert the STUDENT table to third normal form. In this table, STUDENT_NUM determines STUDENT_NAME, NUM_CREDITS, and ADVISOR_NUM; ADVISOR_NUM determines ADVISOR_NAME; and COURSE_NUM determines DESCRIPTION. The combination of a STUDENT_NUM and COURSE_NUM determines GRADE.

The data design structure is

STUDENT (STUDENT_NUM, STUDENT_NAME, NUM_CREDITS, ADVISOR_NUM, ADVISOR_NAME, (COURSE_NUM, DESCRIPTION, GRADE))

Now, complete the normalization process from 1NF to 3NF.

ANSWER:

STEP 1. USING 1NF TECHNIQUE – remove the repeating group to convert the table to its first normal form.

STUDENT (STUDENT_NUM, STUDENT_NAME, NUM_CREDITS, ADVISOR_NUM, ADVISOR_NAME, COURSE_NUM, DESCRIPTION, GRADE)

The STUDENT table is now in first normal form because it has no repeating groups. Its primary key is the combination of STUDENT_NUM and COURSE_NUM. It is not, however, in second normal form because STUDENT_NAME is dependent only on STUDENT_NUM, which is only a portion of the primary key.

STEP 2. USING 2NF TECHNIQUE – convert the STUDENT table to second normal form. First, for each subset of the primary key, start a table with that subset as its key, yielding the following:

(STUDENT_NUM,
(COURSE_NUM,
(STUDENT_NUM, COURSE_NUM,

Next, place the rest of the columns with the smallest collection of columns on which they depend and identify the key attribute of each group, as follows:

(STUDENT_NUM, STUDENT_NAME, NUM_CREDITS, ADVISOR_NUM, ADVISOR_NAME)
(COURSE_NUM, DESCRIPTION)
(STUDENT_NUM, COURSE_NUM, GRADE)

Finally, assign names to each of the new tables:

STUDENT (STUDENT_NUM, STUDENT_NAME, NUM_CREDITS, ADVISOR_NUM, ADVISOR_NAME)
COURSE (COURSE_NUM, DESCRIPTION)
STUDENT_COURSE (STUDENT_NUM, COURSE_NUM, GRADE)

Although these tables are all in second normal form, the COURSE and STUDENT_COURSE tables are also in third normal form. The STUDENT table is not in third normal form, however, because it contains a determinant (ADVISOR_NUM) that is not a candidate key.

STEP 3. USING 3NF TECHNIQUE – convert the STUDENT table to third normal form by removing the column that depends on the determinant ADVISOR_NUM, placing it in a separate table, and identifying the key/s for each table, as follows:

(STUDENT_NUM, STUDENT_NAME, NUM_CREDITS, ADVISOR_NUM)
(ADVISOR_NUM, ADVISOR_NAME)

Then name the tables and put the entire collection together, as follows:

STUDENT (STUDENT_NUM, STUDENT_NAME, NUM_CREDITS, ADVISOR_NUM)
ADVISOR (ADVISOR_NUM, ADVISOR_NAME)
COURSE (COURSE_NUM, DESCRIPTION)
STUDENT_COURSE (STUDENT_NUM, COURSE_NUM, GRADE)
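
The decomposition in this case study can be tried out on data. The sketch below, in Python, is an illustration only and not the author's method: it projects a few hypothetical rows of the 1NF STUDENT table onto the four 3NF tables (STUDENT, ADVISOR, COURSE, STUDENT_COURSE) and then rejoins them to confirm that nothing was lost. The helper names `project` and `as_set` and all sample values are assumptions.

```python
def project(rows, columns):
    """Relational projection: keep only `columns` and drop duplicate rows."""
    unique = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in sorted(unique)]


# Hypothetical rows of the 1NF STUDENT table (one row per student and course).
FLAT_COLUMNS = ["STUDENT_NUM", "STUDENT_NAME", "NUM_CREDITS", "ADVISOR_NUM",
                "ADVISOR_NAME", "COURSE_NUM", "DESCRIPTION", "GRADE"]
flat = [dict(zip(FLAT_COLUMNS, values)) for values in [
    (1, "Reyes", 18, 10, "Santos", "CS101", "Programming", "A"),
    (1, "Reyes", 18, 10, "Santos", "CS102", "Databases", "B"),
    (2, "Lim", 15, 10, "Santos", "CS101", "Programming", "B"),
]]

# The four 3NF tables from the case study.
student = project(flat, ["STUDENT_NUM", "STUDENT_NAME", "NUM_CREDITS", "ADVISOR_NUM"])
advisor = project(flat, ["ADVISOR_NUM", "ADVISOR_NAME"])
course = project(flat, ["COURSE_NUM", "DESCRIPTION"])
student_course = project(flat, ["STUDENT_NUM", "COURSE_NUM", "GRADE"])

# Rejoin along the keys to confirm that no information was lost.
students_by_num = {s["STUDENT_NUM"]: s for s in student}
advisors_by_num = {a["ADVISOR_NUM"]: a for a in advisor}
courses_by_num = {c["COURSE_NUM"]: c for c in course}

rebuilt = []
for sc in student_course:
    s = students_by_num[sc["STUDENT_NUM"]]
    rebuilt.append({**s, **advisors_by_num[s["ADVISOR_NUM"]],
                    **courses_by_num[sc["COURSE_NUM"]], **sc})


def as_set(rows):
    """Order-independent view of a list of rows, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


lossless = as_set(rebuilt) == as_set(flat)
print(len(student), len(advisor), len(course), len(student_course), lossless)
# 2 1 2 3 True
```

In the flat table the advisor name "Santos" is stored three times; after decomposition it is stored once, which is the redundancy that 3NF is meant to remove.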

Review Notes on Data Administration

REVIEW ON ENTITY RELATIONSHIP DIAGRAM

Data modeling – a technique for organizing and documenting a system’s data. Sometimes called database modeling.

Entity relationship diagram (ER-D) – a data model utilizing several notations to depict data in terms of the entities and relationships described by that data.

Two approaches of ER-D

1. Conceptual ER-D – representing the data model in the general context of the database.
2. Fully-Attributed ER-D – representing the data model in all the details of the attributes, relationships, and identifiers of the entities in the database.

Two types of ER-D Notation

1. Chen’s Notation
2. Crow’s Foot Notation

Concepts of ER-D

Entity – a class of persons, places, objects, events, or concepts about which we need to capture and store data. Named by a singular noun.

Example:

STUDENT

Attribute – a descriptive property or characteristic of an entity. Synonyms include element, property, and field.

Compound attribute – an attribute that consists of other attributes. Synonyms in different data modeling languages are numerous: concatenated attribute, composite attribute, and data structure.

Example:

STUDENT
STUDENT_ID
NAME
    First_Name
    Middle_Name
    Last_Name
ADDRESS
    Street
    City
    Province
AGE

Data type – a property of an attribute that identifies what type of data can be stored in that attribute.

Domain – a property of an attribute that defines what values an attribute can legitimately take on.

Example:

Data Type   Domain                                 Examples
NUMBER      For integers, specify the range.       {10-99}
DATE        Variation on the MMDDYYYY format.      MMDDYYYY
TEXT        Maximum size of attribute. Actual      Text(30)
            values usually infinite; however,
            users may specify certain narrative
            restrictions.

Default value – the value that will be recorded if a value is not specified by the user.

Key – an attribute, or a group of attributes, that assumes a unique value for each entity instance. It is sometimes called an identifier.

Example:

STUDENT
STUDENT_ID (Primary Key)
NAME
    First_Name
    Middle_Name
    Last_Name
ADDRESS
    Street
    City
    Province
AGE
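
The ideas of data type, domain, and default value can be shown together in a small Python sketch. This is an illustration only and not part of the original notes: the `Attribute` class, its `store` method, and the default of 18 are hypothetical, while the range {10-99} and the size Text(30) come from the domain examples above.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: type                # what type of data can be stored
    domain: Callable[[Any], bool]  # which values are legitimate
    default: Any = None            # recorded when the user gives no value

    def store(self, value=None):
        """Return the value to record, after applying the default and both checks."""
        if value is None:
            value = self.default
        if not isinstance(value, self.data_type):
            raise TypeError(f"{self.name}: expected {self.data_type.__name__}")
        if not self.domain(value):
            raise ValueError(f"{self.name}: {value!r} is outside the domain")
        return value


# AGE is a NUMBER with domain {10-99}; the default of 18 is an assumption.
age = Attribute("AGE", int, lambda v: 10 <= v <= 99, default=18)
# First_Name is TEXT with a maximum size of 30 characters.
first_name = Attribute("First_Name", str, lambda v: len(v) <= 30)

print(age.store())             # 18 (default value applied)
print(age.store(42))           # 42
print(first_name.store("Ana")) # Ana
```

Calling `age.store(100)` raises `ValueError` because 100 falls outside the domain, and `age.store("ten")` raises `TypeError` because the data type does not match.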

Relationship – a natural business association that exists between one or more entities.

Cardinality – the minimum and maximum number of occurrences of one entity that may be related to a single occurrence of the other entity.

Example:

STUDENT --- enrols --- CURRICULUM

Cardinality Notations

Cardinality                       Minimum   Maximum
Exactly one (one and only one)    1         1
Zero or one                       0         1
One or many                       1         >1
Zero, one, or more                0         >1
More than one                     >1        >1

Degree – the number of entities that participate in the relationship.

A relationship between two entities is called a binary relationship.

Example: [diagram]

A relationship between three entities is called a 3-ary or ternary relationship.

Example: [diagram]

A relationship between four entities is called a 4-ary or quaternary relationship.

Example: [diagram]

A relationship between different instances of the same entity is called a recursive relationship.

Example: [diagram]
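
Cardinality can also be observed in data by counting, for each occurrence of one entity, how many occurrences of the other entity it is related to. The Python sketch below is an illustration only and not part of the original notes: it uses the STUDENT enrols CURRICULUM relationship, and the function name `cardinality` and all sample values are hypothetical.

```python
from collections import Counter


def cardinality(instances, related_to):
    """Return (minimum, maximum) related occurrences per instance.

    instances:  every occurrence of one entity
    related_to: one entry per relationship occurrence, naming the instance it involves
    """
    counts = Counter(related_to)  # instances that never appear count as 0
    per_instance = [counts[i] for i in instances]
    return min(per_instance), max(per_instance)


# Hypothetical occurrences of STUDENT, CURRICULUM, and the "enrols" relationship.
students = ["S1", "S2", "S3"]
curricula = ["BSCS", "BSIT", "BSIS"]
enrols = [("S1", "BSCS"), ("S2", "BSCS"), ("S3", "BSIT")]

# Curricula per student, then students per curriculum.
student_side = cardinality(students, [s for s, _ in enrols])
curriculum_side = cardinality(curricula, [c for _, c in enrols])
print(student_side, curriculum_side)  # (1, 1) (0, 2)
```

In this sample every student enrols in exactly one curriculum, while a curriculum has zero, one, or more students. Counts taken from data only describe the current rows; the cardinality drawn on an ER-D states the business rule that all future data must follow.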


CASE STUDY 2:

Given the data design structure of the database below, draw the following:
1. Conceptual ER-D using Chen’s Notation
2. Fully-attributed ER-D using Chen’s Notation
3. Conceptual ER-D using Crow’s Foot Notation
4. Fully-attributed ER-D using Crow’s Foot Notation

STUDENT (STUDENT_NUM, STUDENT_NAME, NUM_CREDITS, ADVISOR_NUM)
ADVISOR (ADVISOR_NUM, ADVISOR_NAME)
COURSE (COURSE_NUM, DESCRIPTION)
STUDENT_COURSE (STUDENT_NUM, COURSE_NUM, GRADE)

CASE STUDY 3:

Given the entities found in the EMPLOYEE database, analyze and prepare the following database model requirements:
1. Draw the Data Model using the Conceptual ER-D in Chen’s Notation.
2. Draw the Data Model using the Fully-attributed ER-D in Chen’s Notation.
3. Draw the Data Model using the Conceptual ER-D in Crow’s Foot Notation.
4. Draw the Data Model using the Fully-attributed ER-D in Crow’s Foot Notation.

STAFF TABLE (alias SN)

staffno     Staff Number (ascending index, PK), 4 numeric
name        Staff Full Name, 50 characters
position    Staff Position, criteria set (Manager, Staff, Clerk), 10 characters
salary      Staff Salary, 8 numeric with 2 decimal

BRANCH TABLE (alias SB)

staffno     Staff Number (ascending index, FK), 4 numeric
branchno    Branch Number (ascending index, FK), 4 characters

ADDRESS TABLE (alias SA)

branchno    Branch Number (ascending index, PK), 4 characters
address     Branch Address, 40 characters

TELEPHONE TABLE (alias ST)

branchno    Branch Number (ascending index, FK), 4 characters
telephone   Branch Telephone Number, 15 characters


ANSWER:
1. Conceptual ER-D using Chen’s Notation

STAFF --- works --- BRANCH --- has --- ADDRESS
                      |
                     has
                      |
                  TELEPHONE

2. Fully-attributed ER-D using Chen’s Notation

STAFF --- works --- BRANCH --- has --- ADDRESS
                      |
                     has
                      |
                  TELEPHONE

Attributes attached to each entity:

STAFF       STAFFNO, NAME, POSITION, SALARY
BRANCH      STAFFNO, BRANCHNO
ADDRESS     BRANCHNO, ADDRESS
TELEPHONE   BRANCHNO, TELEPHONE


3. Conceptual ER-D using Crow’s Foot Notation

STAFF --- works --- BRANCH --- has --- ADDRESS
                      |
                     has
                      |
                  TELEPHONE

4. Fully-attributed ER-D using Crow’s Foot Notation

STAFF --- works --- BRANCH --- has --- ADDRESS
                      |
                     has
                      |
                  TELEPHONE

STAFF             BRANCH            ADDRESS           TELEPHONE
Staffno (PK)      Staffno (FK)      Branchno (PK)     Branchno (FK)
Name              Branchno (FK)     Address           Telephone
Position
Salary

SAMPLE DATA DICTIONARY

TABLE NAME            ATTRIBUTE   CONTENTS                                TYPE      FORMAT      RANGE/CRITERIA SET      REQUIRED (y/n)   INDEX (PK/FK)
STAFF (alias SN)      staffno     Staff ID number                         char      XX-9999                             y                PK
                      name        Full name of employee                   char      X(45)                               y
                      position    Employee’s position                     char      X(7)        Manager, Staff, Clerk   y
                      salary      Employee’s net earnings for the month   numeric   99,999.99   5,000-50,000            n
BRANCH (alias SB)     staffno     Staff ID number                         char      XX-9999                             y                FK
                      branchno    Branch number                           char      XX-99                               y                FK
ADDRESS (alias SA)    branchno    Branch number                           char      XX-99                               y                PK
                      address     Branch address                          char      X(40)                               y
TELEPHONE (alias ST)  branchno    Branch number                           char      XX-99                               y                FK
                      telephone   Branch telephone number                 char      X(15)                               n
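
A data dictionary like the one above maps fairly directly onto table definitions. The sketch below uses Python's built-in `sqlite3` module and an in-memory database as one possible illustration; it is not part of the original notes. The criteria set and the salary range become CHECK constraints and the "required" column becomes NOT NULL. Which table each foreign key refers to is an assumption (staffno to STAFF, branchno to ADDRESS), as are the sample values and the helper `rejected`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE staff (
    staffno  TEXT PRIMARY KEY,
    name     TEXT NOT NULL,
    position TEXT NOT NULL CHECK (position IN ('Manager', 'Staff', 'Clerk')),
    salary   NUMERIC CHECK (salary BETWEEN 5000 AND 50000)
);
CREATE TABLE address (
    branchno TEXT PRIMARY KEY,
    address  TEXT NOT NULL
);
CREATE TABLE branch (
    staffno  TEXT NOT NULL REFERENCES staff (staffno),      -- assumed FK target
    branchno TEXT NOT NULL REFERENCES address (branchno)    -- assumed FK target
);
CREATE TABLE telephone (
    branchno  TEXT NOT NULL REFERENCES address (branchno),  -- assumed FK target
    telephone TEXT
);
""")

# Hypothetical rows that satisfy every rule in the dictionary.
conn.execute("INSERT INTO staff VALUES ('SN-0001', 'Ana Cruz', 'Clerk', 12000)")
conn.execute("INSERT INTO address VALUES ('SB-01', '1 Main St')")
conn.execute("INSERT INTO branch VALUES ('SN-0001', 'SB-01')")


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


bad_position = rejected("INSERT INTO staff VALUES ('SN-0002', 'Ben Lim', 'Janitor', 12000)")
bad_salary = rejected("INSERT INTO staff VALUES ('SN-0003', 'Cy Tan', 'Staff', 99999)")
bad_branch = rejected("INSERT INTO branch VALUES ('SN-0001', 'SB-99')")
print(bad_position, bad_salary, bad_branch)  # True True True
```

The FORMAT column (for example XX-9999) is not enforced here; it could be added as a further CHECK constraint if the exact pattern matters.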
