Assignment 1 - Narrative

The document describes the normalization of inventory, sales, customer, and discount data from multiple flat files into a relational database schema. Key attributes were identified and used to establish relationships between tables using primary and foreign keys. Column headers were added to organize the data. The schema reduces redundancy and supports data independence while meeting data curation goals.

Uploaded by

Ashish Kumar


INVENTORY DATA FILE -

MDS_Exercise1_FileA.txt is a flat data file that contains inventory data in space-delimited format.
The file does not have column headings (attributes), and a formatting issue causes some data to
overlap with data in other columns. I added the following column headings to identify
the file's data:
1. ID
2. VIN
3. YEAR
4. MAKE
5. MODEL
6. VERSION
7. DRIVE_TYPE
8. COLOR
9. BODY_STYLE
10. FUEL
12. MSRP

The file contains duplicate data in row 2, for example “S 2.0L 4WD 4WD”, and has some
space-delimiter issues as well. The following data is shared between Inventory and Sales:
1. VIN
2. Model
3. Year
4. Color
5. Fuel
6. MSRP

Question - How did you decide to represent the data in the way that you did?
Answer - I analyzed the Inventory and Sales data, identified similar data and their column
headings, and based on that observation added column headings to Inventory. The Inventory data was
not organized, so I compared values within the same column and across other columns to identify
and organize the data in the table.

Question - Did you leave out any information? If so, why?

Answer - No, I included all data after organizing it.

Question - Why did you choose certain things as attributes? As keys?

Answer - In Inventory I selected the attribute VIN as the primary key because VIN is unique and is
present in both Inventory and Sales. The relation between Sales and Inventory can be established
through the VIN primary key.
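
The choice of VIN as primary key can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The column types, the reduced column set, and the sample values are assumptions made for illustration; they are not taken from the Inventory_Data sheet.

```python
import sqlite3


def create_inventory(conn):
    # Column types are assumed; only a subset of the Inventory columns is shown.
    conn.execute(
        """
        CREATE TABLE INVENTORY (
            VIN   TEXT PRIMARY KEY,
            YEAR  INTEGER,
            MAKE  TEXT,
            MODEL TEXT,
            MSRP  REAL
        )
        """
    )


def insert_vehicle(conn, vin, year, make, model, msrp):
    """Return True if the row was stored, False if the VIN already exists."""
    try:
        conn.execute(
            "INSERT INTO INVENTORY VALUES (?, ?, ?, ?, ?)",
            (vin, year, make, model, msrp),
        )
        return True
    except sqlite3.IntegrityError:
        # The primary key constraint rejects a second row with the same VIN.
        return False


conn = sqlite3.connect(":memory:")
create_inventory(conn)

# Hypothetical sample values, not rows from MDS_Exercise1_FileA.txt.
first = insert_vehicle(conn, "VIN-EXAMPLE-0001", 2015, "ExampleMake", "ExampleModel", 20000.0)
duplicate = insert_vehicle(conn, "VIN-EXAMPLE-0001", 2016, "OtherMake", "OtherModel", 25000.0)
row_count = conn.execute("SELECT COUNT(*) FROM INVENTORY").fetchone()[0]
```

Because the second insert reuses the same VIN, the database refuses it, which is the uniqueness property the key relies on.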

Question - What were the hardest decisions you had to make in this design process?
Answer - I included all data as it is after identifying the column headings and organizing the
data, so I did not have to make any hard decisions about adding or leaving out a column.
Question - How does your schema design support data independence?
Answer - The schema is designed in the relational model, and it supports adding, deleting, and
updating data columns without impacting existing data in the table. The current schema design
supports both logical and physical data independence.
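
As a rough illustration of the logical side of this claim, the sketch below adds a column to a table and checks that an existing query returns the same rows before and after. It uses Python's sqlite3 module; the WARRANTY_YEARS column and the sample row are hypothetical and do not appear in the assignment data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE INVENTORY (VIN TEXT PRIMARY KEY, MODEL TEXT, MSRP REAL)")
# Hypothetical sample row for illustration only.
conn.execute("INSERT INTO INVENTORY VALUES ('VIN-EXAMPLE-0001', 'ExampleModel', 20000.0)")


def model_and_msrp(conn):
    # An "existing" query that names its columns explicitly.
    return conn.execute(
        "SELECT VIN, MODEL, MSRP FROM INVENTORY ORDER BY VIN"
    ).fetchall()


before = model_and_msrp(conn)

# Add a new (hypothetical) column; existing rows get NULL for it.
conn.execute("ALTER TABLE INVENTORY ADD COLUMN WARRANTY_YEARS INTEGER")

after = model_and_msrp(conn)
new_value = conn.execute("SELECT WARRANTY_YEARS FROM INVENTORY").fetchone()[0]
```

Queries that list their columns by name keep working after the change, while a query written as SELECT * would see the extra column.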
Question - How may your schema design support the overarching goals of data curation (revisit
objectives and activities of Week 1)?
Answer - The schema is designed in the relational model, and a relational schema supports the
overarching goals of data curation.
Question - Which curation activities could enhance or sustain the database for future discovery
and use for new purposes? What additional activities would you recommend?
Answer - Data curation activities such as documentation, data authentication, archiving, and
management will enhance and sustain the database for future discovery and for use for new
purposes.
NOTE - The Inventory table schema, attribute descriptions, and data are stored in the Excel sheet
Inventory_Data, and the ER diagram is in the ERDiagramAndSchema sheet.
SALES DATA FILE –
MDS_Exercise1_FileB.csv is a flat file that contains sales data in comma (“,”) separated format.
The file contains column headings (attributes), and the data formatting appears organized. The
Sales file contains data shared with the Inventory and Customer_relation data files.
The following data is shared with Inventory:
1. VIN
2. Model
3. Year
4. Color
5. Engine
6. MSRP
The following data is shared with Customer_relation:
1. LastName
2. FirstName
3. MI
4. Address
5. City
6. State
7. Country
Question - How did you decide to represent the data in the way that you did?
Answer - I analyzed the sales data, compared it with the Inventory and customer_relation data, and
identified similar data and their column headings in order to remove redundancy and
normalize the data. Based on that observation, I kept the data that is unique to sales and moved
the other common data to their respective tables, Inventory and customer_relation.
Question - Did you leave out any information? If so, why?
Answer - Yes. To normalize the data and avoid redundancy, I did not include some fields in Sales.
I excluded Model, Year, Color, and Engine because these fields are already in Inventory, and I
excluded LastName, FirstName, MI, Address, City, State, and Country because these fields are
already in customer_relation.
Question - Why did you choose certain things as attributes? As keys?
Answer - I added a new field, “CUST_ID”, as a foreign key to establish a relation with the
customer_relation table's primary key “CUST_ID”. I added VIN as a foreign key to establish a
relation with the Inventory table's primary key VIN.
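
A minimal sketch of these two foreign keys, again with Python's sqlite3 module, is shown here. The SALE_ID column, the column types, the reduced column sets, and all sample values are assumptions for illustration and are not taken from the Sales_Data sheet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE INVENTORY (VIN TEXT PRIMARY KEY, MODEL TEXT);
    CREATE TABLE CUSTOMER_RELATION (CUST_ID TEXT PRIMARY KEY, LAST_NAME TEXT);
    CREATE TABLE SALES (
        SALE_ID INTEGER PRIMARY KEY,  -- hypothetical surrogate key for a sale
        VIN     TEXT NOT NULL REFERENCES INVENTORY (VIN),
        CUST_ID TEXT NOT NULL REFERENCES CUSTOMER_RELATION (CUST_ID)
    );
    -- Hypothetical sample rows.
    INSERT INTO INVENTORY VALUES ('VIN-EXAMPLE-0001', 'ExampleModel');
    INSERT INTO CUSTOMER_RELATION VALUES ('C001', 'Doe');
    INSERT INTO SALES (VIN, CUST_ID) VALUES ('VIN-EXAMPLE-0001', 'C001');
    """
)

# The fields left out of SALES are recovered by joining on the two keys.
joined = conn.execute(
    """
    SELECT s.SALE_ID, i.MODEL, c.LAST_NAME
    FROM SALES AS s
    JOIN INVENTORY AS i ON i.VIN = s.VIN
    JOIN CUSTOMER_RELATION AS c ON c.CUST_ID = s.CUST_ID
    """
).fetchall()

# A sale that points at a VIN missing from INVENTORY is rejected.
try:
    conn.execute(
        "INSERT INTO SALES (VIN, CUST_ID) VALUES ('VIN-NOT-IN-INVENTORY', 'C001')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The join shows why Model and the customer name do not need to be stored in Sales: they can be looked up through VIN and CUST_ID when needed.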
Question - What were the hardest decisions you had to make in this design process?
Answer - I added a new field, “CUST_ID”, to establish the relation with the customer_relation
table and avoid using a composite key (FirstName + LastName). Two customers with the same first
name and last name would create a problem.
Question - How does your schema design support data independence?
Answer - The schema is designed in the relational model, and it supports adding, deleting, and
updating data columns without impacting existing data in the table. The current schema design
supports both logical and physical data independence.
Question - How may your schema design support the overarching goals of data curation (revisit
objectives and activities of Week 1)?
Answer - The schema is designed in the relational model, and a relational schema supports the
overarching goals of data curation.
Question - Which curation activities could enhance or sustain the database for future discovery
and use for new purposes? What additional activities would you recommend?
Answer - Data curation activities such as documentation, data authentication, archiving, and
management will enhance and sustain the database for future discovery and for use for new
purposes.
NOTE - The Sales table schema, attribute descriptions, and data are stored in the Excel sheet
Sales_Data, and the ER diagram is in the ERDiagramAndSchema sheet.
CUSTOMER_RELATION DATA FILE –
MDS_Exercise1_FileC.docx is an MS Word document file that contains customer_relation data in plain
text format. The file doesn't contain column headings to identify the data. It uses line breaks
and space delimiters to keep the data organized. After identifying the data, I added the following
column headings:
1. FIRST_NAME
2. LAST_NAME
3. MI
4. PROFESSION
5. ADDRESS
6. CITY
7. STATE
8. ZIP
9. COUNTRY
10. FINANCING
In the customer_relation data file, the following data is shared with the Sales data:
1. LastName
2. FirstName
3. MI
4. Address
5. City
6. State
7. Country
Question - How did you decide to represent the data in the way that you did?
Answer - I analyzed the customer_relation data, compared it with the sales data, and identified
similar data and their column headings to remove redundancy and normalize the data. Based on that
observation, I selected the data that is unique to customer_relation only.
Question - Did you leave out any information? If so, why?
Answer - No, I included all data after organizing it and added an additional field, CUST_ID, to
hold a unique customer ID and establish relations with other tables.
Question - Why did you choose certain things as attributes? As keys?
Answer - I added a new field, “CUST_ID”, as the primary key to establish the relation with the
sales table. I added CUST_ID because it will contain unique alphanumeric keys.
Question - What were the hardest decisions you had to make in this design process?
Answer - I added a new field, “CUST_ID”, to establish the relation with the sales table and avoid
using a composite key (FirstName + LastName). Customers with similar first and last names can
create a problem.
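
The problem with a name-based key can be shown with a small Python sketch. The customer records and the “C001”-style ID format below are hypothetical, chosen only to show two customers who share a first and last name.

```python
from itertools import count

# Hypothetical customers; two of them share the same first and last name.
customers = [
    {"FIRST_NAME": "Alex", "LAST_NAME": "Smith", "CITY": "Springfield"},
    {"FIRST_NAME": "Alex", "LAST_NAME": "Smith", "CITY": "Riverton"},
    {"FIRST_NAME": "Dana", "LAST_NAME": "Lee", "CITY": "Springfield"},
]

# Keyed by (FirstName, LastName): the second Alex Smith silently replaces the first.
by_name = {(c["FIRST_NAME"], c["LAST_NAME"]): c for c in customers}

# Keyed by a generated CUST_ID: every customer keeps a distinct entry.
next_id = count(1)
by_cust_id = {f"C{next(next_id):03d}": c for c in customers}
```

With the name pair as key, only two of the three customers survive; with a generated CUST_ID, all three remain distinguishable.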
Question - How does your schema design support data independence?
Answer - The schema is designed in the relational model, and it supports adding, deleting, and
updating data columns without impacting existing data in the table. The current schema design
supports both logical and physical data independence.
Question - How may your schema design support the overarching goals of data curation (revisit
objectives and activities of Week 1)?
Answer - The schema is designed in the relational model, and a relational schema supports the
overarching goals of data curation.
Question - Which curation activities could enhance or sustain the database for future discovery
and use for new purposes? What additional activities would you recommend?
Answer - Data curation activities such as documentation, data authentication, archiving, and
management will enhance and sustain the database for future discovery and for use for new
purposes.
NOTE - The Customer_Relation table schema, attribute descriptions, and data are stored in the
Excel sheet Customer_Relation_Data, and the ER diagram is in the ERDiagramAndSchema sheet.
Discount Details: I added a new DISCOUNT table to store discount details. This minimizes
errors when discount details must be updated or new discount parameters are added.
The primary key is DISCOUNT_ID, which is a foreign key in the Sales table. The associated
attributes, DISCOUNT_DISCRIPTION and DISCONT_AMOUNT, hold the information about
the kind of discount.
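
The benefit of a separate DISCOUNT table can be sketched with Python's sqlite3 module. Column names follow the spelling used in this narrative; the SALE_ID column, the column types, and the sample values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE DISCOUNT (
        DISCOUNT_ID          TEXT PRIMARY KEY,
        DISCOUNT_DISCRIPTION TEXT,
        DISCONT_AMOUNT       REAL
    );
    CREATE TABLE SALES (
        SALE_ID     INTEGER PRIMARY KEY,  -- hypothetical surrogate key
        VIN         TEXT,
        DISCOUNT_ID TEXT REFERENCES DISCOUNT (DISCOUNT_ID)
    );
    -- Hypothetical sample rows: two sales share one discount.
    INSERT INTO DISCOUNT VALUES ('D01', 'Example loyalty discount', 500.0);
    INSERT INTO SALES (VIN, DISCOUNT_ID) VALUES ('VIN-EXAMPLE-0001', 'D01');
    INSERT INTO SALES (VIN, DISCOUNT_ID) VALUES ('VIN-EXAMPLE-0002', 'D01');
    """
)


def discount_amounts(conn):
    # Look up the discount amount for every sale through DISCOUNT_ID.
    return conn.execute(
        """
        SELECT s.VIN, d.DISCONT_AMOUNT
        FROM SALES AS s
        JOIN DISCOUNT AS d ON d.DISCOUNT_ID = s.DISCOUNT_ID
        ORDER BY s.VIN
        """
    ).fetchall()


before = discount_amounts(conn)
# One update in DISCOUNT changes what every referencing sale sees.
conn.execute("UPDATE DISCOUNT SET DISCONT_AMOUNT = 750.0 WHERE DISCOUNT_ID = 'D01'")
after = discount_amounts(conn)
```

Because the amount is stored once, a single UPDATE is enough; no sales rows have to be edited, which is where the reduced risk of update errors comes from.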
NOTE - The Discount table schema, attribute descriptions, and data are stored in the Excel sheet
Discount_Data, and the ER diagram is in the ERDiagramAndSchema sheet.

Pros and Cons for the Schema -

Pros –

• Data redundancy is reduced.
• Primary and foreign keys for the tables allow easy access to relevant data.
• Provides a good level of abstraction in terms of representing the data being used by each
department.

Cons –

• Some of the data in the Customer Relation table is missing.
• Doesn't provide much extensibility in terms of the attributes of the vehicle.
