Module 3
Data and Business Intelligence
Bidgoli, MIS, 11th Edition. © 2023 Cengage. All Rights Reserved. May not be scanned, copied or duplicated,
or posted to a publicly accessible website, in whole or in part.
Learning Objectives
• Define a database and a database management system (DBMS)
• Explain logical database design and the relational database model
• Define the components of a DBMS
• Summarize recent trends in database design and use
• Explain the major components and functions of a data warehouse and
their use for business
• Describe the functions of a data mart
• Explain big data and its business applications
• Explain database marketing and its business applications
Databases (1 of 4)
• Database
• Collection of related data that can be stored in a central location or in
multiple locations
• Usually a group of files
• File
• Group of related records
• All files are integrated → information can be linked
• Record
• Group of related fields
• Data hierarchy
• Structure and organization of data, which involves fields, records, and files
Databases (2 of 4)
– For the above example, fields consist of social security number, student
name, and address
– All the fields storing information for Mary Smith, for instance, constitute a
record
– All three records make up the student file
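The field, record, and file levels of the data hierarchy can be sketched in Python. This is only an illustration: the class name `StudentRecord`, the placeholder values, and the two students other than Mary Smith are assumptions, not data from the exhibit.

```python
from dataclasses import dataclass, fields


@dataclass
class StudentRecord:
    # Each attribute below is one field of the record.
    ssn: str
    name: str
    address: str


# A file is a group of related records. All values are made-up placeholders.
student_file = [
    StudentRecord("000-00-0001", "Mary Smith", "placeholder address 1"),
    StudentRecord("000-00-0002", "Student B", "placeholder address 2"),
    StudentRecord("000-00-0003", "Student C", "placeholder address 3"),
]

# The field names shared by every record in the file.
field_names = [f.name for f in fields(StudentRecord)]
```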
Databases (3 of 4)
• Critical component of information systems
• Any type of analysis that’s done is based on data available in the database
• Database management system (DBMS)
• Software for creating, storing, maintaining, and accessing database files
• To make using databases more efficient
• “Flat files” system in the past
• Data wasn’t arranged in a hierarchy
• No relation among the “flat files”
• Same data could be stored in more than one file, creating data redundancy
Databases (4 of 4)
• “Flat files” system in the past
• Data redundancy takes up unnecessary storage space
• Data is not updated consistently in all files, resulting in conflicting
reports generated from these files
• Database advantages over a flat file system
• Generate more information from the same data
• Handle complex requests more easily
• Reduce data redundancy
• Reduce storage space
• Easily maintain relationships among data
• More sophisticated security measures
Exhibit 3.2
Interaction between the User, DBMS, and Database
i. The user issues a request
ii. DBMS searches the database
iii. DBMS returns the information to the user
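The three steps can be traced with Python's built-in `sqlite3` module, where SQLite plays the role of the DBMS. This is a minimal sketch: the `student` table, its columns, and the helper `handle_request` are assumptions made for illustration.

```python
import sqlite3

# SQLite acts as the DBMS; the in-memory database holds one sample row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student (student_id, name) VALUES (1, 'Mary Smith')")


def handle_request(student_id):
    # i. The user issues a request (here, a lookup by ID).
    # ii. The DBMS searches the database.
    row = conn.execute(
        "SELECT name FROM student WHERE student_id = ?", (student_id,)
    ).fetchone()
    # iii. The DBMS returns the information (or None if nothing matches).
    return row[0] if row else None


result = handle_request(1)
```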
Types of Data in a Database
• Internal data
• Collected from within an organization
• E.g. transaction records, sales records, and so forth
• Stored in the organization’s internal databases
• External data
• Comes from a variety of sources
• E.g. competitors, customers, suppliers, etc.
• Often stored in a data warehouse
Logical Database Design (1 of 2)
• Information is viewed in a database in two ways
• Physical view
• How data is stored on and retrieved from storage media such as
hard disks, CDs, etc.
• Only one physical view of data for each database
• Logical view
• How information appears to users
• How it can be organized and retrieved
• Depending on the user, there can be more than one logical view of
data
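One way to picture several logical views over a single stored table is with SQL views. The sketch below uses Python's `sqlite3` module; the table, the view names `mailing_view` and `advising_view`, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One stored table (SQLite decides how it is physically laid out).
conn.execute(
    "CREATE TABLE student "
    "(student_id INTEGER PRIMARY KEY, name TEXT, address TEXT, gpa REAL)"
)
conn.execute("INSERT INTO student VALUES (1, 'Mary Smith', 'placeholder address', 3.5)")

# Two logical views of the same data, each tailored to a different user.
conn.execute("CREATE VIEW mailing_view AS SELECT name, address FROM student")
conn.execute("CREATE VIEW advising_view AS SELECT student_id, name, gpa FROM student")

mailing_rows = conn.execute("SELECT * FROM mailing_view").fetchall()
advising_rows = conn.execute("SELECT * FROM advising_view").fetchall()
```

Both views read from the same stored rows, so a change to the `student` table shows up in each of them.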
Logical Database Design (2 of 2)
• Data model
• Determines how data is created, represented, organized, and
maintained
• Includes:
• Data structure
• Operations
• Integrity rules
Relational Model (1 of 5)
• Relational model
• Uses a two-dimensional table of rows and columns of data
• Rows are records and columns are fields (i.e., attributes)
• Data dictionary
• Stores definitions of each table and the fields in it for the logical
structure of a relational database
• Field name – e.g. student name, age, admission date, etc.
• Field data type – e.g. text, date, and number
• Default value – i.e. the value entered if none is supplied
• Validation rule – i.e. a rule that determines whether a value is valid
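The four data dictionary entries map roughly onto a table definition in SQL. The following sketch uses Python's `sqlite3` module; the column names, the `'active'` default, and the `age > 0` rule are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_name   TEXT NOT NULL,            -- field name and data type
        age            INTEGER CHECK (age > 0),  -- validation rule
        admission_date TEXT,
        status         TEXT DEFAULT 'active'     -- default value
    )
""")

# SQLite exposes its stored definitions through PRAGMA table_info.
dictionary = [
    (name, col_type, default)
    for _cid, name, col_type, _notnull, default, _pk
    in conn.execute("PRAGMA table_info(student)")
]

# The default is filled in when no status is supplied.
conn.execute("INSERT INTO student (student_name, age) VALUES ('Mary Smith', 20)")
default_status = conn.execute("SELECT status FROM student").fetchone()[0]

# The validation rule rejects an invalid age.
try:
    conn.execute("INSERT INTO student (student_name, age) VALUES ('Invalid', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```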
Relational Model (2 of 5)
• Primary key
• Unique identifier (e.g. student ID) for every record
• Foreign key
• Field in one table that is the primary key of another table
• Establishes relationships between tables (i.e. data can be linked and
retrieved across tables)
• Normalization
• Process to improve database efficiency
• Eliminates redundant data and ensures only related data is stored in a
table
• E.g. storing customer names in only one table
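A small sketch of these three ideas, using Python's `sqlite3` module. The `customer` and `orders` tables and their rows are hypothetical; the customer name is stored once and orders refer to it through a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Normalized design: the customer name lives in exactly one table.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
        amount      REAL
    )
""")

conn.execute("INSERT INTO customer VALUES (1, 'Customer A')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")
conn.execute("INSERT INTO orders VALUES (101, 1, 40.0)")

# An order that points at a nonexistent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (102, 999, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Two orders, but the name is still stored only once.
name_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```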
Relational Model (3 of 5)
• Data stored in a relational model is retrieved by using
operations that pick and combine data from one or more
tables
• Data retrieval operations
• Select: searches data in a table and retrieves records based on
certain criteria or conditions
• Project: pares down a table by eliminating columns (fields)
according to certain criteria
• Join: combines two tables based on a common field
Relational Model (4 of 5)
• Data retrieval examples
• Select operation
• Project operation
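The select and project operations can be expressed in SQL, shown here through Python's `sqlite3` module. The `student` table, its columns, and its rows are made-up sample data, not the table from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student "
    "(student_id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "Mary Smith", "MIS", 3.6),
        (2, "Student B", "ACC", 2.9),
        (3, "Student C", "MIS", 3.1),
    ],
)

# Select: keep only the rows (records) that meet a condition.
selected = conn.execute(
    "SELECT * FROM student WHERE gpa > 3.0 ORDER BY student_id"
).fetchall()

# Project: keep only some of the columns (fields).
projected = conn.execute(
    "SELECT student_id, name FROM student ORDER BY student_id"
).fetchall()
```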
Relational Model (5 of 5)
• Data retrieval example – join operation
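A join can be sketched the same way. In this hypothetical example, the `customer` and `invoice` tables share the common field `customer_id`; all names and amounts are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)", [(1, "Customer A"), (2, "Customer B")]
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(10, 1, 100.0), (11, 2, 50.0), (12, 1, 75.0)],
)

# Join: combine the two tables on their common field, customer_id.
joined = conn.execute("""
    SELECT invoice.invoice_id, customer.name, invoice.amount
    FROM invoice
    JOIN customer ON invoice.customer_id = customer.customer_id
    ORDER BY invoice.invoice_id
""").fetchall()
```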
Components of a DBMS
• DBMS software components
• Database engine
• Data definition
• Data manipulation
• Application generation
• Data administration
Database Engine
• Heart of DBMS software
• Responsible for data storage, manipulation, and retrieval
• Converts logical requests from users into their physical
equivalents (e.g. reports) by interacting with other
components of the DBMS
Data Definition
• Create and maintain the data dictionary
• Define the structure of files in a database
• Make changes to a database’s structure, such as:
• Adding fields
• Deleting fields
• Changing field size
• Changing data type
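As a rough illustration of a structural change, the sketch below adds a field to an existing table with `ALTER TABLE` through Python's `sqlite3` module. The table, the new `email` column, and the helper `column_names` are assumptions; other changes (deleting fields, changing sizes or types) vary by DBMS and are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")


def column_names(table):
    # Read the current structure back from SQLite's own definitions.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


before = column_names("student")
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")  # adding a field
after = column_names("student")
```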
Data Manipulation
• Add, delete, modify, and retrieve records from a database
• A query language is used:
• Structured Query Language (SQL)
• Standard fourth-generation query language used by many DBMS
packages
• Uses keywords (e.g. the SELECT statement) to specify actions to
take
• Query by example (QBE)
• Construct a statement made up of query forms
• Graphical interface
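The four data manipulation tasks correspond to the SQL keywords INSERT, UPDATE, DELETE, and SELECT. A minimal sketch with Python's `sqlite3` module follows; the table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)")

# Add records.
conn.execute("INSERT INTO student VALUES (1, 'Mary Smith', 'MIS')")
conn.execute("INSERT INTO student VALUES (2, 'Student B', 'ACC')")

# Modify a record.
conn.execute("UPDATE student SET major = 'FIN' WHERE student_id = 2")

# Delete a record.
conn.execute("DELETE FROM student WHERE student_id = 1")

# Retrieve records.
remaining = conn.execute("SELECT student_id, name, major FROM student").fetchall()
```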
Application Generation
• Design elements of an application using a database, such as:
• Data entry screens
• Interactive menus
• Interfaces with other programming languages
• Create a form or generate a report, for example
• Typically used by IT professionals and database administrators
Data Administration
• Used by IT professionals and database administrators for:
• Backup and recovery
• Security
• Change management
• Determine who has permission to perform: Create, read,
update, and delete (CRUD)
• Database administrator (DBA)
• Responsible for database design and management
• Can be an individual or department (for complex databases)
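The idea of assigning CRUD permissions can be sketched as a simple lookup table. The role names and their permission sets below are invented for illustration; a real DBMS would manage this through its own administration tools.

```python
# Hypothetical roles mapped to the CRUD actions each may perform.
PERMISSIONS = {
    "dba": {"create", "read", "update", "delete"},
    "clerk": {"create", "read", "update"},
    "analyst": {"read"},
}


def is_allowed(role, action):
    # Unknown roles get no permissions at all.
    return action in PERMISSIONS.get(role, set())
```

For example, `is_allowed("analyst", "delete")` evaluates to `False`, while `is_allowed("dba", "delete")` evaluates to `True`.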
Recent Trends in Database Design and Use
• Include:
• Data-driven Web sites
• Natural language processing (details will be covered in Chapter
13)
• Distributed databases
Data-Driven Web Sites
• Data-driven Web site
• Interface to a database
• Retrieves data and allows users to enter data
• i.e. provides dynamic content (requires no change to the HTML code of
the web page)
• Improves access to information
• User experience is more interactive
• Useful for:
• E-commerce sites that need frequent updates
• News sites that need regular updating of content
• Forums and discussion groups
• Subscription services, such as newsletters
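A minimal sketch of how a data-driven page might be produced: the page is generated from database rows each time it is requested, so new content needs no change to the page code. The `article` table, the headlines, and the function `render_page` are assumptions for illustration.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (article_id INTEGER PRIMARY KEY, headline TEXT)")
conn.execute("INSERT INTO article (headline) VALUES ('First headline')")


def render_page():
    # Build the HTML from whatever is in the database right now.
    rows = conn.execute("SELECT headline FROM article ORDER BY article_id").fetchall()
    items = "".join(f"<li>{escape(headline)}</li>" for (headline,) in rows)
    return f"<ul>{items}</ul>"


page_before = render_page()
# New content is added to the database only; render_page() is unchanged.
conn.execute("INSERT INTO article (headline) VALUES ('Second headline')")
page_after = render_page()
```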
Distributed Databases
• Distributed database
• Data is stored on multiple servers placed throughout an organization
(in contrast to central database for all users)
• Main reasons for choosing
• Minimizes the effects of computer failures
• Helps reduce communication costs for remote users
• Supports distributed processing
• Not limited by data’s physical location
• Security is a concern
• Multiple access points from inside and outside the organization
Data Warehouses
• Data warehouse
• Collection of data (from a variety of sources) used to support
decision-making applications and generate business intelligence
• Stores multidimensional data (i.e. hypercubes)
• Characteristics of data stored in data warehouse
• Subject oriented (focused on a specific area)
• Integrated (comes from different sources)
• Time variant (categorized based on time, i.e. historical)
• Type of data (captures aggregated data)
• Purpose (for analytical use)
Exhibit 3.6 A Data Warehouse Configuration
Four major components:
i. Input
ii. Extraction, transformation, and loading (i.e. ETL)
iii. Storage
iv. Output
Input
• Data comes from a variety of sources:
• External data sources
• Databases
• Transaction files
• ERP (enterprise resource planning) systems
• CRM (customer relationship management) systems
Extraction, Transformation, and Loading (ETL)
• Extraction
• Collecting data from a variety of sources
• Converting data into a format that can be used in transformation
processing
• Transformation processing
• Make sure data meets the data warehouse’s needs
• Loading
• Process of transferring data to the data warehouse
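The three ETL steps can be sketched as three small Python functions. Everything specific here is an assumption: the CSV text standing in for a CRM export, the rule that rows without an amount are dropped, and the `sales` table used as the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data, standing in for an export from a CRM system.
CRM_EXPORT = (
    "customer,amount,date\n"
    "Customer A, 100.50 ,2023-01-15\n"
    "Customer B,,2023-01-16\n"
)


def extract(text):
    # Extraction: collect the source data and parse it into rows.
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    # Transformation: clean values and drop rows the warehouse cannot use.
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # assumed rule: skip rows with a missing amount
        clean.append((row["customer"].strip(), float(amount), row["date"]))
    return clean


def load(conn, rows):
    # Loading: transfer the cleaned rows into the warehouse table.
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (customer TEXT, amount REAL, sale_date TEXT)")
load(warehouse, transform(extract(CRM_EXPORT)))
loaded = warehouse.execute("SELECT * FROM sales").fetchall()
```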
Storage
• Collected information is organized in a data warehouse as:
• Raw data (information in original form)
• Summary data (subtotals of various categories)
• Metadata (information about data)
Output (1 of 2)
• Data warehouse supports different types of analysis
• Generates reports for decision making
• Online analytical processing (OLAP)
• Generates business intelligence
• Uses multiple sources of information and provides
multidimensional analysis
• Hypercube (similar to multidimensional spreadsheet)
• Performs trend analysis
• Drill down and drill up features for accessing multilayer
information
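A toy hypercube helps show drill up, drill down, and slicing. In this sketch each cell is keyed by (product, region, quarter); the dimension names, the sales figures, and the helpers `roll_up` and `slice_cube` are all made up for illustration and do not reflect any particular OLAP tool.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "quarter")

# Each cell of the cube holds a sales figure for one combination of members.
cube = {
    ("Product X", "East", "Q1"): 100,
    ("Product X", "East", "Q2"): 120,
    ("Product X", "West", "Q1"): 80,
    ("Product Y", "East", "Q1"): 50,
    ("Product Y", "West", "Q2"): 70,
}


def roll_up(cube, keep):
    # Aggregate the cube, keeping only the listed dimensions.
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[i] for i in idx)] += value
    return dict(totals)


def slice_cube(cube, dimension, member):
    # Keep only the cells where one dimension has a fixed member.
    i = DIMENSIONS.index(dimension)
    return {key: value for key, value in cube.items() if key[i] == member}


by_product = roll_up(cube, ["product"])                   # drill up: summary level
by_product_region = roll_up(cube, ["product", "region"])  # drill down: more detail
q1_slice = slice_cube(cube, "quarter", "Q1")              # slice: one quarter only
```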
Exhibit 3.7 Slicing and Dicing Data
Image credit: OLAP.com
Output (2 of 2)
• Data-mining analysis
• Discover patterns and relationships
• Reports for decision making
• A data warehouse allows you to:
• Cross-reference segments of an organization’s operations for
comparison purposes
• Find patterns and trends that can’t be found with databases
• Analyze large amounts of historical data quickly
Data Marts (1 of 2)
• Data mart
• Smaller version of data warehouse
• Used by single department or function
• Advantages over data warehouses
• Users are targeted better as it is designed for a specific
department or division
• Faster data access because of smaller size
• Less expensive
• Easier to create because of its size and simplicity
Data Marts (2 of 2)
• Disadvantages
• More limited scope than data warehouses
• Consolidating information from different departments or
functional areas is more difficult
The Big Data Era (1 of 2)
• Big data: data so voluminous that conventional computing
methods cannot process and manage it efficiently
• Many technologies and applications have contributed to
its growth and popularity
• Mobile and wireless technology, the popularity of social
networks, etc.
The Big Data Era (2 of 2)
• Involves five dimensions known as 5 Vs
• Volume: Quantity of transactions
• Variety: Combination of structured and unstructured data
• Velocity: Speed with which data needs to be gathered and
processed
• Veracity: Trustworthiness and accuracy of the data
• Value: The value that the collected data brings to the
decision-making process
Database Marketing (1 of 2)
• Uses an organization's database of customers and potential
customers to promote products or services
• Main goal: use information within the database to implement
marketing strategies
• Increase profits
• Enhance competitiveness
Database Marketing (2 of 2)
• Tasks performed by successful database marketing campaigns
• Calculating customer lifetime value (CLTV)
• Conducting recency, frequency, and monetary analysis (RFM)
• Using different techniques to communicate effectively with
customers
• Using different techniques to monitor customer behavior across
a number of retail channels, including organization's Web site,
mobile apps, and social media
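Recency, frequency, and monetary values can be computed from a purchase history, as in the sketch below. The customer IDs, dates, amounts, and the function name `rfm` are assumptions; how the three values are then scored or ranked differs between organizations and is not shown.

```python
from datetime import date

# Hypothetical purchase history: (customer ID, purchase date, amount).
purchases = [
    ("C1", date(2023, 1, 5), 40.0),
    ("C1", date(2023, 3, 1), 60.0),
    ("C2", date(2022, 11, 20), 200.0),
]


def rfm(purchases, today):
    # Track the latest purchase date, purchase count, and total spend per customer.
    summary = {}
    for customer_id, when, amount in purchases:
        last, count, total = summary.get(customer_id, (None, 0, 0.0))
        if last is None or when > last:
            last = when
        summary[customer_id] = (last, count + 1, total + amount)
    # Recency = days since last purchase, frequency = count, monetary = total.
    return {
        customer_id: ((today - last).days, count, total)
        for customer_id, (last, count, total) in summary.items()
    }


scores = rfm(purchases, today=date(2023, 3, 31))
```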
Summary
• Components of a DBMS are database engine, data definition, data
manipulation, application generation, and data administration
• Recent trends in database design are data-driven Web sites, natural
language processing, and distributed databases
• A data warehouse is a collection of data from a variety of sources
• Data marts focus on business functions for a specific user group in an
organization
• Industries gain a competitive advantage from big data analytics
Key Terms
• Database management system (DBMS)
• Flat files system
• Data redundancy
• Physical view
• Logical view
• Relational model
• Data dictionary
• Primary key
• Foreign key
• Data retrieval operations
• Query by example (QBE)
• Data-driven website
• Distributed database
• Data warehouse
• Hypercube
• Online analytical processing (OLAP)
• Data-mining analysis
• Data mart
• Big data
• Database marketing
Key Concepts
• The join operation in a relational database combines two tables based on a
common field, which helps avoid the data redundancy found in conventional
“flat files” systems.
• A distributed database stores data on multiple servers, so as to minimize the
effects of computer failure, reduce communication costs for remote users,
and support distributed processing.
• A data warehouse collects data from a variety of sources to support (through
OLAP and data-mining analysis) decision-making applications and generate
business intelligence.
• Data-mining analysis in the data warehouse helps to discover patterns,
relationships, and trends that cannot be found with single databases alone.
• Big data is powerful not only because its volume is big, but also because it
includes a variety of both structured and unstructured data.
Reviewed Exercise