
Lecture06 MIS

Chapter 6 discusses the challenges of managing data in traditional file environments, including data redundancy, program-data dependence, inflexibility, poor security, and lack of data sharing. It highlights the advantages of database management systems (DBMS), particularly relational DBMS, in centralizing data, reducing redundancy, and facilitating ad hoc queries. Additionally, it covers the significance of big data, business intelligence infrastructure, and analytical tools for improving decision-making and business performance.

Uploaded by

haneenelasawy335
Copyright
© All Rights Reserved

Chapter 6: What are the problems of managing data resources in a traditional file environment?


• An effective information system provides users with
accurate, timely, and relevant information.
• Accurate information is free of errors.
• Information is timely when it is available to decision
makers when it is needed.
• Information is relevant when it is useful and
appropriate for the types of work and decisions that
require it.
• Many businesses don’t have timely, accurate, or relevant
information because the data in their information systems have
been poorly organized and maintained. That’s why data
management is so essential.
• A computer system organizes data in a hierarchy that starts with
bits and bytes and progresses to fields, records, files, and
databases.
• A bit represents the smallest unit of data a computer can
handle.
• A group of bits, called a byte, represents a single character,
which can be a letter, a number, or another symbol.
• A grouping of characters into a word, a group of words, or a
complete number (such as a person’s name or age) is called a
field.
• A group of related fields, such as the student’s name, the course
taken, the date, and the grade, comprises a record; a group of
records of the same type is called a file.
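The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the student name, course, and grades are made-up values, and ASCII is assumed so that each character occupies exactly one byte.

```python
# Illustrative sketch of the data hierarchy; all values are hypothetical.
field = "Ada Lovelace"                     # a field: characters grouped into a name
name_bytes = field.encode("ascii")         # in ASCII, each character is one byte
first_byte_bits = format(name_bytes[0], "08b")  # each byte is a group of 8 bits

record = {                                 # a record: a group of related fields
    "student_name": field,
    "course": "MIS 101",
    "date": "2024-01-15",
    "grade": "A",
}

course_file = [                            # a file: records of the same type
    record,
    {"student_name": "Alan Turing", "course": "MIS 101",
     "date": "2024-01-15", "grade": "B"},
]

print(first_byte_bits)                     # bits that encode the letter "A"
print(len(name_bytes), "bytes in the name field")
print(len(course_file), "records in the file")
```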
Problems with the Traditional File
Environment
• data redundancy and inconsistency,
• program-data dependence,
• inflexibility,
• poor data security,
• an inability to share data among applications.
Data Redundancy and Inconsistency
• Data redundancy is the presence of duplicate
data in multiple data files so that the same data
are stored in more than one place or location.
• Data redundancy occurs when different groups in
an organization independently collect the same
piece of data and store it independently of each
other.
• Data redundancy wastes storage resources and
also leads to data inconsistency, where the same
attribute may have different values.
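A minimal sketch of how redundancy turns into inconsistency, assuming two hypothetical departmental files (sales and billing) that each keep their own copy of a customer's address. The file names, customer ID, and addresses are invented for illustration.

```python
# Hypothetical data: two departments store the same customer independently.
sales_file = {"C001": {"name": "J. Smith", "address": "12 Oak St"}}
billing_file = {"C001": {"name": "J. Smith", "address": "98 Elm Ave"}}


def find_inconsistencies(file_a, file_b):
    """Return (key, attribute, value_a, value_b) for each attribute that differs."""
    conflicts = []
    for key in sorted(file_a.keys() & file_b.keys()):
        for attr in sorted(file_a[key].keys() & file_b[key].keys()):
            if file_a[key][attr] != file_b[key][attr]:
                conflicts.append((key, attr, file_a[key][attr], file_b[key][attr]))
    return conflicts


conflicts = find_inconsistencies(sales_file, billing_file)
for key, attr, a, b in conflicts:
    print(f"{key}: {attr} is {a!r} in sales but {b!r} in billing")
```

Because the address is stored twice and updated separately, nothing forces the two copies to agree.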
Program-Data Dependence
• Program-data dependence refers to the coupling of data stored
in files and the specific programs required to update and
maintain those files, such that changes in programs require
changes to the data.
• Every traditional computer program has to describe
the location and nature of the data with which it
works.
• In a traditional file environment, any change in a
software program could require a change in the
data accessed by that program.
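One way to picture program-data dependence is a program that hard-codes the position of each field in a fixed-width file. The layout below (name in columns 0-9, ZIP code in columns 10-14) is an assumption made up for this sketch.

```python
# Hypothetical fixed-width layout baked into the program:
# columns 0-9 hold the name, columns 10-14 hold the ZIP code.
def read_zip(line):
    return line[10:15]


old_record = "Smith".ljust(10) + "02139"
print(repr(read_zip(old_record)))    # works while the layout is unchanged

# Another application widens the name field to 15 characters.
# The data file changes, and this program now reads the wrong columns.
new_record = "Smith".ljust(15) + "02139"
print(repr(read_zip(new_record)))    # no longer returns the ZIP code
```

The program and the file layout have to be changed together, which is the coupling the bullets describe.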
Lack of Flexibility
• A traditional file system can deliver routine scheduled
reports after extensive programming efforts, but it
cannot deliver ad hoc reports or respond to
unanticipated information requirements in a timely
fashion.
• The information required by ad hoc requests is
somewhere in the system but may be too expensive
to retrieve.
• Several programmers might have to work for weeks to
put together the required data items in a new file.
Poor Security & Lack of Data Sharing
and Availability
• Because there is little control or management of data, access
to and dissemination of information may be out of control.
Management may have no way of knowing who is accessing or
even making changes to the organization’s data.
• Because pieces of information in different files and different
parts of the organization cannot be related to one another, it is
virtually impossible for information to be shared or accessed in
a timely manner.
• Information cannot flow freely across different functional
areas or different parts of the organization.
• If users find different values of the same piece of information
in two different systems, they may not want to use these
systems because they cannot trust the accuracy of their data.
What are the major capabilities of database
management systems (DBMS), and why is a
relational DBMS so powerful?
• A database is a collection of data organized to serve many
applications efficiently by centralizing the data and
controlling redundant data.
• Rather than storing data in separate files for each
application, data appear to users as being stored in only
one location. A single database services multiple
applications.
Database Management Systems (DBMS)
• A DBMS is software that permits an organization to centralize data, manage them
efficiently, and provide access to the stored data by application programs.
• The DBMS acts as an interface between application programs and the physical
data files. When the application program calls for a data item, such as gross pay,
the DBMS finds this item in the database and presents it to the application
program.
• A DBMS reduces data redundancy and inconsistency by minimizing isolated files
in which the same data are repeated. The DBMS may not enable the
organization to eliminate data redundancy entirely, but it can help control
redundancy.
• Even if the organization maintains some redundant data, using a DBMS
eliminates data inconsistency because the DBMS can help the organization
ensure that every occurrence of redundant data has the same values.
• Users and programmers can perform ad hoc queries of the database for many
simple applications without having to write complicated programs.
• Data sharing throughout the organization is easier because the data are
presented to users as being in a single location rather than fragmented in many
different systems and files.
Operations of a Relational DBMS
1- The select operation creates a subset consisting of all
records (rows) in the table that meet stated criteria.
2- The join operation combines relational tables to provide the
user with more information than is available in individual
tables.
3- The project operation creates a subset consisting of columns
in a table, permitting the user to create new tables that contain
only the information required.
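The three operations can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the part and supplier tables, their columns, and all values are hypothetical, and SQLite stands in for whatever relational DBMS an organization actually uses.

```python
import sqlite3

# In-memory SQLite database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (part_id INTEGER PRIMARY KEY, part_name TEXT,
                       supplier_id INTEGER);
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO part VALUES (137, 'Door latch', 1), (150, 'Door seal', 2),
                            (152, 'Compressor', 1);
""")

# select: keep only the rows that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM part WHERE part_id IN (137, 150) ORDER BY part_id"
).fetchall()

# project: keep only the columns that are required.
projected = conn.execute(
    "SELECT part_name FROM part ORDER BY part_id"
).fetchall()

# join: combine two tables through their shared supplier_id column.
joined = conn.execute("""
    SELECT part.part_name, supplier.supplier_name
    FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_id
""").fetchall()

print(selected)
print(projected)
print(joined)
```

In SQL, the WHERE clause plays the role of select, the column list after SELECT plays the role of project, and JOIN ... ON performs the join.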
Capabilities of Database Management
Systems
• Data definition language: specifies the structure of the content of the
database. It is used to create database tables and to define the
characteristics of the fields in each table.
• Data dictionary: an automated or manual file that stores definitions of data
elements and their characteristics.
• Data manipulation language: used to add, change, delete, and retrieve the
data in the database. This language contains commands that permit end
users and programming specialists to extract data from the database to
satisfy information requests and develop applications. The most prominent
data manipulation language today is Structured Query Language, or SQL.
• The design of a database includes both a conceptual design and a physical
design. The conceptual, or logical, design of a database is an abstract model
of the database from a business perspective, whereas the physical design
shows how the database is actually arranged on direct-access storage devices.
Normalization and entity-relationship diagrams are important in conceptual
design.
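A short sketch of the three capabilities, again using Python's sqlite3 module as a stand-in DBMS. The employee table and its values are hypothetical. SQLite has no separate data dictionary product; its PRAGMA table_info command is used here only as a rough analogue that reports the stored field definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table and define the characteristics of its fields.
conn.execute("""
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        gross_pay REAL
    )
""")

# Data-dictionary-style lookup: ask the DBMS for the stored field definitions.
# Each row is (cid, name, type, notnull, dflt_value, pk); keep name and type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

# Data manipulation: add, change, retrieve, and delete data.
conn.execute("INSERT INTO employee VALUES (1, 'R. Diaz', 4200.0)")
conn.execute("UPDATE employee SET gross_pay = 4500.0 WHERE emp_id = 1")
pay = conn.execute("SELECT gross_pay FROM employee WHERE emp_id = 1").fetchone()[0]
conn.execute("DELETE FROM employee WHERE emp_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

print(columns)
print(pay, remaining)
```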
Non-relational Databases and
Databases in the Cloud
• Non-relational database management systems use a more flexible
data model and are designed for managing large data sets across
many distributed machines and for easily scaling up or down.
• They are useful for accelerating simple queries against large volumes
of structured and unstructured data, including web, social media,
graphics, and other forms of data that are difficult to analyze with
traditional SQL-based tools.
• There are several different kinds of NoSQL databases, each with its
own technical features and behavior. Oracle NoSQL Database is one
example, as is Amazon’s SimpleDB, one of the Amazon Web Services
that run in the cloud. SimpleDB provides a simple web services
interface to create and store multiple data sets, query data easily, and
return the results. There is no need to predefine a formal database
structure or change that definition if new data are added later.
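The idea of storing data without a predefined structure can be illustrated with plain Python dictionaries. This is an in-memory stand-in written for this sketch; it is not the SimpleDB or Oracle NoSQL API, and the items and the query helper are hypothetical.

```python
# In-memory stand-in for a schema-less store (not a real NoSQL client).
store = []

# Items do not have to share the same attributes, and no table
# definition has to be declared or altered before adding new ones.
store.append({"id": 1, "type": "tweet", "text": "Great service!"})
store.append({"id": 2, "type": "photo", "url": "img/42.png",
              "tags": ["store", "launch"]})


def query(items, **criteria):
    """Return the items whose attributes match every given criterion."""
    return [item for item in items
            if all(item.get(key) == value for key, value in criteria.items())]


tweets = query(store, type="tweet")
print(tweets)
```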
Cloud Databases
• Amazon and other cloud computing vendors provide relational database
services as well. Amazon Relational Database Service (Amazon RDS) offers
MySQL, SQL Server, Oracle Database, PostgreSQL, MariaDB, or Amazon
Aurora DB (compatible with MySQL) as database engines. Pricing is based
on usage.
• Oracle has its own Database Cloud Services using its relational Oracle
Database, and Microsoft Azure SQL Database is a cloud-based
relational database service based on Microsoft’s SQL Server DBMS.
• Cloud-based data management services have special appeal for web-
focused start-ups or small to medium-sized businesses seeking database
capabilities at a lower price than in-house database products.
• In addition to public cloud-based data management services, companies
now have the option of using databases in private clouds. For example,
Sabre Holdings, the world’s largest software as a service (SaaS) provider for
the aviation industry, has a private database cloud that supports more than
100 projects and 700 users.
What are the principal tools and technologies
for accessing information from databases to
improve business performance and decision making?
• Big data: describes semi-structured and unstructured data sets with
volumes so huge that they are beyond the ability of a typical DBMS to
capture, store, and analyze.
• Examples of semi-structured and unstructured data include data from
web traffic, e-mail messages, and social media content (tweets, status
messages), as well as machine-generated data from sensors (used
in smart meters, manufacturing sensors, and electrical meters) or
from electronic trading systems.
• Big data doesn’t refer to any specific quantity but usually refers to
data in the petabyte and exabyte range: in other words, billions
to trillions of records, all from different sources.
• Businesses are interested in big data because they can reveal
more patterns and interesting relationships than smaller data sets,
with the potential to provide new insights into customer behavior,
weather patterns, financial market activity, or other phenomena.
Business Intelligence Infrastructure
• A contemporary infrastructure for business
intelligence has an array of tools for obtaining
useful information from all the different types of
data used by businesses today, including semi-
structured and unstructured big data in vast
quantities.
• These capabilities include: data warehouses and
data marts, Hadoop, in-memory computing, and
analytical platforms. Some of these capabilities
are available as cloud services.
Data Warehouses and Data Marts
• A data warehouse is a database that stores current and
historical data of potential interest to decision makers
throughout the company.
• Data from internal systems are combined with data from external
sources and transformed by correcting inaccurate and incomplete data
and restructuring the data for management reporting and
analysis before being loaded into the data warehouse.
• A data warehouse system also provides a range of ad hoc
and standardized query tools, analytical tools, and graphical
reporting facilities.
• A data mart is a subset of a data warehouse in which a
summarized or highly focused portion of the organization’s
data is placed in a separate database for a specific
population of users.
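The flow from source data to warehouse to data mart can be sketched as follows. Everything here is assumed for illustration: the sales rows, the cleaning rules (normalizing region names and skipping rows with a missing amount), and the choice of a regional data mart.

```python
# Hypothetical source rows from internal systems and an external provider.
internal_sales = [
    {"region": "east", "amount": "100.0"},
    {"region": "WEST", "amount": "250.5"},
    {"region": "east", "amount": None},     # incomplete row
]
external_sales = [{"region": "West ", "amount": "40"}]


def transform(rows):
    """Clean and restructure rows before loading (assumed rules for this sketch)."""
    clean = []
    for row in rows:
        if row["amount"] is None:           # skip incomplete data
            continue
        clean.append({
            "region": row["region"].strip().title(),   # normalize spelling
            "amount": float(row["amount"]),            # text to number
        })
    return clean


# Load: the warehouse holds the combined, cleaned data.
warehouse = transform(internal_sales) + transform(external_sales)

# A data mart is a focused subset for one group of users.
west_mart = [row for row in warehouse if row["region"] == "West"]
west_total = sum(row["amount"] for row in west_mart)

print(len(warehouse), "rows in the warehouse")
print(west_total, "total for the West data mart")
```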
Hadoop
• Organizations use Hadoop to handle unstructured and semi-structured
data in vast quantities, as well as structured data.
• It is an open source software framework managed by the
Apache Software Foundation that enables distributed parallel
processing of huge amounts of data across inexpensive
computers.
1- It breaks a big data problem down into sub-problems.
2- It distributes them among up to thousands of inexpensive
computer processing nodes.
3- It then combines the results into a smaller data set that is easier
to analyze.
• Examples: You’ve probably used services that run on Hadoop to find
the best airfare on the Internet, get directions to a restaurant, do a
search on Google, or connect with a friend on Facebook.
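The three steps (split, distribute, combine) can be imitated on a single machine with a word count. This sketch uses only the Python standard library and does not call Hadoop; the function names and sample lines are hypothetical, and the chunks are processed one after another here, whereas a real cluster would process them on separate nodes in parallel.

```python
from collections import Counter


def split_into_chunks(lines, n_chunks):
    """Step 1: break the problem into sub-problems."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def count_words(chunk):
    """Step 2: the work each node would do on its own chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def combine(partials):
    """Step 3: merge the partial results into one smaller data set."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = ["big data big insight", "data beats opinion", "big clusters cheap nodes"]
chunks = split_into_chunks(lines, 2)
partials = [count_words(chunk) for chunk in chunks]   # sequential stand-in for nodes
totals = combine(partials)

print(totals.most_common(2))
```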
In-Memory Computing
• In-memory computing relies primarily on a computer’s main
memory (RAM) for data storage.
• Conventional DBMS use disk storage systems.
• Users access data stored in system primary memory,
thereby eliminating bottlenecks from retrieving and
reading data in a traditional, disk-based database and
dramatically shortening query response times.
• Complex business calculations that used to take hours
or days are able to be completed within seconds, and
this can even be accomplished using handheld devices.
Analytic Platforms
• Analytic platforms also include in-memory
systems and NoSQL non-relational database
management systems.
• Analytic platforms are now available as cloud
services.
Analytical Tools: Relationships,
Patterns, Trends
• Once data have been captured and organized
using the business intelligence technologies,
they are available for further analysis using
software for database querying and reporting,
multidimensional data analysis (OLAP), and
data mining.
• Online Analytical Processing (OLAP): enables
users to view the same data in different ways
using multiple dimensions.
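Viewing the same data along different dimensions can be sketched with a small roll-up function. The sales facts, the dimension names (product, region, month), and the rollup helper are all hypothetical; real OLAP tools do this over far larger data cubes.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure (units).
sales = [
    {"product": "Nuts",  "region": "East", "month": "Jan", "units": 50},
    {"product": "Nuts",  "region": "West", "month": "Jan", "units": 30},
    {"product": "Bolts", "region": "East", "month": "Feb", "units": 20},
    {"product": "Nuts",  "region": "East", "month": "Feb", "units": 10},
]


def rollup(facts, *dims):
    """Total the units for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[d] for d in dims)] += fact["units"]
    return dict(totals)


# The same facts, viewed two different ways.
by_product = rollup(sales, "product")
by_region_month = rollup(sales, "region", "month")

print(by_product)
print(by_region_month)
```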
Data mining
• Data mining is more discovery-driven than OLAP. Data
mining provides insights into corporate data that
cannot be obtained with OLAP by finding hidden
patterns and relationships in large databases and
inferring rules from them to predict future
behavior.
• The patterns and rules are used to guide decision
making and forecast the effect of those decisions.
• The types of information obtainable from data
mining include associations, sequences,
classifications, clusters, and forecasts.
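As a small illustration of one of these types, associations, the sketch below counts which items appear together in the same purchase. The baskets are invented, and the confidence function is a simplified version of the measure used in association-rule mining, not the output of any particular data mining product.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"bread", "milk"},
]

# Count how often each item, and each pair of items, occurs.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1


def confidence(a, b):
    """Share of baskets containing item a that also contain item b."""
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / item_counts[a]


print(pair_counts.most_common(2))
print(confidence("cola", "chips"))   # every cola basket also has chips
```

A rule such as "customers who buy cola also buy chips" is the kind of association that can then guide decisions like product placement.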
