DBMS Reviewer

Data and information are critical resources for organizations that enable competitive advantages. Data refers to raw facts that are recorded and stored, while information is data that has been organized into a meaningful context. There are three key attributes of information - it must be accurate, timely, and relevant. A database is a collection of logically related data designed to meet an organization's information needs. A database management system (DBMS) is software used to define, create, use and maintain a database. Common DBMSs include MySQL.

Data (information) is the backbone and most critical resource of an organization. It enables managers and organizations to gain a competitive edge.

• Data - known facts that can be recorded and stored on computer media. It is also defined as raw facts from which the required information is produced.

• Information - data that have been put into a meaningful and useful context and communicated to a recipient who uses it to make decisions.

Three key attributes of information: it must be accurate, timely, and relevant.

Accuracy: The information is free from errors, and it clearly and accurately reflects the meaning of the data on which it is based.

Timeliness: The recipients receive the information when they need it and within the required time frame.

Relevancy: The information is useful to the persons who receive it.

• Database - a collection of interrelated data stored together with controlled redundancy to serve one or more applications in an optimal way.

- It is also defined as a collection of logically related data stored together that is designed to meet the information requirements of an organization.

Fields - A field is the smallest unit of data that has meaning to its users. It is also called a data item or data element.

Records - A record is a collection of logically related fields. Each field has a fixed number of bytes and a fixed data type.

Files - A file is a collection of related records. Generally, all the records in a file have the same size and record type, but this is not always true.
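The field/record/file hierarchy can be sketched with Python's standard `struct` module. The record layout below (a 4-byte id, a 20-byte name, an 8-byte salary) and the function names are hypothetical and chosen only for illustration.

```python
import struct

# Hypothetical record layout: id (4-byte int), name (20 bytes), salary (8-byte float).
# "<" selects little-endian with no padding, so every record has the same size.
RECORD_FORMAT = "<i20sd"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 4 + 20 + 8 bytes


def pack_record(emp_id: int, name: str, salary: float) -> bytes:
    """Encode one record; the name field is padded with zero bytes up to 20 bytes."""
    return struct.pack(RECORD_FORMAT, emp_id, name.encode("utf-8"), salary)


def unpack_record(raw: bytes) -> tuple:
    """Decode one record back into its three fields."""
    emp_id, name, salary = struct.unpack(RECORD_FORMAT, raw)
    return emp_id, name.rstrip(b"\x00").decode("utf-8"), salary


# A "file" is then a sequence of such records stored back to back.
file_bytes = pack_record(1, "Ana", 52000.0) + pack_record(2, "Ben", 48000.0)

# Fixed-size records allow direct access: record n starts at byte n * RECORD_SIZE.
second = unpack_record(file_bytes[RECORD_SIZE:2 * RECORD_SIZE])
```

Because every record has the same size, the position of any record can be computed without reading the records before it.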

Database Management System (DBMS) - the software package used to define, create, use, and maintain a database. It typically consists of several software modules, each with its own functionality.

MySQL is a well-known open-source DBMS.

The combination of a DBMS and a database is often called a database system.

File-based approach or traditional file system - In the early days of computing, every application stored its data in its own dedicated files.

Disadvantages of Traditional File System

1. Data Redundancy
2. Data Inconsistency
3. Lack of Data Integration
4. Program Dependence
5. Data Dependence
6. Limited Data Sharing
7. Poor Data Control
8. Problem of Security
9. Data Manipulation Capability is Inadequate
10. Needs Excessive Programming

Advantages of Database Approach

1. Controlled redundancy
2. Data consistency
3. Program-data independence
4. Sharing of data
5. Enforcement of standards
6. Improved data integrity
7. Improved security

Advantages of Database Systems (DBMSs)

8. Data access is efficient
9. Conflicting requirements can be balanced
10. Improved backup and recovery facility
11. Minimal program maintenance
12. Data quality is high
13. Good data accessibility and responsiveness
14. Concurrency control
15. Economical to scale
16. Increased programmer productivity

Disadvantages of Database Systems

1. Complexity increases
2. Requirement of more disk space
3. Additional cost of hardware
4. Cost of conversion
5. Need of additional and specialized manpower
6. Need for backup and recovery
7. Organizational conflict
8. More installation and management cost

The database model or database schema - provides the description of the database data at different levels of detail and specifies the various data items, their characteristics, relationships, constraints, storage details, etc.
- It is specified during database design and is not expected to change frequently. It is stored in the catalog, which is the heart of the DBMS.

• Database model - comprises different data models, each describing the data from a different perspective.

Types of Data Model


• Conceptual data model - provides a high-level description of the data items (e.g., supplier, product) with their characteristics (e.g., supplier name, product number) and relationships (e.g., a supplier can supply products).

• Logical data model - a translation or mapping of the conceptual data model toward a specific implementation environment.
- The logical data model can be mapped to an internal data model that represents the data’s physical storage details. The internal data model describes which data are stored where, in what format, and which indexes are provided to speed up retrieval.

• External data model - contains various subsets of the data items in the logical model, also called views, tailored toward the needs of specific applications or groups of users.

The Three Layer Architecture
1. Conceptual/logical layer - contains the conceptual and logical data models. Both focus on the data items, their characteristics, and relationships without bothering too much about the actual physical DBMS implementation.
2. External layer - contains the external data model, which includes views offering a window on a carefully selected part of the logical data model. A view describes the part of the database that a particular application or user group is interested in, hiding the rest of the database.
3. Internal layer - contains the internal data model, which specifies how the data are stored or organized physically.

Ideally, changes in one layer should have little to no impact on the others.
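A minimal sketch of the three layers, using Python's built-in `sqlite3` module with an in-memory database. The `employee` table, the `employee_directory` view, and the index name are hypothetical examples, not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical layer: a hypothetical base table.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee (id, name, dept, salary) VALUES (?, ?, ?, ?)",
    [(1, "Ana", "Sales", 52000.0), (2, "Ben", "IT", 48000.0)],
)

# External layer: a view exposing only the columns a directory application needs.
conn.execute("CREATE VIEW employee_directory AS SELECT name, dept FROM employee")

# Internal layer: an index changes how data are accessed physically,
# not how the queries against the table or the view are written.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

directory_rows = conn.execute(
    "SELECT name, dept FROM employee_directory ORDER BY name"
).fetchall()
view_columns = [d[0] for d in conn.execute("SELECT * FROM employee_directory").description]
```

The application that reads `employee_directory` never sees the `salary` column, and adding or dropping the index does not require changing its query.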

Catalog - the heart of the DBMS.

It contains the data definitions, or metadata, of the database application. It stores the definitions of the views and of the logical and internal data models, and synchronizes these three data models to ensure their consistency.
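As one concrete illustration, SQLite keeps its catalog in a built-in schema table named `sqlite_master`, which can be queried like any other table. The `supplier` table, view, and index below are hypothetical; other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (supnr INTEGER PRIMARY KEY, supname TEXT)")
conn.execute("CREATE VIEW supplier_names AS SELECT supname FROM supplier")
conn.execute("CREATE INDEX idx_supplier_name ON supplier (supname)")

# Each row of sqlite_master describes one schema object (table, view, index, ...).
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# The stored definition (the original DDL text) is also kept as metadata.
view_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'supplier_names'"
).fetchone()[0]
```

Here the table corresponds to the logical model, the view to the external model, and the index to the internal model, all recorded in the same catalog.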

Database Users

• Information Architect – designs the conceptual data model. He/she closely interacts with the business user to make sure the data requirements are fully understood and modeled.

• Database Designer – translates the conceptual data model into a logical and internal data model.

• Database Administrator (DBA) – is responsible for the implementation and monitoring of the database. He/she sets up the database infrastructure and continuously monitors its performance by inspecting key performance indicators such as response times, throughput rates, and storage space consumed.

• Application Developer – develops database applications in a general-purpose programming language such as Java or Python. He/she provides the data requirements, which are then translated by the database designer or DBA into view definitions.

• End users – run these applications to perform specific database operations. They can also directly query the database using interactive querying facilities for reporting purposes.

Database Languages
• Data Definition Language (DDL) – used by the DBA to express the database’s external, logical, and
internal data models. These definitions are stored in the catalog.

• Data Manipulation Language (DML) – used to retrieve, insert, delete, and modify data. DML
statements can be embedded in a general-purpose programming language, or entered interactively
through a front-end querying tool.
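The difference between DDL and DML, and the idea of embedding DML in a general-purpose language, can be sketched with Python's built-in `sqlite3` module. The `product` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure; the definition is stored in the catalog.
conn.execute(
    "CREATE TABLE product (prodnr INTEGER PRIMARY KEY, prodname TEXT NOT NULL, stock INTEGER)"
)

# DML embedded in Python: insert, modify, delete, and retrieve data.
conn.execute("INSERT INTO product (prodnr, prodname, stock) VALUES (?, ?, ?)", (1, "Keyboard", 10))
conn.execute("INSERT INTO product (prodnr, prodname, stock) VALUES (?, ?, ?)", (2, "Mouse", 0))
conn.execute("UPDATE product SET stock = stock + 5 WHERE prodnr = ?", (1,))
conn.execute("DELETE FROM product WHERE stock = 0")

products = conn.execute("SELECT prodnr, prodname, stock FROM product").fetchall()
```

The same statements could be typed interactively into a query tool; only the way they reach the DBMS differs.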

Components of DBMS

1. Connection manager - provides facilities to set up a database connection. A connection can be set up locally or through a network, the latter being more common. The connection manager verifies the logon credentials, such as user name and password, and returns a connection handle.

2. DDL compiler - compiles the data definitions specified in DDL.

3. Query processor - one of the most important parts of a DBMS. It assists in the execution of database queries such as retrieval, insertion, update, and removal of data.

The query processor consists of:

– DML compiler - compiles the data manipulation statements specified in DML.

– Query parser - parses the query into an internal representation format that can then be further evaluated by the system.

– Query rewriter - optimizes the query, independently of the current database state.

– Query optimizer - optimizes the query based upon the current database state. The result of the query optimization procedure is a final execution plan.

– Query executor - takes care of the actual execution of the plan by calling on the storage manager to retrieve the data requested.

4. Storage manager - governs physical file access and as such supervises the correct and efficient storage of data. It consists of a transaction manager, buffer manager, lock manager, and recovery manager.

– Transaction manager - supervises the execution of database transactions.

– Buffer manager - is responsible for managing the buffer memory of the DBMS.

– Lock manager - is an essential component for providing concurrency control, which ensures data integrity at all times.

– Recovery manager - supervises the correct execution of database transactions.
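What "correct execution of database transactions" means in practice can be sketched with Python's built-in `sqlite3` module: either every statement of a transaction takes effect, or none does. The `account` table and the `transfer` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(amount: int) -> bool:
    """Move `amount` from account 1 to account 2 as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit is undone too.
        return False


ok = transfer(30)       # succeeds: both updates are committed
failed = transfer(500)  # fails: the first update is rolled back as well
balances = [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
```

After the failed transfer the balances are the same as before it started, even though the first update had already been executed.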

To facilitate communication with users and other parties, the DBMS provides various user interfaces such as a web-based interface, a stand-alone query language interface, a command-line interface, a forms-based interface, a graphical user interface, a natural language interface, an application programming interface (API), an admin interface, and a network interface.

Categorization Based on Data Model

1. Hierarchical DBMSs were one of the first DBMS types developed, and adopt a tree-like data
model. The DML is procedural and record-oriented. No query processor is included.

2. Network DBMSs use a network data model, which is more flexible than a tree-like data model. One of the most popular types is the CODASYL DBMS, which implements the CODASYL data model.

3. Relational DBMSs (RDBMSs) use the relational data model and are the most popular in the industry. They typically use SQL for both DDL and DML operations. SQL is declarative and set-oriented.

4. Object-oriented DBMSs (OODBMSs) are based on the object-oriented data model, in which an object encapsulates both data and the methods that operate on the data.

5. Object-relational DBMS (ORDBMS), also commonly called an extended relational DBMS (ERDBMS), uses a relational model extended with object-oriented concepts, such as user-defined types, user-defined functions, collections, inheritance, and behavior.

6. XML DBMSs use the XML data model to store data. XML is a data representation standard.

7. NoSQL DBMSs - a realm of new database technologies introduced in the last few years, targeted at storing big and unstructured data.
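The contrast between a procedural, record-oriented DML and a declarative, set-oriented language such as SQL can be sketched as follows. The sample rows and the `employee` table are hypothetical, and the Python loop only imitates the record-at-a-time style; it is not the DML of any real hierarchical DBMS.

```python
import sqlite3

rows = [(1, "Ana", "Sales"), (2, "Ben", "IT"), (3, "Cy", "Sales")]

# Procedural, record-oriented: the program states HOW to find the data,
# visiting one record at a time.
sales_procedural = []
for emp_id, name, dept in rows:
    if dept == "Sales":
        sales_procedural.append(name)

# Declarative, set-oriented: the query states WHAT is wanted;
# the DBMS decides how to retrieve the whole result set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employee (id, name, dept) VALUES (?, ?, ?)", rows)
sales_declarative = [
    r[0] for r in conn.execute("SELECT name FROM employee WHERE dept = 'Sales' ORDER BY id")
]
```

Both approaches return the same names, but only the first one hard-codes the access path in the application.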

Categorization Based on Architecture

• In a centralized DBMS architecture, the data are maintained on a centralized host, e.g., a mainframe system. All queries will then have to be processed by this single host.

• In a client–server DBMS architecture, active clients request services from passive servers. A fat client variant stores more processing functionality on the client, whereas a fat server variant puts more on the server.

• The n-tier DBMS architecture is a straightforward extension of the client–server architecture.

• In a cloud DBMS architecture, the DBMS and database are hosted by a third-party cloud provider.

• A federated DBMS is a DBMS that provides a uniform interface to multiple underlying data sources such as other DBMSs, file systems, document management systems, etc.

• An in-memory DBMS stores all data in internal memory instead of slower external storage such as disk-based storage.

Categorization Based on Usage

• On-line transaction processing (OLTP) DBMSs focus on managing operational or transactional data.

• On-line analytical processing (OLAP) DBMSs focus on using operational data for tactical or strategic decision making.

• Multimedia DBMSs allow for the storage of multimedia data such as text, images, audio, video,
3D games, CAD designs, etc.

• Spatial DBMS supports the storage and querying of spatial data.

• Sensor DBMS manages sensor data such as biometric data obtained from wearables, or
telematics data which continuously record driving behavior.

• Mobile DBMSs are the DBMSs running on smartphones, tablets, and other mobile devices.

• Open-source DBMSs are DBMSs for which the code is publicly available and can be extended by
anyone.

____________________________________________________________________________________

• The entity-relationship (E-R) model was introduced by Chen in 1976. He described the main constructs of the E-R model, i.e., entities, relationships, and their associated attributes.

• Entity - an “object” or “thing” in the real world. An object may be any person, place, event, etc.

• Attributes - the characteristics of an entity.

• Value - the information or data stored in an attribute of an entity.

• Entity set - all the entities having the same attributes.

• Domain or value set - the set of all possible values of an attribute.

Types of Attributes

• Simple and Composite Attributes

– Simple Attributes are those which cannot be divided into subparts.

– Composite Attributes are those which can be divided into subparts

• Single Valued and Multi-valued Attributes

– Single Valued Attribute : An attribute having only a single value for a particular entity is known as a single-valued attribute.

– Multi-Valued Attribute : An attribute having more than one possible value for a particular entity is known as a multi-valued attribute.

• Derived Attributes and Stored Attributes

– Derived Attribute : An attribute that can be derived from other known attributes is known as a derived attribute.

– Stored Attribute : An attribute that cannot be derived from other known attributes is known as a stored attribute.
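The attribute types above can be sketched with Python dataclasses. The `Student` and `Address` classes and their fields are hypothetical examples.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: it can be divided into subparts."""
    street: str
    city: str


@dataclass
class Student:
    student_id: int       # simple, single-valued, stored
    address: Address      # composite
    birth_date: date      # stored
    phone_numbers: list[str] = field(default_factory=list)  # multi-valued

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from the stored birth_date, not stored itself."""
        had_birthday = (today.month, today.day) >= (self.birth_date.month, self.birth_date.day)
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


student = Student(1, Address("1 Main St", "Manila"), date(2000, 6, 15), ["555-0100", "555-0101"])
age_before_birthday = student.age_on(date(2024, 6, 14))
age_on_birthday = student.age_on(date(2024, 6, 15))
```

Storing the birth date and deriving the age avoids having to update a stored age every year.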

Relationship Sets

• Relationship : A relationship is the association among several entities. It connects different entities through a meaningful relation.

• Relationship Set : A relationship set is a set of relationships of the same type.

Degree of Relationship Sets

• Binary Relationship Set - A relationship set in which only two entity sets are involved is known
as binary relationship set.

• Ternary Relationship Set - A relationship set in which three entity sets are involved is known as
ternary relationship set or a relationship set having degree three.

• Role : The function that an entity plays in a relationship set is called that entity’s role.

• Recursive Relationship Set : When the same entity set participates in a relationship set more than once, with a different role each time, the relationship set is known as a recursive relationship set.

• Mapping Cardinalities (Cardinality Ratios)


–One to One (1 : 1)

–One to Many (1 : N)

–Many to One (N : 1)

–Many to Many (M : N)
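The four cardinality ratios can be sketched by classifying a small set of (a, b) pairs taken from a binary relationship set. The `cardinality` function is a hypothetical helper; note that it only describes the sample data it is given, whereas the real cardinality ratio is a design decision about all possible data.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify a binary relationship set given as (a, b) pairs."""
    a_to_b, b_to_a = defaultdict(set), defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    a_many = any(len(bs) > 1 for bs in a_to_b.values())    # some a is linked to several b
    b_many = any(len(as_) > 1 for as_ in b_to_a.values())  # some b is linked to several a
    if a_many and b_many:
        return "M:N"
    if a_many:
        return "1:N"
    if b_many:
        return "N:1"
    return "1:1"


person_passport = cardinality([("p1", "passport1"), ("p2", "passport2")])
dept_employee = cardinality([("dept1", "emp1"), ("dept1", "emp2")])
employee_dept = cardinality([("emp1", "dept1"), ("emp2", "dept1")])
student_module = cardinality([("s1", "m1"), ("s1", "m2"), ("s2", "m1")])
```

Reading the pairs in the opposite direction turns a 1:N relationship into an N:1 relationship, which is why both ratios appear in the list.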

• A key is an attribute or set of attributes that is used to identify data in entity sets.

• The attributes that are used as a key are known as key attributes. All the others are known as non-key attributes.

• Super Key : A super key is a set of one or more attributes that can identify data uniquely.

• Candidate Key : A minimal super key is known as a candidate key. To test this, consider a super key and take all of its proper subsets. If none of the proper subsets is a super key, the super key is a candidate key.

• Primary Key : The candidate key chosen to identify data uniquely is known as the primary key. An entity set can have more than one candidate key but only one primary key.

• Alternate Keys : All the candidate keys other than the primary key are known as alternate keys.

• Secondary Key : An attribute or set of attributes that does not identify data uniquely but identifies a group of data is known as a secondary key.

• Foreign Key : A foreign key is an attribute (or set of attributes) in one entity set that is the primary key of another entity set.
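The relation between super keys and candidate keys can be sketched by testing attribute sets against sample data. The functions and the `students` rows are hypothetical; a key is a property of the schema, so sample data can only show that an attribute set is not a key, never prove that it always will be one.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for `attrs`."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Return the minimal super keys, checking smaller attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing an already-found key is a super key but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


students = [
    {"id": 1, "email": "ana@example.com", "name": "Ana", "course": "CS"},
    {"id": 2, "email": "ben@example.com", "name": "Ben", "course": "CS"},
    {"id": 3, "email": "ana2@example.com", "name": "Ana", "course": "CS"},
]
keys = candidate_keys(students, ["id", "email", "name", "course"])
```

In this sample, `("id", "name")` is a super key but not a candidate key, because its proper subset `("id",)` is already a super key. If `id` is chosen as the primary key, `email` becomes an alternate key.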

_____________________________________________________________________________

• E/R Modelling is used for conceptual design

– Entities - objects or items of interest

– Attributes - facts about, or properties of, an entity

– Relationships - links between entities

• E/R Models are often represented as E/R diagrams that

– Give a conceptual view of the database

• Entities represent objects or things of interest

– Physical things like students, lecturers, employees, products

– More abstract things like modules, orders, courses, projects

– An entity is usually drawn as a box with rounded corners

– The box is labelled with the name of the class of objects represented by that entity

• Attributes are facts, aspects, properties, or details about an entity

– Students have IDs, names, courses, addresses, …

– Modules have codes, titles, credit weights, levels, …

– Attributes may be drawn as ovals

• A relationship is an association between two or more entities

MODULE 4

• Data management - entails the proper management of data and the corresponding data definitions or metadata.

- It aims at ensuring that (meta-)data are of good quality and thus a key resource for effective and efficient managerial decision-making.

• Metamodel - a data model for metadata. A metamodel determines the type of metadata that can be stored.

• Data quality (DQ) - often defined as “fitness for use”, which implies the relative nature of the concept.

- DQ is a multidimensional concept in which each dimension represents a single aspect or construct, comprising both objective and subjective perspectives.

• DQ framework - categorizes the different dimensions of data quality.

• The following are common causes of poor DQ:

– Multiple data sources: multiple sources with the same data may produce duplicates – a
problem of consistency.

– Subjective judgment in data production: data production using human judgment (e.g.,
opinions) can cause the production of biased information – a problem of objectivity.

– Limited computing resources: lack of sufficient computing resources and/or digitalization may limit the accessibility of relevant data – a problem of accessibility.

– Volume of data: large volumes of stored data make it difficult to access needed
information in a reasonable time – a problem of accessibility.

– Changing data needs: data requirements change on an ongoing basis due to new
company strategies or the introduction of new technologies – a problem of relevance.
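The first cause, duplicates arising from multiple data sources, can be sketched as follows. The two source lists, the field names, and the choice of a normalized e-mail address as the matching rule are assumptions made for illustration; real duplicate detection usually needs more elaborate matching.

```python
def normalize(email: str) -> str:
    """Hypothetical matching rule: compare e-mail addresses ignoring case and spaces."""
    return email.strip().lower()


# Two hypothetical sources that describe partly the same customers.
crm = [
    {"name": "Ana Cruz", "email": "Ana@Example.com"},
    {"name": "Ben Tan", "email": "ben@example.com"},
]
billing = [
    {"name": "A. Cruz", "email": "ana@example.com "},
    {"name": "Cy Lim", "email": "cy@example.com"},
]

merged = {}
duplicates = []
for record in crm + billing:
    key = normalize(record["email"])
    if key in merged:
        # Same customer seen in a second source: report it instead of storing it twice.
        duplicates.append((merged[key]["name"], record["name"]))
    else:
        merged[key] = record
```

The two sources spell the same customer's name differently, which is exactly the kind of inconsistency that data stewards have to resolve.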

The aim of data governance is to set up a company-wide, controlled, and supported approach toward DQ, accompanied by DQ management processes.

Examples of frameworks and standards used for data governance:

• Total Data Quality Management (TDQM), Total Quality Management (TQM), Capability Maturity Model Integration (CMMI), ISO 9000, Control Objectives for Information and Related Technology (CobiT), Data Management Body of Knowledge (DMBOK), Information Technology Infrastructure Library (ITIL), and Six Sigma.

Roles in Data Management

• Information architect (also called information analyst) - designs the conceptual data model, preferably in dialogue with the business users.

• Database designer - translates the conceptual data model into a logical and internal data model.

• Data owner - every data field in every database in the organization should be owned by a data owner, who has the authority to ultimately decide on the access to, and usage of, the data.

• Data stewards - the DQ experts in charge of ensuring the quality of both the actual business data and the corresponding metadata.

• Database administrator (DBA) - is responsible for the implementation and monitoring of the database.

• Data scientist - a relatively new job profile within the context of data management.
