CS 117
Applications of ICT
Database Systems
Some basic concepts
• Data
– Structured
• Arithmetic and logical operations can be directly applied
• E.g., numeric data
– Unstructured
• Arithmetic and logical operations cannot be directly applied
• E.g., audio, images, video data
• Information
• Metadata
– Data about data
• Information Systems
– Database Part
– Application Part
• E.g.,
– Management Information Systems
– Geographical Information Systems
Database (Definition)
In the broadest sense, a database is anything that stores
data. A phone book, for instance, could be considered a
database as it stores related pieces of information such as
name and phone number.
However, in the world of computers, a database
usually refers to a collection of related pieces of
information stored electronically. Aside from the ability to
store data, a database also provides a way for other
computer programs to quickly retrieve and update desired
pieces of data.
Reference:
http://www.mariosalexandrou.com/definition/database.asp
Database (Definition)
A database is an application that manages
data and allows fast storage and retrieval
of that data.
Reference:
http://cplus.about.com/od/glossar1/g/databasedefn.htm
Database (Definition)
A database is a collection
of information that is organized so that it
can easily be accessed, managed, and
updated.
Reference:
http://searchsqlserver.techtarget.com/sDefinition/0,,sid87_gci211895,00.html
Definitions (By Hoffer)
• Database (Concluded): organized collection of
logically related data
• Metadata: descriptions of the properties or
characteristics of the data, including data
types, field sizes, allowable values, and
data context
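As a minimal sketch of what metadata looks like in practice, the Python snippet below uses the standard-library sqlite3 module to create a table and then asks the database to describe it. The table name student and its columns are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        name   TEXT NOT NULL,
        course TEXT NOT NULL,
        gpa    REAL CHECK (gpa BETWEEN 0.0 AND 4.0)
    )
    """
)

# PRAGMA table_info returns one row of metadata per column:
# (cid, name, type, notnull, dflt_value, pk)
metadata = conn.execute("PRAGMA table_info(student)").fetchall()
for cid, name, col_type, notnull, default, pk in metadata:
    print(f"{name}: type={col_type}, required={bool(notnull)}")
```

The rows printed here describe the data (column names, data types, whether a value is required) rather than holding any student data themselves.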
Why Databases?
• Why not store everything in flat files? Use
the file system of the OS: cheap and simple…
Name, Course, Grade
John Smith, CS112, B
Mike Stonebraker, CS234, A
Jim Gray, CS560, A
John Smith, CS560, B+
…………………
• Yes, but this approach has many problems…
Problem 1
• Data Organization
– redundancy and inconsistency
• Multiple file formats, duplication of information in different files
Name, Course, Email, Grade
John Smith, [email protected], CS112, B
Mike Stonebraker, [email protected], CS234, A
Jim Gray, CS560, [email protected], A
John Smith, CS560, [email protected], B+
Why is this a problem?
• Wasted space
• Potential inconsistencies (multiple formats, John Smith vs Smith J.)
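The inconsistency risk can be sketched in a few lines of Python. The records below mimic the flat file above, but the names' email addresses are made-up illustration data, and find_inconsistent_emails is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Flat-file style records: the student's email is repeated on every row.
# All addresses here are made-up illustration data.
rows = [
    ("John Smith", "CS112", "jsmith@example.edu", "B"),
    ("Jim Gray", "CS560", "jgray@example.edu", "A"),
    ("John Smith", "CS560", "john.smith@example.edu", "B+"),  # updated in one row only
]

def find_inconsistent_emails(records):
    """Return {name: sorted emails} for names that appear with more than one email."""
    emails = defaultdict(set)
    for name, _course, email, _grade in records:
        emails[name].add(email)
    return {name: sorted(found) for name, found in emails.items() if len(found) > 1}

conflicts = find_inconsistent_emails(rows)
print(conflicts)
```

Because the email is stored once per course row, an update applied to only one row leaves the file contradicting itself, and nothing in the file system detects it.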
Problem 2
• Data retrieval:
– Find the students registered for CS117
– Find the students with GPA > 3.5
For every query we need to write a program!
• We need the retrieval to be:
– Easy to write
– Execute efficiently
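For contrast, here is a rough sketch of how the two retrieval tasks above look when the data sits in a database: each one becomes a single declarative query. It uses Python's standard-library sqlite3 module; the enrollment table, its columns, and the sample rows are assumptions made for illustration (storing a GPA on each enrollment row is a simplification).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and sample data, for illustration only.
conn.execute("CREATE TABLE enrollment (name TEXT, course TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [
        ("Student A", "CS117", 3.8),
        ("Student B", "CS117", 3.1),
        ("Student C", "CS234", 3.9),
    ],
)

# Find the students registered for CS117.
cs117 = [r[0] for r in conn.execute(
    "SELECT name FROM enrollment WHERE course = ? ORDER BY name", ("CS117",)
)]

# Find the students with GPA > 3.5.
high_gpa = [r[0] for r in conn.execute(
    "SELECT name FROM enrollment WHERE gpa > 3.5 ORDER BY name"
)]
print(cs117, high_gpa)
```

The query states what is wanted; how to scan or index the data efficiently is left to the database.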
Problem 3
• Data Integrity
– No support for sharing:
• Nothing prevents conflicting simultaneous modifications
– No recovery mechanism for system crashes
– No means of preventing data entry errors (checks
must be hard-coded in the programs)
– Security problems
• Database systems offer solutions to all the above
problems
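Two of these points can be sketched with Python's standard-library sqlite3 module: a declared constraint rejects a bad entry without any hard-coded check in the program, and a transaction ensures that a failed batch leaves no partial update behind. The grade table, the list of allowed grades, and the record_grades helper are assumptions for illustration; sharing and security are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The allowed grade values below are an assumed example, not a real grading scheme.
conn.execute(
    """
    CREATE TABLE grade (
        name   TEXT NOT NULL,
        course TEXT NOT NULL,
        grade  TEXT NOT NULL CHECK (grade IN ('A', 'B+', 'B', 'C', 'D', 'F')),
        PRIMARY KEY (name, course)
    )
    """
)

def record_grades(connection, entries):
    """Insert all entries as one transaction; return True on success."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.executemany("INSERT INTO grade VALUES (?, ?, ?)", entries)
        return True
    except sqlite3.IntegrityError:
        return False

ok = record_grades(conn, [("Student A", "CS112", "B")])
# The second batch contains an invalid grade, so the whole batch is rejected.
bad = record_grades(conn, [("Student B", "CS234", "A"), ("Student C", "CS560", "Z")])
count = conn.execute("SELECT COUNT(*) FROM grade").fetchone()[0]
print(ok, bad, count)
```

Only the first batch is stored: the valid row in the second batch is rolled back together with the invalid one.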
The DATABASE Approach
• Central repository of shared data
• Data is managed by a controlling
agent
• Stored in a standardized, convenient
form
Requires a Database Management System (DBMS)
Database Management System
• A software system that is used to create, maintain, and provide
controlled access to user databases
Figure: the Order Filing System, Invoicing System, and Payroll System
all access a central database through the DBMS. The central database
contains employee, order, inventory, pricing, and customer data.
DBMS manages data resources like an operating system manages
hardware resources
Components of the Database Environment
• CASE Tools–computer-aided software engineering
• Repository–centralized storehouse of metadata
• Database Management System (DBMS) –software for
managing the database
• Database–storehouse of the data
• Application Programs–software using the data
• User Interface–text and graphical displays to users
• Data/Database Administrators–personnel responsible
for maintaining the database
• System Developers–personnel responsible for
designing databases and software
• End Users–people who use the applications and
databases
Workgroup database with wireless
local area network
SQL Overview
• Structured Query Language
• The standard for relational database
management systems (RDBMS)
• A relational database is a type of database
that organizes data into tables made up of
rows and columns, where data points are
related to each other.
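A small sketch of the idea, using Python's standard-library sqlite3 module: two tables of rows and columns, related through a shared student_id column, and a join that follows that relationship. The table names, column names, and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.executescript(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course     TEXT NOT NULL,
        grade      TEXT
    );
    INSERT INTO student VALUES (1, 'Student A'), (2, 'Student B');
    INSERT INTO enrollment VALUES (1, 'CS112', 'B'), (1, 'CS560', 'B+'), (2, 'CS234', 'A');
    """
)

# A join follows the relationship between the two tables.
rows = conn.execute(
    """
    SELECT s.name, e.course, e.grade
    FROM student AS s
    JOIN enrollment AS e ON e.student_id = s.student_id
    ORDER BY s.name, e.course
    """
).fetchall()
for row in rows:
    print(row)
```

Each student's name is stored once in the student table, and enrollment rows refer to it by student_id instead of repeating it.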
History of SQL
• 1970–E. Codd develops relational database
concept
• 1974-1979–System R with Sequel (later SQL)
created at IBM Research Lab
• 1979–Oracle markets first relational DB with
SQL
• 1986–ANSI SQL standard released
• 1989, 1992, 1999, 2003–Major ANSI standard
updates
• Current–SQL is supported by most major
database vendors
SQL Environment
• Data Definition Language (DDL)
– Commands that define a database, including creating,
altering, and dropping tables and establishing constraints
• Data Manipulation Language (DML)
– Commands that maintain and query a database
• Data Control Language (DCL)
– Commands that control a database, including
administering privileges
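The three command groups can be sketched with Python's standard-library sqlite3 module. The course table and its columns are hypothetical. SQLite has no user accounts and does not implement GRANT or REVOKE, so the DCL statements are shown only as text of the kind a server-based DBMS would accept; the user name in them is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and alter the structure of the database.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER DEFAULT 3")

# DML: maintain and query the data.
conn.execute(
    "INSERT INTO course (code, title) VALUES (?, ?)",
    ("CS117", "Applications of ICT"),
)
conn.execute("UPDATE course SET credits = 4 WHERE code = ?", ("CS117",))
result = conn.execute("SELECT code, title, credits FROM course").fetchall()
print(result)

# DCL: administer privileges. Not executed here because SQLite does not
# support it; the user name is hypothetical.
dcl_examples = [
    "GRANT SELECT ON course TO registrar_clerk;",
    "REVOKE SELECT ON course FROM registrar_clerk;",
]

# DDL again: drop the table.
conn.execute("DROP TABLE course")
```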
DDL, DML, DCL, and the database development process
Big Data
• Large amounts of data are collected and stored every day
– Can come from different sources, in huge amounts, at high update rates
• Examples: Facebook needs to handle 2.7 billion “likes”, 400 million
images, and 500+ TB per day; Google receives more than 1 billion
queries per day
• Question: How can we utilize these datasets to help us achieve our
goals?
– Data Analytics: analyze the data to find useful, previously unknown,
and actionable information