Unit 4.3.
0
Candidates should have an
understanding of how organizations use
ICT, including
sequential file systems (batch processing e.g.
payroll);
Indexed sequential & random access files
(e.g. . payroll and personnel records.
Relational database systems (e.g. customer
database linked to sales records)
Youshould be able to describe these
systems, giving the hardware and
Datais the lifeblood of most
businesses and organizations. Why
do they collect and store data?
Because data is processed (sorted,
filtered, searched) to give us
A database is a collection of data
that is stored in an organized or
logical manner so that data can be
processed effectively or retrieved
quickly and efficiently.
You should recall the following from
GCSE:
Tables
Fields
Records
Some databases exist
solely to process data
automatically – for
example databases held by
utility companies
Some databases exist to
give us information when
we need it: for example,
the school database.
The purpose of the
database obviously affects
the way that data is stored,
Different Which phone
Person? No is right? Inconsistent?
A Flat File is one data file containing a two dimensional table. It
generally contains data duplication as each record is self contained.
Q. How many records are in this flat file?
A record is one row of a table and contains all the data related to a
particular person or thing e.g. a loan record for Joe Smith
Q. How many fields are there in this flat file?
A field is one column of a table and contains one piece of data about a
person or thing e.g BorrowerName
Why does the above flat file store data inefficiently?
A flat file may contain data duplication where the same data item is stored
in two or more different locations.
Unnecessary duplication is known as data redundancy.
Redundancy often leads to inconsistency where the same data item is stored
differently in different places. Eg BorrPhone 454545,454555 due to typing
errors or being updated in one place but not another.
Flat files can be turned into more efficient related tables through a process
of normalisation and put into a database
•Take out any repeating groups of data and put them in a separate table.
•Make sure one field is present in both tables to form a link or relationship
between them.
•Generally this field is unique to one of the tables (its primary key).
•Separate tables only tend to contain data about one kind of thing.
Question: Identify the repeating groups in the flat file above.
•It contains 2 repeating groups, what are they?
•These details can be taken out into separate tables. What fields
should be left behind in the original table?
Original Loan Flat File
BORROWER TABLE LOAN TABLE (Original)
BOOK TABLE
Relational Diagram
NB. Need to add borrowerID to
give the borrower table a
unique key.
Problems with the traditional file approach:-
•Data redundancy - same data duplicated in many different files
•Data inconsistency - when the same items of data are held in
several different files, the data should be updated in each file
when it changes (if not -> data inconsistency)
•Program-data dependence - file format (i.e. which data fields
constitute a record) must be specified in each program.
Changes to the format of the data fields mean that every file
which uses that program has to be changed.
•Lack of flexibility - for non-routine data it could take weeks to
assemble data from various files and write new programs to
produce the required reports
•Non-sharable data - if two departments need the same data,
either a second copy of the data would be made (-> data
inconsistency) or the same file used (adding extra fields would
mean programs would need to be changed to reflect the new
file structure)
A relational database consists of a number
of separate tables
For example a payroll table and a staff table
Tables are linked to each other…
… using a key field
For example the employee ID
This field is part of other table(s)
Data from one table combined with data
from other table(s) when producing reports.
Can select different fields from each table
for output
SQL is used for queries and producing
1. Tables are designed to reduce duplicate data to a minimum and therefore
remove any redundant data
2. No redundant data means data only has to be input once ensuring faster
data entry and consistency of data
3. Changes in the structure of the database do not affect programming that
accesses other parts of the database. This is called Data Independence
from the program. Eg Adding a new field called Gender to the Borrower
table doesn’t mean you have to reprogram the Loans Report. (You would
have to do this on a flat file)
4. Data Pool can be accessed by several different applications
5. Information held more than once (Key fields acting as links between
tables) are automatically updated by the system
6. Increased productivity as users can use report generators to customise
reports to meet particular needs.
7. Different access rights available for different parts of database
1. As all data for a range of applications is held
in one place there are greater security and
confidentiality issues.
• Eg Many users need to view and update
different combinations of tables or records or
fields in a database.
• Eg Very important to back up this data as all
data will be lost if a natural disaster occurs.
2. Backup and restoration processes are more
complex for databases than flat files
A relationship is a link or association between
entities
The links (relationships) may be...
one-to-one
Products and bar-codes in a supermarket.
one-to-many
One video club member may loan a number
of videos.
many-to-many
Pupils and Teachers in a school.
Entity-relationship diagram - diagrammatic way of representing
the relationship between entities in a database.
An entity-relationship diagram shows the links between tables.
One-to-one
E.g. Products in a supermarket each have a
unique barcode number.
One-to-many E.g. A video club member may hire out a
number of videos.
Many-to-many Teachers and pupils in a school. Each teacher
teaches many pupils and each pupil has many
teachers.
DBMS
The DBMS (Database Management System) is a
program which allows the user access to data.
It must...
allow users to create and edit the data
and provide facilities to search the data
using a query language.
allow other applications to use the
data.
create and maintain the data
dictionary
maintain the integrity of the database. On a
multi-access system, this is done by locking
a record or table when a user is editing it.
This means that another user is unable to
edit it at the same time. When the data is
saved it is unlocked.
check passwords of individual users and
only allow that user access to certain parts of
the database.
ensure that recovery is possible if the
database is corrupted.
There are four types of file
organization that you need to know
about:
Serial
Sequential
Indexed Sequential
Direct /Random Access
A serial file is one in which the
records have been stored in the
order in which they have arisen.
They have not been sorted into
any particular order.
Ashopping list is an example
of a non-
computerised serial file.
▪ A collection of records
Anexample of a serial file is an
unsorted transaction file (more on
this in a minute ).
Cannot be used as master
Used as temporary transaction file
Records stored in the order received
A sequential file is one in which the
records are stored in sorted order of
one or more key fields.
Sequential access means that data is
accessed in a predetermined, ordered
sequence.
Sequential access is sometimes the only
way of accessing the data, for example if
it is on a tape.
It may also be the access
method we need to use if the
application requires processing
a sequence of data elements in
order.
Records are usually stored on
tape and processed one after
the other – for example when
utility companies issue bills, or
when businesses produce pay
slips for their workers at the end
of each month.
A collection of records
Stored in key sequence
Adding/deleting record requires
making new file (so that the
sequence is maintained)
Used as master files
Serial files are often used as
transaction files.
Sequential files are used as master
files.
A company’s master file might hold all
the data about every employee
A transaction file might hold a list of all the
employees who have gotten married this
month and changed their names.
The master file would be read one
record at a time
The transaction file would be used to
update the master file
Federer
r
Potte
Windsor
Cole
Hermione Grang
Britney Spears
Cheryl Tweedy
Kate Middleton
Simple file design
Very efficient when most of the
records must be processed e.g. Payroll
Very efficient if the data has a natural
order
Can be stored on inexpensive devices
like magnetic tape.
Entire file must be processed even if a
single record is to be searched.
Transactions have to be sorted before
processing
Overall processing is slow, because you
have to go through each record until you
get to the one you want!
Each record of a file has a key field
which uniquely identifies that record.
An index consists of keys and
addresses, just like an index in a
book:
The pages in a book are stored
sequentially, so you can read through it
page by page
OR
You can look up the page you want
in the index and flick straight to it
An indexed sequential file is a
sequential file (i.e. sorted into order
of a key field) which has an index.
A full index to a file is one in which
there is an entry for every record.
Because each record has an index, we
can access individual records directly,
without having to scroll through all the
other records first.
Indexed sequential files are
important for applications where
data needs to be accessed.....
sequentially , one record after another
OR
randomly using the index.
A company may store details about its
employees as an indexed sequential file.
Sometimes the file is accessed....
sequentially. For example when the
whole of the file is processed to produce
pay slips at the end of the month.
Sometimes the file is accessed....
randomly. Maybe an employee
changes address, or a female employee
gets married and changes her surname.
An indexed sequential file can only be stored
on a random access device e.g. magnetic disc
or CD.
This is because we need a device that will allow
us direct access to random files, rather than
the sequential access that magnetic tape
allows.
Provides flexibility for users who
need both type of access with the
same file
Faster than sequential
Extrastorage space for the index is
required, just like in a book: your text
book would be 372 pages without
the index (go on, check!) but is 380
pages with the index.
Records are read directly from or
written on to the file.
The records are stored at known
address.
The address is calculated by applying
a mathematical function to the key
field.
A random file would have to be stored
on a direct access backing storage
medium e.g. magnetic disc, CD, DVD
Example : Any information retrieval
system. Eg Train timetable system.
Any record can be directly accessed.
Speed of record processing is very
fast.
Up-to-date file because of online
updating.
Concurrent processing is possible.
More complex than sequential
Does not fully use memory
locations
More security and backup problems