Lecture Note 317

The document provides an overview of Geographic Information Systems (GIS), detailing its definition, components, advantages, and applications across various fields such as natural resources, infrastructure, and socio-economic analysis. It emphasizes the integration of technology, data, and personnel in GIS, highlighting its role in decision-making and problem-solving related to geographic data. Additionally, it discusses the relationship between GIS and other technologies like CAD and BIM, and the importance of understanding spatial scales, intents, and time factors in real-world applications.

Uploaded by

Abdullai Lateef

Principles of GIS

Lecture Note on SGI 317

Surv. Okoli, F. U

Department of Surveying & Geoinformatics


2025
What do I do with a GIS?
You are now beginning the study of geographic information science. This discipline is centred
around the fundamentals and applications of geographic information systems, or GIS for short.

So, what is a GIS? What can it do? To give you some idea, consider an example in natural resources
management. Assume that you have been given the following tasks for a particular region (e.g. local
government area, state, country, etc.):

• Inventory available forest and mineral resources.
• Obtain flora and fauna requirements.
• Determine water availability and quality.
• Examine the extent of disease (e.g. dieback).
• Identify which resources are protected or in short supply (e.g. national heritage listing).
• Evaluate how resources are currently being exploited.
• Predict how the availability and quality of these resources will change in the next 10, 20 or
even 100 years.
• Assess conflicts with environment, quality of life, populated areas, visual impact, etc.
• Comply with local, regional and national regulations and legislation.

The more you think about it, the more complex it becomes. Just imagine what you may need: lots (I mean
lots!) of data, access to a range of departments and agencies, various software and hardware, many
personnel, etc. Well...it can be done - you guessed it - using GIS!

What is a Geographic Information System?

An information system applied to geographic data.

System:

A group of connected entities and activities which interact for a common purpose.

For a GIS, the "connected" refers to geography, and the "common purpose" is managing, planning
or decision-making.

Information system attributes (which also apply to GIS):

• decision-oriented reporting
• effective processing of data
• effective management of data
• adequate flexibility
• a satisfying user environment

Lecture Note on SGI 317 Principles of GIS by Surv. Okoli, F.U


How do we formally define a GIS? No one definition exists since there are many different contexts in
which GIS exists. A definition of GIS can be seen from a number of points of view. The definition that
we will use in this course takes into account the various components necessary for the successful
establishment of any GIS:

• Technology (hardware and software)
• People
• Data

Geographic Information System:

"An organised collection of computer hardware, software, geographic data, and personnel designed to
efficiently capture, store, update, manipulate, analyse, and display all forms of geographically referenced
data."

GIS as an information system


GIS are one of many different types of information systems. The traditional Management Information
Systems and Decision Support Systems do not cater for spatial information. There are, however, spatial
information systems that are not geographic, such as Computer Aided Design/Computer Aided
Manufacturing (CAD/CAM) systems which do not handle a "geographic" component.

Other terms for GIS

• Spatial information system
• Spatial data handling system
• Spatial data management system
• Spatial data analysis system
• Land information system
• Land-related information system
• Land resources information system
• Natural resources information system
• Environmental information system
• Planning information system
• Geo-information system
• Geoscience information system
• Geomatics
• Multipurpose geographic data system
• Multipurpose cadastre
• Automated mapping and facilities management (AM/FM)



Advantages of GIS

The advantages of GIS are many and relate to the fact that GIS is an integrating technology, one that
brings together many different applications, data and users. One word that can be used to describe the
benefit of GIS is synergy. In particular, the following can be cited as advantages of GIS:

• Integrates spatial and other (aspatial) data across a diverse range of applications
• Identifies connections between activities based on geographic proximity
• Manipulates and displays geographic knowledge
• Provides access to administrative records
• Serves as a tool for enhancing decision making
• Increases the ability to model science and management problems
• Acts as a catalyst for further development

Areas of application of GIS technology

The applications of GIS technology can be categorised into four broad areas:

Natural resources

• wildlife habitat
• wild and scenic rivers
• recreation resources
• floodplains
• wetlands
• agricultural lands
• aquifers
• forests
• minerals and exploration
• oil and gas

Land parcel-based

• zoning - urban and regional
• subdivision planning and review
• environmental impact assessment
• water quality management
• maintenance of land ownership
• land valuation and taxation
• town planning schemes

Infrastructure

• transport route planning
• street address matching
• location analysis, site selection
• disaster planning and evacuation
• usage and planning of roads, sewer and water reticulation, drainage, telephone lines,
gas and electricity, etc.

Socio-economic

• population distribution and forecasting
• demographic marketing and analysis
• monitoring of patient health
• epidemiology
• police crime statistics and monitoring
• census information
• public services and access


GIS-Related Disciplines

GIS have developed over time across a wide range of disciplines. As a matter of fact, the whole
foundational concept of GIS is multi-disciplinary.

Disciplines involved:
Computer science
Remote sensing
Cartography
Statistics
Geodesy
Photogrammetry
Surveying
Geography
Geosciences: geology, geophysics, minerals and petroleum, etc.
Mathematics: geometry, graph theory
Information systems
Urban and regional planning, etc.

GIS relationship to CAD and BIM


CAD, BIM, and GIS are distinct but interconnected technologies used in design, construction, and
management of infrastructure. CAD (Computer-Aided Design) focuses on creating detailed drawings and
models, BIM (Building Information Modelling) extends CAD by adding building information, and GIS
(Geographic Information System) provides spatial context and analysis capabilities. Integrating these
technologies allows for more efficient and informed decision-making throughout the project lifecycle.

• CAD:
Primarily used for creating and modifying 2D and 3D drawings and models, CAD software is
essential for the design and planning of various projects.
• BIM:
Building Information Modelling goes beyond CAD by creating digital representations of physical
and functional characteristics of a facility. It allows for collaboration, analysis, and simulation of
design, construction, and operational phases.
• GIS:
Geographic Information Systems provide a framework for managing and analysing spatial data,
including maps, imagery, and other geospatial information. GIS enables integration of project
data with its location and surroundings, facilitating spatial analysis and decision-making.

Integration Benefits:
Integrating CAD, BIM, and GIS allows for:
• Improved design coordination and collaboration between stakeholders.
• Enhanced spatial analysis and visualization for better decision-making.
• Streamlined workflows and reduced errors throughout the project lifecycle.
• Creation of a "digital twin" of a physical structure for improved design, construction, and
operations.



Components of GIS
GIS is made up of six (6) major components, namely:
I. Hardware components
Hardware consists of the computer system on which the GIS software runs. The choice of hardware
ranges from personal computers to multi-user supercomputers. These computers need an efficient
processor to run the software and sufficient memory and storage to hold the data.
Broadly, GIS hardware can be subdivided into three subsystems.

a. Data acquisition hardware:
This includes:
- Modern/automated surveying instruments such as total stations, scan stations, DGPS,
electronic theodolites and digital levels.
- Analytical and digital photogrammetric plotters.
- Digital image processing equipment.

b. Data storage, manipulation and retrieval hardware:
This includes the host computer, which may be a PC, workstation, mainframe, etc.

c. Information presentation hardware:
This includes monitors, printers, plotters, etc.

II. Software component

GIS software provides the functions and tools needed to store, analyze, and display geographic
information. The available software tends to be application specific.
GIS software packages generally meet all these requirements, but their on-screen appearance (user
interface) may differ. Software specifically designed for geographical data processing is capable of
managing vector, raster or hybrid data.
The main software components of a GIS are:
➢ the Database Management System (DBMS);
➢ the basic functions, managed through an appropriate user interface, which are dedicated to:
– data input and validation;
– data storage and database management;
– data analysis and processing;
– data output.
The basic functions are controlled through the Structured Query Language (SQL). SQL is a computer
language designed for the retrieval and management of data in Relational Database Management Systems
(RDBMS), for database schema creation and modification, and for database access management.
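
To make the role of SQL concrete, the sketch below uses Python's built-in sqlite3 module as a stand-in RDBMS. The table name, column names and values are made up for illustration and do not come from any particular GIS package. It shows the three uses of SQL mentioned above in miniature: schema creation, data input and retrieval.

```python
import sqlite3

# In-memory database standing in for the attribute side of a GIS.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Schema creation (hypothetical table and column names).
cur.execute(
    "CREATE TABLE parcels ("
    " parcel_id INTEGER PRIMARY KEY,"
    " land_use TEXT NOT NULL,"
    " area_ha REAL NOT NULL)"
)

# Data input (illustrative values only).
rows = [
    (1, "residential", 0.5),
    (2, "agricultural", 12.0),
    (3, "residential", 0.8),
    (4, "forest", 40.0),
]
cur.executemany("INSERT INTO parcels VALUES (?, ?, ?)", rows)
conn.commit()

# Retrieval: total area per land-use class, largest first.
cur.execute(
    "SELECT land_use, SUM(area_ha) FROM parcels"
    " GROUP BY land_use ORDER BY SUM(area_ha) DESC"
)
summary = cur.fetchall()
for land_use, total in summary:
    print(f"{land_use}: {total:.1f} ha")
conn.close()
```

In a full GIS the same kind of query would normally be combined with a spatial condition, which plain SQLite does not provide without an extension.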

III. Data (Spatial Database)


Geographic data and related tabular data are the backbone of GIS. They can be collected in-house or
purchased from a commercial data provider. The digital map forms the basic data input for GIS.
Tabular data related to the map objects can also be attached to the digital data. A GIS will integrate
spatial data with other data resources and can even use a DBMS. Data are essential for a GIS; the
quality of every result depends on the availability, accuracy and homogeneity of the data. The
cost of data collection exceeds both the software and the hardware costs, and it is estimated
at 70% of the total cost needed to obtain the final output product.
Thus, it is essential that each data file entering a GIS is accompanied by information declaring its quality.
This is called metadata and must communicate the origin, the reliability, the precision, the
completeness, the consistency and the currency (date of last update) of the data it refers to. To make
the data available and useful to the largest number of users, they have to conform to a pre-defined
standard, or data transfer standard.
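
As a small illustration of the metadata requirement, the sketch below checks whether a metadata record declares each of the quality elements listed above. The key names and the sample record are assumptions made for this example; real metadata standards define their own field names.

```python
# Quality elements named in the text; the key names are illustrative only.
REQUIRED_FIELDS = (
    "origin", "reliability", "precision",
    "completeness", "consistency", "last_updated",
)

def missing_metadata(record):
    """Return the required quality fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]

# Hypothetical metadata record for a roads layer.
roads_metadata = {
    "origin": "digitized from hardcopy topographic sheets",
    "precision": "+/- 25 m",
    "completeness": "major roads only",
    "last_updated": "2024-11-01",
}

print(missing_metadata(roads_metadata))
```

A data file whose record returns a non-empty list would be flagged before it enters the GIS.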
The working steps when using an already structured GIS are:
➢ data acquisition: data sources can be ground surveys devoted to the collection of continuous or
discontinuous point data, aerial and satellite images, ground and/or aerial Laser Scanning
Systems (LSS), existing hardcopy maps, socio-economic databases, statistical databases, etc.;
➢ data input: data must be transformed into an appropriate format that can be managed by the system.
The geometrical data input can come from digitization of hardcopy maps, photogrammetric
stereo-plotting, scanning systems, etc.; the alphanumerical data input comes from keyboard
typing or table computations;
➢ pre-processing: during this phase, the data required for the selected application are made suitable
to be managed by the software. For each process, it is important to perform rigorous quality tests
on both alphanumerical and geographical data, in order to qualify the obtained results;
➢ data management: the Database Management Systems (DBMS) are the software tools in charge
of this task. Management covers both the graphical (spatial) and the alphanumerical (non-spatial)
data. Sometimes graphical data are managed separately from alphanumerical data because of their
different characteristics;
➢ data presentation: results produced inside a GIS are commonly presented through layouts
showing thematic maps whose degree of detail depends on the density of information, that is, on
the scale of the layout itself. Thematic maps are an effective tool for integrating in a single drawing
both the geometry of the features and some of their attributes.

IV. Methods/Procedure
A successful GIS operates according to a well-designed plan, made up of the models and operating
practices unique to each task. Various techniques are used to create maps and to use them in a
project. Maps can be created by automated raster-to-vector conversion or vectorized manually
from scanned images. The source of these digital maps can be either maps prepared by a
survey agency or satellite imagery.

V. People or Expertise
GIS users range from technical specialists who design and maintain the system to those who use it to help
them perform their everyday work. GIS operators solve real-world spatial problems. They plan, implement
and operate the system, and draw conclusions that support decision making.

VI. Network
Networks in GIS can be represented using different data structures, such as vector-based or graph-based
representations. Vector-based networks use lines or polylines to represent edges and points for nodes,
while graph-based networks use nodes and edges to form a graph structure. Graph-based representations
are often more flexible and efficient for network analysis tasks.
GIS software provides tools and functionalities to build, manage, and analyze networks. These tools allow
users to define network connectivity rules, assign attributes to nodes and edges, calculate distances and
routes, perform network-based queries, and visualize network relationships.
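
The graph-based representation and the route calculation mentioned above can be sketched as follows. This is a minimal illustration, not the routine used by any particular GIS package: the road network, its node names and its edge lengths are invented, and Dijkstra's algorithm is used as one common way to find the shortest route.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph; returns (cost, path)."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []  # target not reachable

# Hypothetical road network: nodes are junctions, edge weights are lengths in km.
roads = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"A": 4.0, "C": 1.0, "D": 5.0},
    "C": {"A": 2.0, "B": 1.0, "D": 8.0},
    "D": {"B": 5.0, "C": 8.0},
}

cost, path = shortest_path(roads, "A", "D")
print(cost, path)  # 8.0 ['A', 'C', 'B', 'D']
```

Edge attributes other than length (travel time, cost, one-way restrictions) can be handled the same way by changing the weights or the connectivity stored in the dictionary.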



Solving Real World Problems

Almost everything that happens, happens somewhere. Largely, we humans are confined in our activities
to the surface and near-surface of the Earth: human activities take place above, on or beneath
the surface. Keeping track of all of this activity is important, and
knowing where it occurs can be the most convenient basis for tracking. Knowing where something
happens is of critical importance if we want to go there ourselves or send someone there, to find other
information about the same place, or to inform people who live nearby. In addition, most (perhaps all)
decisions have geographic consequences, e.g., adopting a particular funding formula creates geographic
winners and losers, especially when the process is zero-sum. Therefore, geographic location is
an important attribute of activities, policies, strategies, and plans. Geographic information systems are a
special class of information systems that keep track not only of events, activities, and things, but also of
where these events, activities, and things happen or exist.
Since real-world problems are geographic in nature, how do we distinguish one geographic
problem from another? There are several ways to do that, but we will focus on three major categories.

1. Spatial scale:
An engineer's design of a building can present geographic problems, as in disaster management, but only
at a very detailed or local scale. The information needed to service the building is also local: the size and
shape of the parcel, the vertical and subterranean extent of the building, the slope of the land, and its
accessibility using normal and emergency infrastructure. The global spread of COVID-19 from 2019, or of
bird flu in 2004, were problems at a much broader and coarser scale, involving information about entire
national populations and global transport patterns.

2. Spatial intent or purpose


Some problems are strictly practical in nature – they must often be solved as quickly as possible and/or at
minimum cost, in order to achieve such practical objectives as saving money, avoiding fines by regulators,
or coping with an emergency. Others are better characterized as driven by human curiosity. When
geographic data are used to verify the theory of continental drift, or to map distributions of glacial deposits,
or to analyze the historic movements of people in anthropological or archaeological research, there is no
sense of an immediate problem that needs to be solved. Rather, the intent is the advancement of human
understanding of the world, which we often recognize as the intent of science.

3. Spatial time factor or time scale


Some decisions are operational, and are required for the smooth functioning of an organization, such as
how to control electricity inputs into grids that experience daily surges and troughs in usage. Others are
tactical, and concerned with medium-term decisions, such as where to cut trees in next year’s forest
harvesting plan. Others are strategic, and are required to give an organization long-term direction, as when
retailers decide to expand or rationalize their store networks. Geographic databases are often transactional,
meaning that they are constantly being updated as new information arrives, unlike maps, which stay the
same once printed.

The complexity of the real world, as well as the broad spectrum of its interpretations, suggests that GIS
system designs will vary according to the capabilities and preferences of their creators. This human factor
can introduce an element of constraint, as data compiled for a particular application may be less useful
elsewhere. Using GIS to solve problems in the real world requires interaction between the real world, the
GIS and the users. The real-world problems can be described only in terms of models that delineate the
concepts and procedures needed to translate real world observations into data that are meaningful in GIS.



The process of interpreting reality by using both a real-world model and a data model is called data modelling.

The real-world problem needs to be represented within a GIS. Users perceive the real world in a
manner related to their problem, and hence need to be able to communicate with the GIS in terms related
to that problem (i.e. data, functionality, etc.) when modelling it in the GIS environment. The GIS then
solves the modelled problem and presents the results in various forms, such as maps, for decision making.

How do we represent the real world?

Geographic features in the real world can be represented in a number of ways as follows:

1. Analogue map

• The traditional analogue map has been in use for centuries!
• Divided into physical map sheets
• Based on the communication paradigm - emphasis is on visual communication.

2. Digital map

• Maps are stored in digital form on computers to create a cartographic database
• Still based on the "analogue" map concept
• Has greatly enhanced the map-making process and the production of various types of maps.

3. GIS

• A geographic database involves much more than a cartographic database (i.e. much more than
simply a map or maps)
• The emphasis is on the structure and management of data and their relationships
• Based on the analytical paradigm - focus is on analysis
• The concepts of GIS extend far beyond the map!



Abstraction and Generalization

The process for obtaining a representation of the real world follows the cartographic process for
abstraction and generalization. The process involves the steps of selection, classification, simplification
and symbolization.

The process for obtaining a GIS representation must consider the purpose, content and detail of the
database. This is similar to the cartographic map-making process in which the purpose, content,
cartographic scale and presentation must be considered in producing a map.

Steps of the generalization process

The steps of the process of abstraction and generalization are described as follows:

• Selection. Involves decisions regarding the geographic space to be mapped, map scale, map
coordinates and projection, data variables to be mapped, data gathering/sampling techniques.

• Classification. Process in which objects are placed in groups according to similar properties.
This reduces the complexity and improves the organization of a map.

• Simplification. Map features can be simplified by smoothing curves and straightening paths
to eliminate unnecessary detail. For example, a straight line between two cities could indicate
the connectivity between cities rather than the exact positional location of a road which may
be irrelevant for a particular application.

• Symbolization. A set of marks or symbols is used to represent real world phenomena on a
map. Such symbolization involves defining size, shape, pattern, and color for points, lines,
and polygons (areas).
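
The simplification step described above can be sketched in code. The notes do not name a specific method, so the example below uses the Douglas-Peucker algorithm, a widely used line simplification technique, as an assumed illustration. The sample line and the tolerance value are invented.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)

def simplify(points, tolerance):
    """Douglas-Peucker: keep only vertices that deviate more than tolerance."""
    if len(points) < 3:
        return list(points)
    distances = [perpendicular_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    max_d = max(distances)
    if max_d <= tolerance:
        # Every intermediate vertex is close to the chord: keep the end points only.
        return [points[0], points[-1]]
    index = distances.index(max_d) + 1
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex

# Hypothetical road centre line in map units.
road = [(0, 0), (1, 0.1), (2, 3), (3, 0.1), (4, 0)]
print(simplify(road, 1.0))  # [(0, 0), (2, 3), (4, 0)]
```

A larger tolerance removes more detail, which corresponds to generalizing the feature for a smaller map scale.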



Data representation with GIS

In many ways, GIS have retained the notion of the map, and many map concepts reappear in GIS.
However, the manner in which GIS handle and analyse data is very different from that for maps. This is
despite the fact that much of the data input into GIS is derived from maps.

Within GIS, data is often structured in a layered fashion, reflecting the way in which maps have
traditionally been handled. Each layer, also known as a coverage, contains some specific data such as a
theme (e.g. roads, vegetation cover, soils, etc.), time period (e.g. years 1970, 1980, 1990) or vertical slice
(e.g. ground floor, first floor, etc. of a building).

Geographic data includes both spatial data and descriptive (or attribute) data. Spatial data
deals with location, shape and relationships among features. Attribute data deals with the
characteristics of the features.

Essential GIS components

Every GIS must include:

• data
• functionality, and
• a user interface.

The database is the heart of the GIS. It must be structured so that the data can be accessed by functions
initiated by users. In the following sections, we will consider the structure of the data as well as the
functions that operate on the data.



Structure of geographic data

The following chart illustrates the structure of geographic data.

The spatial component consists of locational information (i.e. absolute or relative X, Y coordinates),
geometry (i.e. shape of point, line and polygon features, or raster cells) and topology (i.e. relationships
between points, lines and polygons: adjacency, connectivity, and containment). Attribute data can consist
of both descriptive data and cartographic attributes (e.g. line color and thickness, point symbol, etc.). A
third component is temporal data, which is sometimes considered as a further dimension (e.g. a fourth
dimension) but is often included as another attribute of the data. Never forget metadata.
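
One way to picture this structure is as a single record that carries each component. The sketch below is only an illustration: the class, its field names and the sample values are assumptions, not the storage format of any GIS. Topology is left out because it describes relationships between features rather than properties of one feature.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    """One geographic feature; all field names here are illustrative."""
    geometry_type: str                               # "point", "line" or "polygon"
    coordinates: list                                # locational data as (x, y) pairs
    attributes: dict = field(default_factory=dict)   # descriptive data
    symbology: dict = field(default_factory=dict)    # cartographic attributes
    observed: str = ""                               # temporal data kept as an attribute
    metadata: dict = field(default_factory=dict)     # data about the data

def length(feature):
    """Length of a line feature computed from its coordinates (planar units)."""
    pts = feature.coordinates
    return sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

# Hypothetical road feature.
road = Feature(
    geometry_type="line",
    coordinates=[(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)],
    attributes={"name": "Main Road", "lanes": 2},
    symbology={"line_color": "red", "line_width": 1.5},
    observed="2025-01-15",
    metadata={"source": "field survey"},
)

print(road.attributes["name"], length(road))
```

The point of the sketch is that geometry answers "where", attributes answer "what", and the date answers "when", while the same coordinates can also feed calculations such as length.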

Model of Space (Spatial modelling)

[Diagram: the three basic units of the model are WHAT (phenomenon), WHERE (space) and WHEN (time).]



The basic philosophy of the above modelling diagram is that the three basic units depend on one
another. The usual assumption is that one unit is held constant (typically time), a second is
predefined, and the third is the unknown to be measured.

In relation to geographic location, the 'what' becomes the object (phenomenon), the 'where' becomes
space (location) and the 'when' becomes time (date). Which unit is predefined and which is measured
depends on the view taken, that is, on whether a field-based or an object-based model is used.

Object-Based Model:

An object is a spatial feature with characteristics such as a spatial boundary, relevance to the
application and a feature description (attributes). Spatial objects represent discrete features with
well-defined or identifiable boundaries, for example, buildings, parks, forest lands, geomorphological
boundaries, soil types, etc. In this model, data can be obtained by field surveying methods (chain-tape,
theodolite and total station surveying, GPS/DGPS survey) or laboratory methods (aerial photo
interpretation, remote sensing image analysis and on-screen digitization). Depending on the nature of
the spatial objects, we may represent them as graphical elements: points, lines and polygons.

Field-Based Model:

Spatial phenomena are real world features that vary continuously over space with no specific boundary.
Data for spatial phenomena may be organized as fields, which are obtained from direct or indirect sources.
Direct sources include aerial photos, remote sensing imagery, scanning of hard copy maps, and field
investigations made at selected sample locations. Indirect sources generate the data by applying
mathematical functions such as interpolation, sampling or reclassification to values from selected
sample locations. For example, a Digital Elevation Model (DEM) can be generated from topographic data
such as spot heights and contours.

A spatial database may be organized on either the object-based model or the field-based model.
In object-based databases, the spatial units are discrete objects, which can also be obtained from
field-based data by means of object recognition and mathematical interpolation. In the object-based
model, spatial data is mostly represented in the form of coordinate lists (i.e. vector lines), and this
representation is generally called the vector data model. When a database of a spatial phenomenon is
structured on the field-based model as a grid of square or rectangular cells, the representation is
generally called the raster data model.

A geospatial database has two distinct components: locations and attributes.
Geographical features in the real world are very difficult to capture and may require a large database.
GIS can organize reality through data models. Each model tends to fit certain types of data and
applications better than others. All spatial data models fall into two basic categories: raster and vector.
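
To illustrate how a field-based (raster) surface can be generated from scattered samples, the sketch below interpolates a small grid from a handful of spot heights. Inverse distance weighting (IDW) is used here as one simple interpolation method; the notes do not prescribe a method, and the spot heights, grid size and cell size are invented.

```python
def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) samples."""
    num = den = 0.0
    for sx, sy, value in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        if d2 == 0.0:
            return value                  # exactly on a sample point
        w = 1.0 / d2 ** (power / 2.0)     # weight falls off with distance
        num += w * value
        den += w
    return num / den

def rasterize(samples, ncols, nrows, cell_size):
    """Fill a grid (list of rows, row 0 at the lowest y) at the cell centres."""
    return [[idw(samples, (c + 0.5) * cell_size, (r + 0.5) * cell_size)
             for c in range(ncols)]
            for r in range(nrows)]

# Hypothetical spot heights: (x, y, elevation in metres).
spot_heights = [
    (0.0, 0.0, 100.0), (10.0, 0.0, 120.0),
    (0.0, 10.0, 110.0), (10.0, 10.0, 130.0),
]

dem = rasterize(spot_heights, ncols=2, nrows=2, cell_size=5.0)
for row in dem:
    print([round(v, 1) for v in row])
```

The spot heights are the object-like (vector) input, and the resulting grid of cells is the raster representation of the continuous elevation field.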

GIS Reference Systems


Reference systems in GIS integrate the concepts of geodesy, coordinate systems, datums, and map
projections. Every GIS dataset inherently includes a spatial reference system. This system may be
a simple, arbitrary framework, such as a 10-meter by 10-meter sampling grid within a wooded
area or the defined boundaries of a soccer field. Alternatively, it may be a geographic reference
system, where spatial features are tied to an Earth-based coordinate framework. The emphasis of
this topic is on Earth-referenced systems, which are typically based on either a Geographic
Coordinate System (GCS) or a Projected Coordinate System (PCS).

Geographic Coordinate Systems


A geographic coordinate system is a reference system for identifying locations on the curved
surface of the earth. Locations on the earth’s surface are measured in angular units from the centre
of the earth relative to two planes: the plane defined by the equator and the plane defined by the
prime meridian (which passes through Greenwich, England). A location is therefore defined by two
values: a latitude and a longitude. Latitude measures the angle from the
equatorial plane to the location on the earth’s surface. Longitude measures the angle between
the prime meridian plane and the north-south plane that intersects the location of interest.
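
The two angles can be turned into a position in three-dimensional space, which makes the definition above easier to picture. The sketch below assumes a perfect sphere of radius 1 and an axis convention chosen for this example: x points to where the equator meets the prime meridian, y points to 90 degrees east on the equator, and z points to the north pole.

```python
import math

def to_unit_sphere(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to (x, y, z) on a unit sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = math.cos(lat) * math.cos(lon)   # towards equator / prime meridian intersection
    y = math.cos(lat) * math.sin(lon)   # towards the equator at 90 degrees east
    z = math.sin(lat)                   # towards the north pole
    return x, y, z

print(to_unit_sphere(0.0, 0.0))    # on the equator at the prime meridian
print(to_unit_sphere(90.0, 0.0))   # north pole (x and y are zero up to rounding)
```

Latitude alone fixes the height z above the equatorial plane, while longitude rotates the point around the polar axis.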

Sphere and Ellipsoid
Assuming that the earth is a perfect sphere greatly simplifies mathematical calculations and works
well for small-scale maps (maps that show a large area of the earth). However, when working at
larger scales, an ellipsoid representation of the earth may be preferred if accurate measurements are
needed. An ellipsoid is defined by two radii: the semi-major axis (the equatorial radius) and the
semi-minor axis (the polar radius).
The earth’s slightly ellipsoidal shape results from its rotation, which causes it to bulge at the
equator (a centrifugal effect). As a result, the equatorial radius is roughly 21 km longer
than the polar radius.
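
The figure of roughly 21 km can be checked with the two radii of a specific ellipsoid. The sketch below uses the WGS 84 ellipsoid, an assumption made for this example since the paragraph above does not name one, and derives the flattening and eccentricity from the same two numbers.

```python
# WGS 84 ellipsoid parameters in metres.
a = 6378137.0         # semi-major axis (equatorial radius)
b = 6356752.314245    # semi-minor axis (polar radius), derived from a and 1/f

f = (a - b) / a                  # flattening
e2 = (a**2 - b**2) / a**2        # first eccentricity squared

print(f"difference in radii: {(a - b) / 1000:.1f} km")
print(f"inverse flattening:  {1 / f:.3f}")
print(f"eccentricity^2:      {e2:.6f}")
```

The difference is about 21.4 km, which is only about one part in 298 of the equatorial radius, so the earth is very close to a sphere.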

Geoid
Representing the earth’s true shape, the geoid, as a mathematical model is crucial for a GIS
environment. However, the earth’s shape is not a perfectly smooth surface. It has undulations
resulting from changes in gravitational pull across its surface. These undulations may not be
visible with the naked eye, but they are measurable and can influence locational measurements.



Note that we are not including mountains and ocean bottoms in our discussion, instead we are
focusing solely on the earth’s gravitational potential which can be best visualized by imagining
the earth’s surface completely immersed in water and measuring the distance from the earth’s
center to the water surface over the entire earth surface.
The earth’s gravitational field is dynamic and is tied to the flow of the earth’s hot and fluid core.
Hence its geoid is constantly changing, albeit at a large temporal scale. The measurement and
representation of the earth’s shape is at the heart of geodesy–a branch of applied mathematics.

Fig. Geoid

Datums
Datums provide a reference surface for measuring positions on the Earth and are the basis for
coordinate systems. So how are we to reconcile our need to work with a (simple) mathematical
model of the earth’s shape with the undulating nature of the earth’s surface (i.e. its geoid)? The
solution is to align the geoid with the ellipsoid (or sphere) representation of the earth and to map
the earth’s surface features onto this ellipsoid/sphere. The alignment can be local where the
ellipsoid surface is closely fit to the geoid at a particular location on the earth’s surface (such as
the state of Kansas) or geocentric where the ellipsoid is aligned with the centre of the earth. How
one chooses to align the ellipsoid to the geoid defines a datum.



Local Datum
A local datum is a geodetic reference system that is optimized for a specific region, with its origin
typically located at a known point on the Earth's surface within that region. It provides high
positional accuracy in the local area but may not align well with global reference systems like
WGS 84.
Geocentric Datum
A geocentric datum is a type of geodetic reference system in which the origin of the coordinate
system is located at the Earth's centre of mass. This means that the coordinates are defined relative
to a point that closely represents the Earth's centre as determined by satellite measurements, rather
than a local point on the Earth's surface.
Geocentric datums are global in nature and are designed to provide accurate positioning
worldwide. One of the most widely used geocentric datums is the World Geodetic System 1984
(WGS 84), which underpins the Global Positioning System (GPS).
Key features of a geocentric datum:
i. Origin at the Earth's centre of mass (including oceans, atmosphere, and continents)
ii. Compatible with satellite-based positioning systems
iii. Provides consistent global positioning across regions
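To make the idea of an origin at the Earth's centre of mass concrete, the sketch below converts latitude, longitude and ellipsoidal height into geocentric Cartesian (X, Y, Z) coordinates on the WGS 84 ellipsoid. The function name is illustrative and not part of any GIS package; the two constants are the published WGS 84 semi-major axis and inverse flattening.

```python
import math

# WGS 84 defining constants: semi-major axis (metres) and inverse flattening
WGS84_A = 6378137.0
WGS84_INV_F = 298.257223563


def geodetic_to_ecef(lat_deg, lon_deg, h=0.0):
    """Convert latitude, longitude (degrees) and ellipsoidal height (m) to geocentric X, Y, Z (m)."""
    f = 1.0 / WGS84_INV_F
    e2 = f * (2.0 - f)  # first eccentricity squared
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return x, y, z


if __name__ == "__main__":
    # A point on the equator lies one semi-major axis from the centre
    print(geodetic_to_ecef(0.0, 0.0))
    # The pole lies one semi-minor axis from the centre, along the Z axis
    print(geodetic_to_ecef(90.0, 0.0))
```

The difference between the equatorial X value and the polar Z value is the flattening of the ellipsoid expressed in metres.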

Local Datum in Nigeria


1. Minna Datum (also known as Nigeria 1962 or Minna 1962)
Origin Point: Minna, Nigeria (Latitude 9° 38' N, Longitude 6° 30' E)
Reference Ellipsoid: Clarke 1880 (modified)
Datum Type: Local
Use: Traditionally used for national mapping and geodetic control in Nigeria. The Minna datum
defines the shape of the Earth using the Clarke 1880 ellipsoid, and it is georeferenced to a physical
point in Minna, central Nigeria. Because of its local nature, it aligns well with Nigeria’s surface
features but does not match GPS coordinates unless transformed.
2. Nigeria Transverse Mercator (NTM) Projection
Often used in conjunction with the Minna Datum for topographic and cadastral mapping. The
projection divides Nigeria into multiple zones to minimize distortion across large extents.

Projected Coordinate Systems


The surface of the earth is curved but maps are flat. A projected coordinate system (PCS) is a
reference system for identifying locations and measuring features on a flat (map) surface. It
consists of lines that intersect at right angles, forming a grid. Projected coordinate systems (which
are based on Cartesian coordinates) have an origin, an x axis, a y axis, and a linear unit of measure.
Going from a GCS to a PCS requires mathematical transformations. The myriad projection
types can be aggregated into three groups: planar, cylindrical and conical.
Planar Projections

A planar projection (also known as an azimuthal projection) maps the earth’s surface features onto
a flat surface that touches the earth’s surface at a point (tangent case) or cuts through it along a
circle (secant case).

This projection is often used in mapping polar regions but can be used for any location on the
earth’s surface (in which case it is called an oblique planar projection).

Cylindrical Projection
A cylindrical map projection maps the earth surface onto a map rolled into a cylinder (which can
then be flattened into a plane). The cylinder can touch the surface of the earth along a single line
of tangency (a tangent case), or along two lines of tangency (a secant case).

The cylinder can be tangent to the equator or it can be oblique. A special case is the transverse
aspect, in which the cylinder is tangent along a line of longitude. This is a popular projection used in
defining the Universal Transverse Mercator (UTM) and State Plane coordinate systems. The UTM PCS
covers most of the globe (between 80° S and 84° N) and is a popular coordinate system in the US. It is
important to note that the UTM PCS is divided into zones that are 6° of longitude wide, and the
coordinates of each zone apply only within that zone. For
example, the State of Maine (USA) uses the UTM coordinate system (Zone 19 North) for most of
its statewide GIS maps. Most USGS quad maps are also presented in a UTM coordinate system.
Popular datums tied to the UTM coordinate system in the US include NAD27 and NAD83. There
is also a WGS84 based UTM coordinate system.
Distortion is minimized along the tangent or secant lines and increases as the distance from these
lines increases.
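Because UTM zones are 6° of longitude wide and numbered eastward from 180° W, the zone for a given longitude follows from simple arithmetic. The sketch below is a minimal illustration with hypothetical function names; it ignores the special zone exceptions around Norway and Svalbard.

```python
import math


def utm_zone(lon_deg):
    """Return the standard UTM zone number (1 to 60) for a longitude in degrees."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    if lon_deg == 180.0:
        return 60  # the antimeridian is assigned to the last zone
    return int(math.floor((lon_deg + 180.0) / 6.0)) + 1


def central_meridian(zone):
    """Return the central meridian (degrees) of a UTM zone."""
    return zone * 6 - 183


if __name__ == "__main__":
    for name, lon in [("Maine, USA", -69.0), ("Minna, Nigeria", 6.5)]:
        zone = utm_zone(lon)
        print(name, "-> zone", zone, "central meridian", central_meridian(zone))
```

A longitude of 69° W falls in Zone 19, which agrees with the Maine example above.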

Conical Projection
A conical map projection maps the earth surface onto a map rolled into a cone. Like the cylindrical
projection, the cone can touch the surface of the earth along a single line of tangency (a tangent
case), or along two lines of tangency (a secant case).

Distortion is minimized along the tangent or secant lines and increases as the distance from these
lines increases. When distance or area measurements are needed for the contiguous 48 states, use
one of the conical projections such as Equidistant Conic (distance preserving) or Albers Equal
Area Conic (area preserving).
Conical projections are also popular PCSs for European maps, such as Europe Albers Equal Area
Conic and Europe Lambert Conformal Conic.

Spatial Properties
All projections distort real-world geographic features to some degree. The four spatial properties
that are subject to distortion are: shape, area, distance and direction. A map that preserves shape
is called conformal; one that preserves area is called equal-area; one that preserves distance is
called equidistant; and one that preserves direction is called azimuthal. Each map projection is
good at preserving only one or two of the four spatial properties. So when working with small-
scale (large area) maps and when multiple spatial properties are to be preserved, it is best to break
the analyses across different projections to minimize errors associated with spatial distortion.
If you want to assess a projection’s spatial distortion across your study region, you can generate
Tissot indicatrix (TI) ellipses. The idea is to project a small circle (i.e. small enough so that the
distortion remains relatively uniform across the circle’s extent) and to measure its distorted shape
on the projected map. For example, in assessing the type of distortion one could expect with a
Mollweide projection across the continental US, a grid of circles could be generated at regular
latitudinal and longitudinal intervals.
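The distortion that a Tissot ellipse displays can also be computed directly from a projection's scale factors. The sketch below does not implement the Mollweide projection; as a simplifying assumption it uses two cylindrical projections on a sphere (Mercator and plate carrée) whose scale factors have a simple closed form. Here h is the scale along the meridian and k the scale along the parallel, and for these two projections they are the semi-axes of the Tissot ellipse.

```python
import math


def scale_factors(projection, lat_deg):
    """Return (h, k): scale along the meridian and along the parallel, on a sphere."""
    sec = 1.0 / math.cos(math.radians(lat_deg))
    if projection == "mercator":
        return sec, sec      # equal stretching in both directions
    if projection == "plate_carree":
        return 1.0, sec      # only the parallels are stretched
    raise ValueError("unsupported projection: " + projection)


def describe(projection, lat_deg):
    """Summarise the distortion of a small circle at the given latitude."""
    h, k = scale_factors(projection, lat_deg)
    return {
        "h": h,
        "k": k,
        "area_scale": h * k,                 # 1.0 would mean area is preserved
        "conformal": math.isclose(h, k),     # equal axes: the circle stays a circle
    }


if __name__ == "__main__":
    for lat in (0, 30, 60):
        print(lat, describe("mercator", lat), describe("plate_carree", lat))
```

At the equator both projections leave the circle unchanged. Away from it, Mercator keeps the circle round but enlarges it, while plate carrée stretches it into an east-west ellipse.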

Data formats
In Geographic Information Systems (GIS), data formats refer to the structured ways in which
spatial and attribute data are stored, processed, and exchanged. These formats can be broadly
categorized into vector, raster, and non-spatial/tabular formats, each serving different types of
spatial analysis and visualization.

1. Vector Data Formats

Vector data represents geographic features using points, lines, and polygons.

• Shapefile (.shp, .shx, .dbf) – A popular Esri format; stores geometry and attribute data
separately.
• GeoJSON (.geojson) – A lightweight, web-friendly format using JavaScript Object
Notation (JSON).
• GPKG (GeoPackage) – An open standard based on SQLite; stores vector and raster data in
one file.
• KML/KMZ (.kml/.kmz) – Used in Google Earth for sharing geospatial data with styling
and metadata.
• File Geodatabase (.gdb) – Esri’s modern format that supports advanced data management
and large datasets.
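As an illustration of how lightweight a vector format can be, the sketch below builds a small GeoJSON document using only the Python standard library. The helper names are hypothetical. The coordinates are the Minna datum origin quoted earlier in these notes and serve only as sample numbers; GeoJSON itself assumes WGS 84 longitude and latitude.

```python
import json


def dm_to_decimal(degrees, minutes):
    """Convert degrees and minutes to decimal degrees."""
    return degrees + minutes / 60.0


def point_feature(lon, lat, properties):
    """Build a GeoJSON Feature for one point; coordinates are ordered [longitude, latitude]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Geometry (the point) and attributes (the properties) travel together in one object
minna = point_feature(dm_to_decimal(6, 30), dm_to_decimal(9, 38), {"name": "Minna"})
collection = {"type": "FeatureCollection", "features": [minna]}

geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Unlike a shapefile, which splits geometry and attributes across several files, the whole dataset here is a single text document.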

2. Raster Data Formats

Raster data is used to represent continuous surfaces like elevation, temperature, or satellite
imagery.

• TIFF/GeoTIFF (.tif) – Stores raster images with embedded spatial reference information.
• JPEG (.jpg) – Compressed raster image; often used for backgrounds or base maps (less
accurate).
• IMG (.img) – A format used by ERDAS Imagine for storing remote sensing imagery.
• GRID – An Esri raster format, either in binary or ASCII version.

3. Non-Spatial (Tabular) Formats

These store attribute data or metadata, which can be linked to spatial features.

• CSV (.csv) – Comma-separated values; commonly used for attribute tables and coordinate
lists.
• XLS/XLSX (.xls/.xlsx) – Excel spreadsheet formats that may include spatial data or links
to it.
• TXT (.txt) – Plain text files with tabular or coordinate data.
• DBF (.dbf) – A database file used with shapefiles to store attribute data.

4. Web-Based Formats

These formats are optimized for sharing and visualizing data online.

• Web Feature Service (WFS) – Provides vector features over the web.
• Web Map Service (WMS) – Serves map images generated from spatial data.
• Mapbox Vector Tiles (.mvt) – Compressed vector data for fast web rendering.

Topology & Spatial Relationship


Topology and geometry are the two components of spatial data. The geometry can change without
affecting topology, and likewise the topology can change without affecting the geometry. Some
operations require geometry only, others require topology only, and still others require both
geometry and topology.

Importance of Topology
Topology requires additional data files to store the spatial relationships. This naturally raises the
question: What are the advantages of having topology built into a data set?
1. It ensures data quality and integrity.
Topology enables the detection of lines that do not meet and polygons that do not close
properly.
2. Topology can enhance GIS analysis.

3. Topological relationships between spatial features allow GIS users to perform spatial data
query. As examples, we can ask how many schools are contained within a county and which land
parcels are intersected by a fault line.

In Geographic Information Systems (GIS), spatial relationships refer to the ways in which
geographic features interact with, relate to, or are positioned relative to each other in space.
Understanding these relationships is fundamental to spatial analysis, as it allows users to model,
query, and interpret real-world phenomena based on location, proximity, and arrangement.

Why Spatial Relationships in GIS


• They help reveal patterns, trends, and associations that are not obvious in non-spatial
data.
• They support decision-making, such as urban planning, disaster response, and
environmental monitoring.
• They enhance querying, such as asking “Which buildings are within 500 meters of this
river?”
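A proximity query such as the one above reduces to measuring the distance from each feature to the nearest part of the river. The sketch below assumes planar (projected) coordinates in metres, represents the river as a list of vertices, and uses made-up building locations; all names are illustrative.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar coordinates)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment: a and b coincide
    # Position of the perpendicular foot along the segment, clamped to its ends
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, line):
    """Shortest distance from p to a polyline given as a list of vertices."""
    return min(point_segment_distance(p, line[i], line[i + 1]) for i in range(len(line) - 1))


def within_distance(features, line, limit):
    """Names of the point features lying within `limit` of the line."""
    return [name for name, p in features.items() if distance_to_line(p, line) <= limit]


river = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0)]
buildings = {"A": (500.0, 300.0), "B": (500.0, 600.0), "C": (1400.0, 500.0), "D": (1600.0, 1200.0)}
near_river = within_distance(buildings, river, 500.0)
print(near_river)
```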

Types of spatial relationship


The spatial relationships that can exist between objects, derived from the topological invariants of
the intersections of their boundaries and interiors, are as follows.

Shapefile as a Non-Topological Data Format
A shapefile is one of the most widely used vector data formats in GIS, originally developed by
ESRI. It is commonly referred to as a non-topological data format. This means that it does not
explicitly store or enforce topological relationships between spatial features such as connectivity,
adjacency, or containment.
Although the shapefile treats a point as a pair of x-, y-coordinates, a line as a series of points, and
a polygon as a series of line segments, no files describe the spatial relationships among these
geometric objects. Shapefile polygons actually have duplicate arcs for the shared boundaries and
can overlap one another. The geometry of a shapefile is stored in two basic files:
the .shp file stores the feature geometry, and the .shx file maintains the positional index of the
feature geometry. Non-topological data such as shapefiles have two main advantages.
First, they can display more rapidly on the computer monitor than topology-based data (Theobald
2001). This advantage is particularly important for people who use, rather than produce, GIS data.
Non-topological formats store each feature (point, line, or polygon) as an independent entity
without maintaining complex relationships (like adjacency, connectivity, or containment)
between features. This simplifies the data structure and reduces processing time when displaying
or refreshing maps. Since the software does not need to compute or check topological rules (e.g.,
shared boundaries or network connections), the map can be drawn more quickly.
In contrast, topological models (used in some advanced GIS formats like coverage data in older
ArcInfo systems) store and enforce relationships between spatial features. While useful for certain
analyses (like network routing or ensuring polygons don’t overlap), this additional structure can
slow down display because the software must manage and verify these relationships before
rendering.
Second, they are non-proprietary and interoperable, meaning that they can be used across different
software packages (e.g., MapInfo can use shapefiles, and ArcGIS can use MapInfo Interchange
Format files). Formats such as the Esri Shapefile (.shp) and the MapInfo Interchange Format
(.mif/.mid) have publicly documented structures and are widely supported, so they are not
restricted to use in only one company's software. As a result, they are interoperable, which
means:
• A shapefile created in ArcGIS can be opened and used in QGIS, MapInfo, Global
Mapper, and other GIS tools.
• Similarly, a MapInfo Interchange Format file can be imported into ArcGIS.
This interoperability makes data sharing between organizations and systems easier and
reduces dependence on a single vendor or platform. It supports collaboration, open
data initiatives, and long-term data accessibility.

Data Analysis Toolbox


GIS (Geographic Information Systems) integrates spatial data (locations) with attribute data
(descriptions) to enable spatial analysis—the process of examining spatial relationships, patterns,
and trends. The data analysis toolbox in GIS provides a collection of geoprocessing tools that
manipulate, extract, and interpret spatial data layers. These tools are based on key spatial concepts
and operations drawn from geography, cartography, and computer science.
Vector Data Analysis
Buffering
Based on the concept of proximity, buffering creates two areas: one area that is within a specified
distance of select features and the other area that is beyond. The area within the specified distance
is the buffer zone. Features for buffering may be points, lines, or polygons. Buffering around
points creates circular buffer zones. Buffering around lines creates a series of elongated buffer
zones around each line segment. And buffering around polygons creates buffer zones that extend
outward from the polygon boundaries.
Buffering involves measuring distance outward in all directions from an object. Buffering can be
done on all three types of vector data: point, line, and area. The resulting buffer is a polygon file. A
buffer is a reclassification based on distance: each location is classified as within or beyond a
given proximity.
Buffering must cater for:
• point, line and polygon-based features
• different buffer shapes
• variable size buffers
• interior/exterior buffers (for polygons)
The result of all buffer operations is a polygon coverage which must have appropriate attributes
assigned. Each polygon must have an attribute that identifies whether or not it is a polygon inside
the buffer or outside the buffer.
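The view of a buffer as a reclassification based on distance can be sketched on a small grid. In the example below each cell receives the attribute 1 if it lies within the buffer distance of the nearest source feature and 0 otherwise. The function name, the grid dimensions and the source location are assumptions chosen for illustration.

```python
import math


def buffer_grid(sources, n_rows, n_cols, cell_size, distance):
    """Classify every cell as 1 (inside the buffer) or 0 (outside).

    sources   -- list of (row, col) cells occupied by the features to buffer
    cell_size -- ground size of one cell, in the same units as distance
    """
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Straight-line distance from this cell centre to the nearest source cell
            nearest = min(math.hypot(r - sr, c - sc) for sr, sc in sources) * cell_size
            row.append(1 if nearest <= distance else 0)
        grid.append(row)
    return grid


# One point feature in the middle of a 5 x 5 grid of 10 m cells, buffered by 10 m
buffered = buffer_grid([(2, 2)], 5, 5, 10.0, 10.0)
for row in buffered:
    print(row)
```

Passing several source cells, for example the cells along a road, gives the elongated buffer described above for line features.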

Restricted Buffering

During the buffering process, the growth of buffers can be restricted by:

• Barriers
o prevent or restrict movement through the barrier
o two types exist:
▪ absolute barrier - prevents movement entirely (e.g., cliff, lake, fence,
forest, etc.)
▪ relative barrier - restricts movement at particular locations or times
(e.g., narrow bridge, dried-up salt lake in summer, shallow streams,
etc.)

• Friction surfaces
o movement is restricted across a surface representing "cost of movement"
o a cost is incurred for movement, in effect slowing or restricting movement
while not preventing it entirely (e.g., up-hill or down-hill slopes, swamps,
sandy soils, etc. may all contribute to reducing or increasing the buffer size)
o a layer of impedance values (providing cost of movement) can be used to
represent the friction surface.
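Restricted buffering over a friction surface can be sketched as a cost-spreading operation: the buffer grows outward from a start cell until the accumulated cost of movement exceeds a budget. The example below is one possible formulation, not a description of any particular GIS package. It assumes four-directional movement, charges each step the average impedance of the two cells involved, and marks absolute barriers with None.

```python
import heapq


def cost_spread(friction, start, budget):
    """Return {cell: accumulated cost} for all cells reachable within the budget.

    friction -- grid of impedance values; None marks an absolute barrier
    start    -- (row, col) of the feature being buffered (must not be a barrier)
    """
    rows, cols = len(friction), len(friction[0])
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, (r, c) = heapq.heappop(heap)
        if cost > best.get((r, c), float("inf")):
            continue  # a cheaper route to this cell was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if friction[nr][nc] is None:
                continue  # absolute barrier: no movement into this cell
            step = (friction[r][c] + friction[nr][nc]) / 2.0
            new_cost = cost + step
            if new_cost <= budget and new_cost < best.get((nr, nc), float("inf")):
                best[(nr, nc)] = new_cost
                heapq.heappush(heap, (new_cost, (nr, nc)))
    return best


friction_surface = [
    [1, 1, 1],
    [1, None, 3],   # a barrier in the centre, difficult terrain to its right
    [1, 1, 1],
]
reached = cost_spread(friction_surface, (0, 0), 2.0)
print(sorted(reached.items()))
```

The cell with impedance 3 is not reached within the budget even though it is only three steps away, which shows how friction shrinks the buffer without blocking it outright.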

Spatial overlay
An overlay operation combines the geometries and attributes of two feature layers to create the output.
The geometry of the output represents the geometric intersection of features from the input layers. The
spatial overlay function involves combining information (cells) from two or more layers to form a new
layer. Of course, the layers (and cells) must be geographically aligned (georeferenced) with each other in
order for the overlay to take place. Two types of overlay operations exist:

• Boolean overlays - combining cells using Boolean (also referred to as binary or logical) operators
(i.e., mathematical set operators):
o AND - logical intersection
o OR - logical union
o NOT - logical negation

• Weighted overlays - combining cells using algebraic operators such as:
o arithmetic: addition "+", subtraction "-", multiplication "*", division "/", exponentiation
"^" (or "**")
o statistical: mean, maximum, minimum, variance, standard deviation
o merge:
▪ cover - one layer "covers" another, except where zeros occur.
▪ cross - assign a new value to each combination of values from the two layers
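The Boolean operators can be sketched on two small binary grids. The layer names and values below are invented for illustration; a cell holds 1 where a condition is met and 0 where it is not.

```python
def overlay(a, b, op):
    """Apply a two-argument Boolean operator cell by cell to two aligned grids."""
    return [[int(op(bool(x), bool(y))) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def boolean_and(a, b):
    return overlay(a, b, lambda x, y: x and y)   # logical intersection


def boolean_or(a, b):
    return overlay(a, b, lambda x, y: x or y)    # logical union


def boolean_not(a):
    return [[int(not x) for x in row] for row in a]  # logical negation


suitable_soil = [[1, 1, 0],
                 [0, 1, 1]]
gentle_slope = [[1, 0, 0],
                [1, 1, 0]]

print(boolean_and(suitable_soil, gentle_slope))  # cells meeting both conditions
print(boolean_or(suitable_soil, gentle_slope))   # cells meeting at least one condition
print(boolean_not(suitable_soil))                # cells with unsuitable soil
```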

Weighted Overlay
The weighted overlay operation allows values other than binary to be included as cell values. The cell
values are then combined using arithmetic, statistical or merge operators (as already indicated). For
example, consider the problem: Can we predict our crop yield based on fertiliser rates and last year's crop
yields?
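A weighted overlay for the crop-yield question might look like the sketch below, which combines a fertiliser-rate layer and a previous-yield layer by weighted addition. The cell values and the two weights are purely hypothetical; in practice the weights would come from agronomic knowledge or a fitted model.

```python
def weighted_overlay(layers, weights):
    """Combine aligned grids cell by cell as a weighted sum."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[sum(w * layer[r][c] for layer, w in zip(layers, weights))
             for c in range(cols)]
            for r in range(rows)]


fertiliser_rate = [[10, 20],
                   [30, 40]]
last_year_yield = [[2, 4],
                   [6, 8]]

# Hypothetical weights: 0.5 per unit of fertiliser, 2.0 per unit of last year's yield
predicted_yield = weighted_overlay([fertiliser_rate, last_year_yield], [0.5, 2.0])
print(predicted_yield)
```

Replacing the weighted sum with a mean, maximum or minimum gives the statistical variants listed above.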

Reclassify and measure

The following functionality can be implemented in a grid GIS for reclassifying and measuring
data:

Reclassify - assign new values to:

• individual cells
• regions
• categories of old values at equal intervals (slicing)
• groups of contiguous cells (clumping)

Measurement

• distance
o between two cells
o along a set of cells with specified values (i.e., network of roads)
• size
o area of region
o perimeter
o volume
• shape of region
• direction
• spatial arrangement (pattern)

Grid GIS functionality

Operations on one layer:

• display all/individual cells
• reclassify cells
• neighbourhood operations
o local
▪ statistical (mean, max, etc.)
▪ calculate slope and aspect
▪ filtering data
o extended
▪ create buffer zones
▪ obtain drainage paths
▪ interpolation
• measurement
o distance, size, shape
• transformations/projections

Operations on multiple layers:

• overlay
o boolean
o weighted
• spreading (buffering) through:
o barriers
o friction (impedance) surfaces
• viewshed analysis (intervisibility)

Display

The following functionality can be implemented in a grid GIS for displaying data:

View grid layer

• colored/shaded grid
• zoom in, zoom out
• pan (move point of view)

Query/browse

• regions
• individual cells

3-D view

• perspective or orthogonal view
• panoramic view
• vertical profile
• drape data on surface

Contour

• from point heights
• from elevation surface

Map layouts

• legend
• north arrow and scale bar
• title

Histograms, tables, graphs, charts, etc.


Pattern Analysis

Pattern analysis is the study of the spatial arrangements of point or polygon features in two-
dimensional space. Pattern analysis uses distance measurements as inputs and statistics (spatial
statistics) for describing the distribution pattern. At the general (global) level, a pattern analysis
can reveal if a point distribution pattern is random, dispersed, or clustered. A random pattern is a
pattern in which the presence of a point at a location does not encourage or inhibit the occurrence
of neighbouring points. This spatial randomness separates a random pattern from a dispersed or
clustered pattern. At the local level, a pattern analysis can detect if a distribution pattern contains
local clusters of high or low values.
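One common global statistic for point patterns is the nearest neighbour index, which compares the observed mean distance between each point and its nearest neighbour with the mean distance expected under a random pattern of the same density. These notes do not prescribe a particular statistic, so the sketch below is only one possible choice. It applies no edge correction, and the sample points are invented.

```python
import math


def nearest_neighbour_index(points, area):
    """Ratio of observed to expected mean nearest-neighbour distance.

    Values near 1 suggest a random pattern, below 1 a clustered pattern,
    and above 1 a dispersed pattern.
    """
    n = len(points)
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        total += min(math.hypot(x1 - x2, y1 - y2)
                     for j, (x2, y2) in enumerate(points) if j != i)
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)  # expectation for a random pattern
    return observed / expected


study_area = 100.0  # a 10 x 10 square
clustered = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
dispersed = [(2.5, 2.5), (2.5, 7.5), (7.5, 2.5), (7.5, 7.5)]

r_clustered = nearest_neighbour_index(clustered, study_area)
r_dispersed = nearest_neighbour_index(dispersed, study_area)
print(r_clustered, r_dispersed)
```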

Feature Manipulation
Feature manipulation in vector analysis refers to the range of operations used to alter, combine,
or extract spatial features represented as points, lines, and polygons in a vector dataset. This
process is fundamental in the management and analysis of geospatial data within Geographic
Information Systems (GIS). It supports effective decision-making in areas such as urban planning,
infrastructure development, environmental management, and resource allocation. The operations
described below are routinely performed using GIS software platforms such as ArcGIS and QGIS,
as well as spatial database extensions like PostGIS and other tools including GRASS GIS.
Vector data typically represents real-world phenomena in three forms: points for discrete locations
such as GPS stations or boreholes, lines for linear features like roads and rivers, and polygons for
area features such as land parcels and lakes. Through feature manipulation, these data types can
be modified or analyzed to reveal spatial patterns, support decisions, or integrate with other
datasets.
One basic form of feature manipulation is selection, which involves extracting features based on
specific criteria from the attribute table or spatial relationship. For example, selecting all road
features within a defined administrative region enables focused analysis. Another common
operation is clipping, which restricts the extent of features to a given boundary, such as limiting
land cover data to a study area. Union operations allow for the combination of two polygon layers,
preserving all input features and their attributes, often used in integrating thematic datasets like
land use and administrative zones.

Intersect operations identify and preserve only the overlapping areas between input layers,
making them useful for isolating regions that satisfy multiple spatial conditions, such as areas with
both high slope and forest cover. The dissolve function merges adjacent features that share a
common attribute, for instance, consolidating land parcels owned by the same individual into a
single polygon. Merge operations combine multiple layers of the same feature type into one
dataset, which is often done to integrate roads or rivers from different data sources.
Buffering creates zones of specified distances around point, line, or polygon features. These buffer
zones are essential in applications like environmental impact assessments or proximity analysis.
The erase or difference operation is used to remove parts of a feature that overlap with another
dataset, such as excluding urban regions from a vegetation map.

Network Analysis
Network analysis requires a network that is vector-based and topologically connected. Perhaps
the most common network analysis is shortest path analysis, which is used, for example, in in-
vehicle navigation systems to help drivers find the shortest route between an origin and a
destination. Network analysis also includes the traveling salesman problem, vehicle routing
problem, closest facility, allocation, and location-allocation.
A network is a system of linear features that has the appropriate attributes for the flow of objects.
For example, a road system is a familiar network. Other networks include railways, public transit
lines, bicycle paths, and streams. A network is typically topology-based: lines meet at
intersections, lines cannot have gaps, and lines have directions.
A network with the appropriate attributes can be used for a variety of applications. Some
applications are directly accessible through GIS tools. Others require the integration of GIS and
specialized software in operations research and management science. Shortest path analysis finds
the path with the minimum cumulative impedance between nodes on a network. Because the link
impedance can be measured in distance or time, a shortest path may represent the shortest route
or fastest route. Shortest path analysis typically begins with an impedance matrix in which a value
represents the impedance of a direct link between two nodes on a network and an ∞ (infinity)
means no direct connection. A link, also called an edge or arc, is a road segment defined by two
end points. Links are the basic geometric features of a network. Link impedance is the cost
of traversing a link, which may be measured by the physical length or the travel time.
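The impedance matrix described above can be used directly in a shortest path search. The sketch below applies Dijkstra's algorithm, a standard choice for this task, although the notes do not name a specific algorithm. The four-node network and its travel times in minutes are made up for illustration.

```python
import heapq

INF = float("inf")  # no direct connection between two nodes


def shortest_path(impedance, origin, destination):
    """Return (total impedance, list of nodes) for the least-cost path."""
    n = len(impedance)
    cost = [INF] * n
    previous = [None] * n
    cost[origin] = 0.0
    heap = [(0.0, origin)]
    while heap:
        c, node = heapq.heappop(heap)
        if c > cost[node]:
            continue  # outdated queue entry
        if node == destination:
            break
        for nxt in range(n):
            w = impedance[node][nxt]
            if nxt == node or w == INF:
                continue
            if c + w < cost[nxt]:
                cost[nxt] = c + w
                previous[nxt] = node
                heapq.heappush(heap, (cost[nxt], nxt))
    if cost[destination] == INF:
        return INF, []
    # Walk back from the destination to recover the sequence of nodes
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    path.reverse()
    return cost[destination], path


# Link impedance in minutes between four nodes (0 to 3)
travel_time = [
    [0, 5, 9, INF],
    [5, 0, 2, 6],
    [9, 2, 0, 3],
    [INF, 6, 3, 0],
]
total, route = shortest_path(travel_time, 0, 3)
print(total, route)
```

Filling the matrix with link lengths instead of travel times would return the shortest route rather than the fastest one.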

Converting between Data Models
In Geographic Information Systems (GIS), spatial data can be represented using different models,
primarily the vector and raster models. Converting data from one model to another is often
necessary for analysis or visualization, especially when integrating data from diverse sources.
Conversion can go in either of two directions:

A. Rasterization (Vector to Raster Conversion)

Rasterization is the process of converting vector data (points, lines, and polygons) into
raster format (a grid of cells or pixels). In this process, vector features are overlaid on a
grid, and each cell of the raster is assigned a value based on whether it intersects with the
vector feature and what attribute is selected for conversion. For example, a polygon map
of land use can be rasterized so that each pixel carries a code representing a land use type
such as agriculture, forest, or urban. This method is especially useful when preparing data
for analysis techniques that require raster input, such as terrain modelling, hydrological
modelling, or image classification.
Rasterization requires careful consideration of cell size (resolution), as it affects the
precision and detail of the output raster. A small cell size gives a more detailed raster but
increases file size and processing time, while a larger cell size may result in a loss of detail
or misrepresentation of narrow or small features.
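The effect of cell size can be seen in a minimal rasterization sketch. The example below assigns a cell the polygon's code when the cell centre falls inside the polygon, which is one common assignment rule among several. The polygon, the land use code and the grid extent are assumptions made for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a list of vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # this edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, value, n_rows, n_cols, cell_size, nodata=0):
    """Build a grid whose cells take `value` when their centre lies inside the polygon."""
    grid = []
    for r in range(n_rows):
        y = (n_rows - r - 0.5) * cell_size  # row 0 is the top of the grid
        row = []
        for c in range(n_cols):
            x = (c + 0.5) * cell_size
            row.append(value if point_in_polygon(x, y, polygon) else nodata)
        grid.append(row)
    return grid


# A square land use parcel (code 1) of side 2.4 inside a 4 x 4 unit extent
parcel = [(0.8, 0.8), (3.2, 0.8), (3.2, 3.2), (0.8, 3.2)]

fine = rasterize(parcel, 1, 4, 4, 1.0)    # 16 cells of size 1
coarse = rasterize(parcel, 1, 2, 2, 2.0)  # 4 cells of size 2
print(fine)
print(coarse)
```

The parcel covers 5.76 square units. The fine grid represents it with 4 square units and the coarse grid with 16, illustrating how a larger cell size can misrepresent a feature.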

B. Vectorization (Raster to Vector Conversion)


Vectorization is the process of converting raster data into vector format by identifying and
tracing boundaries or features within the raster. This method involves detecting and
delineating shapes or regions from raster cells that share similar values, and converting
these into vector geometries—points, lines, or polygons. For instance, a classified satellite
image (raster) showing different land cover types can be vectorized to produce polygon
features for each land cover class. Vectorization is commonly used when raster data needs
to be used in applications that require precise geometries or when attributes and topology
need to be associated with the features.
There are two approaches to vectorization: manual (digitizing over a raster) and automatic
(using algorithms to trace edges or regions). Automatic vectorization works best when
raster data has clear, consistent boundaries between features. However, it can introduce
noise or errors, particularly when the raster contains low-resolution or ambiguous pixel
patterns.

Data Accuracy
Accuracy is the inverse of error. Many people equate accuracy with quality, but in fact accuracy
is just one component of quality. The definition of accuracy is based on the entity-attribute-value
model.
i. Entity = real-world phenomenon
ii. Attribute = relevant property
iii. Value = quantitative/qualitative measurement

An error is a discrepancy between the encoded and actual value of a particular attribute for a given
entity. “Actual value” implies the existence of an objective, observable reality. However, reality
may be:
• Unobservable (e.g., historical data)
• Impractical to observe (e.g., too costly)
• Perceived rather than real (e.g., subjective entities such as “neighbourhoods")

In fact, it is not necessary to posit an objective reality in order to assess accuracy, since all
geographical data are collected with the aid of a model that specifies -- implicitly or explicitly--
the required level of abstraction and generalization.
• This is the database “specification” and is closely related to the “terrain nominal” concept
of perceived reality (Salgé, 1995).
• The specification serves as the standard against which accuracy is assessed. Thus the
“actual” value is the value we would expect based on the specification (Brassel et al., 1995).
• Accuracy is always a relative measure, since it is always measured relative to the
specification.
• To judge fitness-for-use, one must judge the data relative to the specification, and also
consider the limitations of the specification itself (CEN, 1995).

2.1. Spatial Accuracy


• Spatial accuracy is the accuracy of the spatial component of the database. The
metrics used depend on the dimensionality of the entities under consideration.
• For points, accuracy is defined in terms of the distance between the encoded location
and “actual” location.
• Error can be defined in various dimensions: x, y, z, horizontal, vertical, total.
• Metrics of error are extensions of classical statistical measures (mean error, RMSE
or root mean squared error, inference tests, confidence limits, etc.) (American
Society of Civil Engineers 1983; American Society of Photogrammetry 1985;
Goodchild 1991a).

• For lines and areas, the situation is more complex. This is because error is a mixture
of positional error (error in locating well-defined points along the line) and
generalization error (error in the points selected to represent the line) (Goodchild
1991b).
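For points, the mean error and the RMSE can be computed as in the sketch below. The encoded and "actual" coordinates are invented check points, and the function names are illustrative.

```python
import math


def mean_error(encoded, actual):
    """Mean signed error in x and y; positive and negative errors can cancel out."""
    n = len(encoded)
    mean_x = sum(e[0] - a[0] for e, a in zip(encoded, actual)) / n
    mean_y = sum(e[1] - a[1] for e, a in zip(encoded, actual)) / n
    return mean_x, mean_y


def horizontal_rmse(encoded, actual):
    """Root mean squared horizontal distance between encoded and actual locations."""
    squared = [(e[0] - a[0]) ** 2 + (e[1] - a[1]) ** 2 for e, a in zip(encoded, actual)]
    return math.sqrt(sum(squared) / len(squared))


encoded_points = [(103.0, 204.0), (300.0, 400.0)]
actual_points = [(100.0, 200.0), (303.0, 396.0)]

print(mean_error(encoded_points, actual_points))
print(horizontal_rmse(encoded_points, actual_points))
```

In this sample the mean error in x is zero because the two errors cancel, while the RMSE still reports a 5-unit positional error, which is why the RMSE is the more usual summary of positional accuracy.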

2.2. Temporal accuracy


• Temporal accuracy is the agreement between the encoded and “actual” temporal
coordinates for an entity.
• Temporal coordinates are often only implicit in geographical data, e.g., a time stamp
indicating that the entity was valid at some time. Often this is applied to the entire
database (e.g., a map dated “1995”).
• More realistically, temporal coordinates are the temporal limits within which the
entity is valid (e.g., Pothole Q54D-35-021 existed between 2/12/96 and 8/9/96).
• Temporal accuracy is not the same as “database time”, which is the time the
information was entered into the database.
• Temporal accuracy is not the same as “currentness” (or up-to-dateness), which is
actually an assessment of how well the database specification meets the needs of a
particular application. A database can be temporally accurate but still out of date;
historical applications depend on such data.
2.3. Thematic Accuracy
• Thematic accuracy is the accuracy of the attribute values encoded in a database.
• The metrics used here depend on the measurement scale of the data:
A. Quantitative data (e.g., precipitation) can be treated like a z-coordinate (elevation)
and assessed using metrics normally used for vertical error (such as the RMSE).
B. Qualitative data (e.g., land use/land cover) is normally assessed using a cross-tabulation
of encoded and “actual” classes at a sample of locations. This produces a
classification error matrix (confusion matrix).
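The cross-tabulation can be sketched as follows. The class labels at the eight sample locations are invented, rows of the matrix hold the encoded class and columns the "actual" class, and overall accuracy is shown as one simple summary measure.

```python
from collections import Counter


def confusion_matrix(encoded, actual):
    """Cross-tabulate encoded classes (rows) against actual classes (columns)."""
    classes = sorted(set(encoded) | set(actual))
    counts = Counter(zip(encoded, actual))
    matrix = [[counts[(e, a)] for a in classes] for e in classes]
    return classes, matrix


def overall_accuracy(matrix):
    """Proportion of sample locations where encoded and actual classes agree."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


encoded_classes = ["forest", "forest", "urban", "crop", "crop", "urban", "forest", "crop"]
actual_classes = ["forest", "crop", "urban", "crop", "crop", "urban", "forest", "forest"]

labels, matrix = confusion_matrix(encoded_classes, actual_classes)
print(labels)
for row in matrix:
    print(row)
print(overall_accuracy(matrix))
```

Off-diagonal cells show which classes are confused with each other; here the only confusion is between crop and forest.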

Database Structure (Model)


The term database system refers to an organization of components that define and regulate the
collection, storage, management, and use of data within a database environment. From a general
management point of view, the database system is composed of five major parts: hardware,
software, users, procedures, and data.
The basic role of a DBMS
Generally speaking, a database management system (DBMS) facilitates the processes of:
i. Defining a database; that is, specifying the data types, structures, and constraints to be
taken into account.
ii. Constructing the database; that is, storing the data itself into persistent storage.
iii. Manipulating the database.
iv. Querying the database to retrieve specific data.
v. Updating the database (changing values).
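
These roles can be illustrated with SQLite through Python's standard sqlite3 module. This is only a sketch under stated assumptions: the parcel table, its columns, and its values are hypothetical, and SQLite stands in for whichever DBMS is actually in use. Querying and updating are shown as the two typical forms of manipulation.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Define: specify data types, structure, and constraints.
conn.execute("""
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        land_use  TEXT NOT NULL,
        area_ha   REAL NOT NULL CHECK (area_ha > 0)
    )
""")

# Construct: store the data itself.
conn.executemany(
    "INSERT INTO parcel (parcel_id, land_use, area_ha) VALUES (?, ?, ?)",
    [(1, "residential", 0.5), (2, "farmland", 3.2), (3, "farmland", 1.8)],
)

# Query: retrieve specific data.
farmland = conn.execute(
    "SELECT parcel_id, area_ha FROM parcel "
    "WHERE land_use = ? ORDER BY parcel_id",
    ("farmland",),
).fetchall()

# Update: change stored values.
conn.execute("UPDATE parcel SET area_ha = ? WHERE parcel_id = ?", (2.0, 3))
updated_area = conn.execute(
    "SELECT area_ha FROM parcel WHERE parcel_id = 3"
).fetchone()[0]

# The constraint from the definition step is enforced by the DBMS itself.
try:
    conn.execute("INSERT INTO parcel VALUES (4, 'road', -1.0)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

conn.commit()
```

The rejected insert shows why constraints belong in the definition step: the DBMS, not each application program, guards the validity of the stored data.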

Efficient data management typically requires the use of a computer database. The collection of
data, usually referred to as the database, contains information relevant to an enterprise. It
is also a shared, integrated computer structure that stores a collection of:
i) End-user data, that is, raw facts of interest to the end user.
ii) Metadata, or data about data, through which the end-user data are integrated and
managed.
Underlying the structure of a database is the data model: a collection of conceptual tools for
describing data, data relationships, data semantics, and consistency constraints. A data model
provides a way to describe the design of a database at the physical, logical, and view levels. The
quest for better data management has led to several models that attempt to overcome the
disadvantages of the file system. These models represent different schools of thought
as to what a database is, what its purpose is, the types of structures that it should employ, and the
technology that should be used to implement these structures.
We will briefly discuss the main types of data models, and will later give a fuller explanation of
the relational and ER data models.

Hierarchical model.
This model was first implemented by IBM in 1966, in a system designed for the Apollo
program. The hierarchical structure contains levels, or segments. A segment is the equivalent of a
file system’s record type. Within the hierarchy, a higher layer is perceived as the parent of the
segment directly beneath it, which is called the child. The hierarchical model depicts a set of one-
to-many (1:M) relationships between a parent and its child segments. It can also be easily
stored on tape media. Its drawbacks are that it cannot directly depict many-to-many (N:M)
relationships and that it has no data independence.
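
The parent and child structure can be sketched as a simple tree in Python. This is an illustration only: the Segment class and the State, LGA, Ward, and Road names are hypothetical and do not reproduce any particular hierarchical DBMS.

```python
class Segment:
    """A node in a hierarchy: exactly one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Names from the root down to this segment."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))


state = Segment("State")
lga = Segment("LGA", parent=state)
ward = Segment("Ward", parent=lga)
road = Segment("Road", parent=lga)

ward_path = ward.path()
lga_children = [child.name for child in lga.children]
```

Because each segment stores a single parent, a road that crosses two LGAs could only be represented by duplicating it under both, which is the N:M limitation described above.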

The Network Model


The network model was standardized by Charles W. Bachman in 1969 for the CODASYL
consortium. It was created to represent complex data relationships more effectively than the
hierarchical model, to improve database performance, and to impose a database standard. In the
network model, the user perceives the network database as a collection of records in 1:M
relationships. Unlike the hierarchical model, however, the network model allows a record to have
more than one parent. While the network database model is generally not used today, the
definitions of standard database concepts that emerged with the network model are still used by
modern data models.
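
The idea of a record with more than one parent can be sketched in Python. This is a simplified illustration, not CODASYL syntax: each named set links one owner record to many member records, and a record may take part in several sets. The Record class, the connect function, and the set names CONTAINS and OWNS are all hypothetical.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.parents = {}  # set name -> owner record (at most one per set)
        self.members = {}  # set name -> list of member records


def connect(set_name, owner, member):
    """Add member to the owner's occurrence of the named set (a 1:M link)."""
    if set_name in member.parents:
        raise ValueError(
            f"{member.name} already has an owner in set {set_name}"
        )
    member.parents[set_name] = owner
    owner.members.setdefault(set_name, []).append(member)


district = Record("District A")
landowner = Record("Owner X")
parcel_1 = Record("Parcel 1")
parcel_2 = Record("Parcel 2")

connect("CONTAINS", district, parcel_1)
connect("CONTAINS", district, parcel_2)
connect("OWNS", landowner, parcel_1)

# Parcel 1 now has two parents, one through each set.
parcel_1_parents = sorted(p.name for p in parcel_1.parents.values())
district_parcels = [m.name for m in district.members["CONTAINS"]]
```

Each set on its own is still one-to-many, but because Parcel 1 belongs to two sets it can be reached from either the district or the landowner, which a strict hierarchy cannot express without duplication.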

Object-oriented Data Model (OODM)
In the object-oriented data model (OODM), both data and their relationships are contained in a
single structure known as an object. An OODM reflects a very different way to define and use
entities. Like the relational model’s entity, an object is described by its factual content. But quite
unlike an entity, an object includes information about relationships between the facts within the
object, as well as information about its relationships with other objects. Therefore, the facts within
the object are given greater meaning. The OODM is said to be a semantic data model because
semantic indicates meaning.
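
A rough Python analogy may help. In the sketch below a single object holds its facts, a relationship between those facts (area derived from width and length), and its relationships with other objects (an owner and neighbouring parcels). The class names, attributes, and values are hypothetical, and Python classes are used only as an analogy for an OODM object.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity, as distinct objects
class Owner:
    name: str


@dataclass(eq=False)
class Parcel:
    parcel_id: str
    width_m: float
    length_m: float
    owner: Owner                                     # relationship to another object
    neighbours: list = field(default_factory=list)   # relationships to other parcels

    def area_m2(self):
        """A relationship between facts held inside the object itself."""
        return self.width_m * self.length_m

    def add_neighbour(self, other):
        """Record adjacency on both objects."""
        if other not in self.neighbours:
            self.neighbours.append(other)
            other.neighbours.append(self)


owner = Owner("Owner X")
p1 = Parcel("P1", width_m=20.0, length_m=30.0, owner=owner)
p2 = Parcel("P2", width_m=15.0, length_m=30.0, owner=owner)
p1.add_neighbour(p2)

p1_area = p1.area_m2()
p2_neighbour_ids = [n.parcel_id for n in p2.neighbours]
```

Nothing outside the Parcel object is needed to know how its area follows from its dimensions or which parcels adjoin it, which is the sense in which the facts are given greater meaning.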

Relational Model
This model was published by Edgar F. “Ted” Codd in 1970, after several years of work at IBM.
The relational model uses a collection of tables to represent both data and the relationships among
those data. Each table has multiple columns, and each column has a unique name. The relational
model is an example of a record-based model. Record-based models are so named because the
database is structured in fixed-format records of several types. Each table contains records of a
particular type. Each record type defines a fixed number of fields, or attributes. The columns of
the table correspond to the attributes of the record type. The relational data model is the most
widely used data model, and a vast majority of current database systems are based on the relational
model.
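
The way tables represent both data and relationships can be sketched with SQLite through Python's standard sqlite3 module. The owner and parcel tables, their columns, and their values are hypothetical and chosen only for illustration. The relationship between a parcel and its owner is carried by a shared column value (owner_id) rather than by a stored pointer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when enabled

conn.executescript("""
    CREATE TABLE owner (
        owner_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        land_use  TEXT NOT NULL,
        owner_id  INTEGER NOT NULL REFERENCES owner(owner_id)
    );
    INSERT INTO owner VALUES (1, 'Ade'), (2, 'Bola');
    INSERT INTO parcel VALUES
        (10, 'farmland', 1),
        (11, 'residential', 1),
        (12, 'farmland', 2);
""")

# A join follows the relationship by matching column values.
rows = conn.execute("""
    SELECT p.parcel_id, p.land_use, o.name
    FROM parcel AS p
    JOIN owner AS o ON o.owner_id = p.owner_id
    ORDER BY p.parcel_id
""").fetchall()

# Each record of a given type has the same fixed set of fields.
parcel_columns = [row[1] for row in conn.execute("PRAGMA table_info(parcel)")]
```

Every row of the parcel table has the same three fields, which is what makes the relational model a record-based model.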

The Entity-Relationship Model
The entity-relationship (E-R) data model is based on a
perception of a real world that consists of a collection of basic objects, called entities, and of
relationships among these objects. An entity is a "thing" or "object" in the real world that is
distinguishable from other objects. The entity-relationship model is widely used in database
design.
Note: The ER model, together with the relational model, is the dominant database modelling and
design tool because of its exceptional visual simplicity. Nevertheless, the search for better data
modelling tools continues as the data environment continues to evolve. In this course, more
emphasis will later be placed on the relational and ER models, although many other models exist,
including recent developments that combine two or more models.

Hybrid DBMS
Hybrid DBMSs are an emerging trend: they retain the advantages of the relational model and at the
same time provide programmers with an object-oriented view of the underlying data. These
databases preserve the performance characteristics of the relational model and the semantically
rich programmatic support of the object-oriented model. Examples of the hybrid model include
ARC/INFO and the ESRI Shapefile.
