
Etc PDF

Uploaded by

Pratyush Chauhan

Lecture Name :- Spatial Analysis

Prof. S. K. Ghosh
Dept of Civil Engg
Analysis in GIS

• Before starting any analysis, one needs to assess the problem and establish an objective.
• It is important to think through the process before making any judgments about the data or reaching any decisions.
• Ask questions about the data and the model.
• Generate a step-by-step procedure to monitor the development and outline the overall objective.
BASIC PROCEDURE OF ANALYSIS

The following steps outline the basic procedure for geographical analysis:

• Establish the objectives and criteria for the analysis, i.e. define the problem and then identify a sequence of operations to produce meaningful results.
• Prepare all map coverages for the proposed data analysis, and add one or more attributes to coverages in the database if necessary.
• Perform the spatial operations and combine the coverages, e.g.
– creating buffer zones around features,
– manipulating spatial features, and
– performing polygon overlay.
• Prepare the derived data for tabular analysis, and make sure that the feature attribute table contains all the items needed to hold the new values to be created.
• Perform the tabular analysis, i.e. calculate and query the relational database using the model defined in Step 1.
• Evaluate and interpret the results and determine whether the answers are valid.
– Simple map displays and reports can help in this evaluation.
• Refine the analysis if needed and repeat it.
SPATIAL DATA ANALYSIS
Data analysis functions in most GIS packages include:

• measurement techniques,
• queries on attributes,
• proximity analysis,
• overlay operations, and
• analysis of models of surfaces and networks.
DATA ANALYSIS TERMINOLOGY

• Entity: An individual point, line, or area in a GIS database is called an entity.
• Attribute: Attributes are characteristics of an entity. In a vector GIS, attributes such as the name of a road or the average rainfall in an area are stored in a database. In a raster GIS, an attribute is the value of a cell in the raster grid in the form of a numerical code, such as ‘1’ for metalled road and ‘2’ for unmetalled road.
• Feature: A feature is an object in the real world to be encoded in a GIS database.
• Data layer: A data layer is a data set for the area of interest in a GIS, normally containing data of only one entity type (i.e., points, lines, or areas).
• Image: A data layer in a raster GIS is called an image.
• Cell: An individual pixel in a raster is a cell.
• Function: A function or operation is a data analysis procedure performed by a GIS.
• Algorithm: A sequence of actions to be performed by a computer to solve a problem is known as an algorithm.
MEASUREMENT OF LENGTH,
PERIMETER AND AREA

• Measurement of length, perimeter, and area is a common task in GIS.
• All measurements from a GIS are approximate, because vector data are made up of straight line segments and all raster entities are approximated using a grid cell representation.
MEASUREMENT OF LENGTH

In a raster GIS, the Euclidean distance or the shortest path between two points A and B can be measured by any one of the methods given below.

(i) By drawing a straight line between A and B, and computing its length as the hypotenuse of the right-angled triangle ABC by Pythagorean geometry, i.e.

$AB = \sqrt{AC^2 + CB^2} = \sqrt{4^2 + 4^2} = 5.7$ units.

[Figure: raster grid showing points A, B and C, intermediate points a to g, P and Q, with distances from A marked on the grid.]
MEASUREMENT OF LENGTH

(ii) By measuring distances along the raster cell sides Aa, ab, bc, ..., gB, and adding them up, i.e.

AB = (Aa) + (ab) + ... + (fg) + (gB) = 1+1+1+1+1+1+1+1 = 8 units.

(iii) By forming concentric equidistant zones around the starting point A. Thus,

$AB = \sqrt{4^2 + 4^2} = 5.66 \approx 5.7$ units
$AP = \sqrt{2^2 + 3^2} = 3.61 \approx 3.6$ units
$AQ = \sqrt{4^2 + 2^2} = 4.47 \approx 4.5$ units

[Figure: the same raster grid, with points A, B, C, a to g, P and Q and distances from A marked on the grid.]
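The three measurement methods can be compared with a short Python sketch. The function names are illustrative, and the cell offsets of B, P and Q from A are read off the worked values above (for example, B lies 4 columns and 4 rows from A).

```python
import math

def euclidean_distance(dx, dy):
    """Straight-line distance across dx columns and dy rows (methods i and iii)."""
    return math.hypot(dx, dy)

def cell_side_distance(dx, dy):
    """Distance measured along raster cell sides (method ii)."""
    return abs(dx) + abs(dy)

# Offsets from A in cell units, taken from the worked example.
offsets = {"B": (4, 4), "P": (2, 3), "Q": (4, 2)}
for name, (dx, dy) in offsets.items():
    print(name, round(euclidean_distance(dx, dy), 2), cell_side_distance(dx, dy))
```

For B the straight-line value is about 5.66 units while the cell-side value is 8 units, which shows how strongly the chosen method affects a raster length measurement.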
MEASUREMENT OF PERIMETER

• In a raster GIS, area and perimeter measurements are affected by the location of the origin and the orientation of the raster grid. These problems are addressed by selecting grids with north-south alignment and using consistent origins.
• Measurement of length, perimeter, and area in a vector GIS is easier and more accurate than in a raster GIS. The length of the line AB is calculated as below:

$AB = \sqrt{(x_b - x_a)^2 + (y_b - y_a)^2}$

where (xa, ya) and (xb, yb) are the coordinates of A and B, respectively.
MEASUREMENT OF AREA
The area of a feature is calculated by totaling the areas of simple geometric figures formed by subdividing the feature, or directly by the following formula:

Area of ABCDEA = $\frac{1}{2}\left[x_a(y_b - y_e) + x_b(y_c - y_a) + x_c(y_d - y_b) + x_d(y_e - y_c) + x_e(y_a - y_d)\right]$

[Figure: polygon ABCDE plotted on X-Y axes with origin O and vertices A (xa, ya), B (xb, yb), C (xc, yc), D (xd, yd) and E (xe, ye).]

The perimeter is determined by summing the lengths of all the lines. The calculated length, perimeter, and area can be stored permanently in the vector GIS database as attributes and used when needed, thus avoiding repetitive calculations.
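A minimal Python sketch of the coordinate formula for area and of the perimeter sum is given below. It generalizes the five-vertex expression to any number of vertices; the absolute value is an added assumption so that the result does not depend on whether the vertices are listed clockwise or counter-clockwise. The function names and the test rectangle are illustrative only.

```python
import math

def polygon_area(vertices):
    """Area as half the sum of x_i * (y_next - y_prev) over all vertices."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x_i = vertices[i][0]
        y_next = vertices[(i + 1) % n][1]
        y_prev = vertices[i - 1][1]  # index -1 wraps to the last vertex
        total += x_i * (y_next - y_prev)
    return abs(total) / 2.0

def polygon_perimeter(vertices):
    """Sum of the lengths of all boundary segments, closing back to the start."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(rectangle), polygon_perimeter(rectangle))
```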
QUERIES
• Performing queries on the database is an essential part of GIS.
• Some queries may require data that is the result of data analysis.
• Generally, queries may take forms such as:

(i) What is the shortest distance between two points?
(ii) What is the route that will take the minimum time to travel between two points?
(iii) What is the total area covered by forest?
(iv) How many police stations are within a distance of 500 m from a particular point?
(v) What is the distance to the nearest hotel from a particular point?
QUERIES
• Queries help in GIS analysis for checking
the quality of data and the results obtained.
• The queries may be classified as:

(i) Spatial query, and


(ii) Aspatial query.
QUERIES
• Queries that require spatial analysis fall under the class of spatial queries, while aspatial queries use only the attribute data of features and involve no spatial analysis.
• A query such as "how many hospitals with a heart care facility are located in a given area" is an aspatial query, as it can be performed by database software alone and does not involve analysis of the spatial component of the data.
• It is possible to combine individual queries to identify the entities in a database that satisfy two or more spatial and aspatial criteria, for example "where are the hospitals which have more than 10 private wards".
• Boolean operators AND, NOT, OR, and XOR are often used to combine different data sets by overlay.
RECLASSIFICATION
• Reclassification is another form of query in raster GIS, with the only difference that it results in a new image in which the features of different classes have different codes.
• If the cell value of a selected class of feature is, say, 10, then in reclassification all the cells with value 10 may be assigned a new value, say 1, and the remaining cells with various other classes of features may be assigned a value, say 0.
• This results in a new image which highlights only the selected feature. This new image is a Boolean image.
• It is also possible to create a new image by reclassifying the features with new cell values that indicate the weightage of different classes. The weightage may be based upon certain criteria.
RECLASSIFICATION
(a) Old cell values

11 11 11 11 11  9  9  9  9  9
11 11 10 10 10  9  9  9  9  9
11 11 10 10 10 11 11 11 10 10
12 12 12 10 10 10 11 11 11 10
12 12 12 12 10 11 11 11 11 10
12  8  8  8  8 10 11 11 11 10
12  8  8  9  9  9  9 11 11 11
12 12 10 10 10 10  9 11 11 11
12 12 10 10 10 10 10 11 11 11
12 12 10 10 10 10 10 11 11 11

(b) New cell values (Boolean)

0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 0 0 0 0 0
0 0 1 1 1 0 0 0 1 1
0 0 0 1 1 1 0 0 0 1
0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 1 0 0 0 0
0 0 1 1 1 1 1 0 0 0
0 0 1 1 1 1 1 0 0 0

Land use      Old cell value   New cell value (Boolean)   New cell value (weight)
Wetland       8                0                          1
Water         9                0                          2
Agriculture   10               1                          5
Forest        11               0                          4
Industrial    12               0                          3
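Reclassification amounts to a cell-by-cell lookup, which can be sketched in Python as follows. The `reclassify` helper is a hypothetical name, and only the first three rows of the old image are used to keep the example short; the two mappings come from the land use table.

```python
def reclassify(grid, mapping, default=0):
    """Return a new grid in which every cell value is replaced via the mapping."""
    return [[mapping.get(value, default) for value in row] for row in grid]

# First three rows of the "old cell values" image.
old = [
    [11, 11, 11, 11, 11, 9, 9, 9, 9, 9],
    [11, 11, 10, 10, 10, 9, 9, 9, 9, 9],
    [11, 11, 10, 10, 10, 11, 11, 11, 10, 10],
]

# Boolean image: agriculture (10) becomes 1, every other class becomes 0.
boolean_image = reclassify(old, {10: 1})

# Weighted image: each land use code is replaced by its weight from the table.
weighted_image = reclassify(old, {8: 1, 9: 2, 10: 5, 11: 4, 12: 3})

for row in boolean_image:
    print(row)
```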
BUFFERING &
NEIGHBOURHOOD FUNCTIONS
• The buffering function, also known as the proximity function, is one of the neighbourhood functions in GIS. It is used to create a zone of interest around an entity or set of entities.
• Buffering allows a spatial entity to influence its neighbours, or the neighbours to influence the character of an entity.
• Other neighbourhood functions include data filtering, which involves the recalculation of cells in a raster image based on the characteristics of their neighbours.
• If a point is buffered, a circular zone is created (Fig. a); buffering lines and areas creates new areas (Fig. b and c).
Different Types of Buffering Function

[Figure: (a) buffering a point, with radii r, r1 and r2; (b) buffering a line, with distances d, d1 and d2; (c) buffering an area, with distances d, d1 and d2.]

Point Pattern Analysis
• Historically, Point Pattern Analysis was first
noted in the works of botanists and ecologists
in the 1930s.
• Point Pattern Analysis involves the ability to
describe patterns of locations of point events
and test whether there is a significant
occurrence of clustering of points in a
particular area.
• In general, Point Pattern Analysis can be used
to describe any type of incident data.
• For instance, we may want to conduct “Hot
Spot” analysis in order to better understand
locations of crimes, or else we may want to
study breakouts of certain diseases to better
see whether there is a pattern.
• In both of these cases, Point Pattern Analysis
can be of great help to institutions and
policymakers in their decisions on how to best
allocate their scarce resources to different
areas.
CRITERIA
• In order to conduct Point Pattern Analysis, your data must meet five important criteria:

1. The pattern must be mapped on a plane, meaning that you will need both latitude and longitude coordinates.
2. A study area must be selected and determined prior to the analysis.
3. The point data should not be a selected sample, but rather the entire set of data you seek to analyze.
4. There should be a one-to-one correspondence between objects in the study area and events in the pattern.
5. The points must be true incidents with real spatial coordinates. For example, using the centroids of census tracts would not be an especially useful process.
TECHNIQUES

1.) Quadrat Count Methods,
2.) Kernel Density Estimation (discussed here alongside K-means partitioning, which is a separate clustering technique), and
3.) Nearest Neighbor Distance
QUADRAT COUNT METHODS
• This method involves simply recording and counting the number of events that occur in each quadrat.
• In general, it is important to remember that large quadrats produce a very coarse description of the pattern, but as quadrat size is reduced, many quadrats may become too small and some may contain no events at all.
• As a rule of thumb, the area of a square quadrat is taken as twice the mean area per point in a random distribution, i.e. 2 × Area / n, where Area is the area of the study region and n is the number of points in the sample.
• After partitioning the data set into quadrats, the frequency distribution of the number of points per quadrat is constructed.
• The mean and variance of the sample are then computed to calculate the Variance-to-Mean Ratio (VTMR).
• The VTMR of a sample is interpreted as follows:
– If VTMR > 1, the pattern is clustered. This implies that the data set has one or more groups of points in clusters and large areas of the map without points (Figure: Clustered pattern).
– If VTMR < 1, the pattern is regularly dispersed, implying that the events are distributed more or less regularly over the region (Figure: Regularly dispersed pattern).
– If VTMR = 1, the pattern is random. This implies that the data set has no dominant trend towards clustering or dispersion (Figure: Random pattern).
Points per quadrat (5 × 2 grid of quadrats) for three patterns:

RANDOM    UNIFORM    CLUSTERED
3  1      2  2        0   0
5  0      2  2        0   0
2  1      2  2       10  10
1  3      2  2        0   0
3  1      2  2        0   0

            RANDOM              UNIFORM             CLUSTERED
Quadrat #   Points (x)   x^2    Points (x)   x^2    Points (x)   x^2
1           3            9      2            4      0            0
2           1            1      2            4      0            0
3           5            25     2            4      0            0
4           0            0      2            4      0            0
5           2            4      2            4      10           100
6           1            1      2            4      10           100
7           1            1      2            4      0            0
8           3            9      2            4      0            0
9           3            9      2            4      0            0
10          1            1      2            4      0            0
Sum         20           60     20           40     20           200

Variance    2.222               0.000               17.778
Mean        2.000               2.000               2.000
Var/Mean    1.111               0.000               8.889

$N = \text{number of quadrats} = 10$

$\text{Variance} = \dfrac{\sum x^2 - \left[(\sum x)^2 / N\right]}{N - 1}$

$\text{Variance-to-mean ratio} = \dfrac{\text{variance}}{\text{mean}}$
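The variance-to-mean calculation can be checked with a short Python sketch. The function name is illustrative; the three lists are the points-per-quadrat counts from the worked table, and the variance uses the same N − 1 denominator as the formula.

```python
def variance_to_mean_ratio(counts):
    """Return (variance, mean, VTMR) for a list of points-per-quadrat counts."""
    n = len(counts)
    total = sum(counts)
    sum_of_squares = sum(c * c for c in counts)
    variance = (sum_of_squares - total ** 2 / n) / (n - 1)
    mean = total / n
    return variance, mean, variance / mean

patterns = {
    "random": [3, 1, 5, 0, 2, 1, 1, 3, 3, 1],
    "uniform": [2] * 10,
    "clustered": [0, 0, 0, 0, 10, 10, 0, 0, 0, 0],
}
for name, counts in patterns.items():
    variance, mean, vtmr = variance_to_mean_ratio(counts)
    print(name, round(variance, 3), round(mean, 3), round(vtmr, 3))
```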
Weakness of Quadrat Analysis
• Quadrat size and orientation
– If the quadrats are too small, they may contain
only a couple of points. If they are too large,
they may contain too many points
• Some have suggested that quadrat size should
be twice the size of the mean area per point
• Test different quadrat sizes (or orientations) to determine the effect of each choice on the results
Weakness of Quadrat Method
• Actually a measure of dispersion, and not
really pattern, because it is based
primarily on the density of points, and not
their arrangement in relation to one
another
• Results in a single measure for the entire
distribution, so variations within the
region are not recognized
KERNEL DENSITY ESTIMATION
• This method counts the incidents in an area (a kernel), centered at the location where the estimate is made.
• A related approach is partitioning (as in K-means clustering), in which incidents are partitioned into a number of different clusters.
• Often, the user is able to specify the number of clusters.
• In some forms of this analysis, all incidents, even the outliers, are assigned to one and only one group.
ADVANTAGES
• Very good for analyzing point patterns to discover Hot Spots.
• Provides a useful link to geographical data because it is able to transform the data into a density surface.
• The choice of r, the kernel bandwidth, strongly affects the density surface.
• These patterns can be weighted with other data, such as population density and unemployment rates.
NEAREST NEIGHBOR ANALYSIS
• Nearest neighbour analysis examines the distance between each point and the closest point to it.
• It is a method of exploring pattern in locational data by graphically comparing the observed distribution functions of event-to-event or random point-to-event nearest neighbour distances, either with each other or with those that may be theoretically expected from various hypothesized models, in particular that of spatial randomness; i.e. it describes the distribution of points according to their spacing.
NEAREST NEIGHBOUR INDEX
• The Nearest neighbour index measures the
degree of spatial dispersion in the distribution
based on the minimum of the inter-feature
distances, i.e. it is based on the distance
between adjacent point features.
• The distance between point features in a
clustered pattern will be smaller than in
a scattered (uniform) distribution with random
falling between the two.
• The equation for the nearest neighbour is
computed through the following steps
STEP 1
• Compute the average nearest neighbour distance of the point pattern using the equation

$A_d = \dfrac{\sum d_i}{n}$

where
d_i is the distance from point i to its nearest neighbour, and
n is the total number of points in the chosen map area.
STEP 2
• Compute the expected value of the average nearest neighbour distance:

$E_d = \dfrac{1}{2}\sqrt{\dfrac{A}{n}}$

where A denotes the map area.
• Compute the nearest neighbour index (NNI), which is defined as the ratio of Ad to Ed:

$NNI = \dfrac{A_d}{E_d}$

• The values of NNI range between two theoretical extremes, 0 and 2.1491.

• When all the points in a pattern fall at the
same location, the pattern represents the
theoretical extreme of spatial concentration, in
this case, Ad = 0 and NNI = 0.
• The more closely the points are clustered together,
the closer to 0 NNI will be, since the average
nearest neighbour distance decreases.
• The closer NNI gets to 1, the more randomly spaced
the points are.
• The value of NNI approaches 2.1491 for perfectly
uniformly spaced points.
• Hence, the closer NNI is to 2.1491, the more
uniformly spaced the data are.
TYPICAL VALUES OF NNI

                     RANDOM                  UNIFORM                 CLUSTERED
Point                Neighbour  Distance (r) Neighbour  Distance (r) Neighbour  Distance (r)
1                    2          1            3          2.2          2          0.1
2                    3          0.1          4          2.2          3          0.1
3                    2          0.1          4          2.2          2          0.1
4                    5          1            5          2.2          5          0.1
5                    4          1            7          2.2          4          0.1
6                    5          2            7          2.2          5          0.1
7                    6          2.7          8          2.2          6          0.1
8                    10         1            9          2.2          9          0.1
9                    10         1            10         2.2          10         0.1
10                   9          1            9          2.2          9          0.1
Sum of r                        10.9                    22                      1
Mean r                          1.09                    2.2                     0.1
Area of region                  50                      50                      50
Density d                       0.2                     0.2                     0.2
Expected mean r(e)              1.118034                1.118034                1.118034
R                               0.9749256               1.9677398               0.0894427

$\bar{r} = \dfrac{\sum r}{n}$, $d = \dfrac{n}{\text{area}}$, $\bar{r}(e) = \dfrac{0.5}{\sqrt{d}}$, $R = \dfrac{\bar{r}}{\bar{r}(e)}$
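The nearest neighbour index can be reproduced with the Python sketch below. The function names are illustrative. `nearest_distances` is a brute-force helper for deriving the distances from coordinates, which is an assumption about how the distances would be obtained; the worked values use the distance columns and the area of 50 from the table.

```python
import math

def nearest_distances(points):
    """Distance from each point to its closest other point (brute force)."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

def nearest_neighbour_index(distances, area):
    """Ratio of the observed mean nearest distance to the expected mean."""
    n = len(distances)
    observed = sum(distances) / n
    expected = 0.5 * math.sqrt(area / n)  # same as 0.5 / sqrt(density)
    return observed / expected

random_r = [1, 0.1, 0.1, 1, 1, 2, 2.7, 1, 1, 1]
print(round(nearest_neighbour_index(random_r, 50), 4))     # random pattern
print(round(nearest_neighbour_index([2.2] * 10, 50), 4))   # uniform pattern
print(round(nearest_neighbour_index([0.1] * 10, 50), 4))   # clustered pattern
```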
Advantages of Nearest Neighbor
over Quadrat Analysis
• No quadrat size problem
• Takes distance into account
• Problems
– Related to the entire boundary size
– Must consider how to measure the boundary
• Arbitrary or some natural boundary
– May not consider a possible adjacent boundary
FILTERING
• Filtering is one of the neighbourhood functions, used for processing remote sensing data.
• Filtering changes the value of a cell depending on the attributes of the cells in its neighbourhood.
• A filter comprises a group of cells around a target cell, and its size and shape are decided by the operator.
• A filter may have a square or circular shape, and its size determines the number of neighbouring cells used in the filtering process.
• The filtering process involves passing a filter of predetermined size and shape across the raster data set to recalculate the value of the target cell lying at the centre of the filter.
NOISE REMOVING FILTERS

• Mean filter
• Weighted mean filter
• Median filter
• Mode filter
• Olympic filter
MEAN FILTER
• In this technique, each pixel within a given window (say 3×3) is sequentially examined, and, if the magnitude of the central pixel is significantly different from the average brightness of its surrounding pixels (as given by a predetermined threshold value), then the central pixel is replaced by the average value.

Layout of mean filter:

a1  a2  a3
a4  X   a5
a6  a7  a8

• The process can be represented mathematically as

– IF $\left| X - \dfrac{1}{8}\sum_{i=1}^{8} a_i \right| > \varepsilon$ (given threshold value)
– THEN $X = \dfrac{1}{8}\sum_{i=1}^{8} a_i$
– ELSE X = X
where
X = the brightness value of the central pixel
ai = brightness value of the surrounding ith pixel.
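The thresholded mean filter can be sketched in Python as below. This is an illustrative implementation under two stated assumptions: border cells, which lack a full set of eight neighbours, are left unchanged, and the function name and sample image are hypothetical.

```python
def threshold_mean_filter(image, threshold):
    """Replace a pixel by the mean of its 8 neighbours when it differs from
    that mean by more than the threshold; otherwise keep it unchanged."""
    rows, cols = len(image), len(image[0])
    output = [row[:] for row in image]  # border cells stay as they are (assumption)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            neighbours = [
                image[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            mean = sum(neighbours) / 8
            if abs(image[r][c] - mean) > threshold:
                output[r][c] = mean
    return output

noisy = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]
print(threshold_mean_filter(noisy, threshold=5))
```

With a threshold of 5 the central value 50 is replaced by the neighbourhood mean of 10; with a threshold larger than 40 it would be kept.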
WEIGHTED MEAN FILTER

• A weighted mean is often used, in which the weight for a pixel is related to its distance from the central pixel of the window.
• For example, if the boundary area of a region is to be processed, then different weights are assigned inside and outside the boundary such that the best results can be obtained for the boundary as well as for the region.
MEDIAN FILTER

• The median filter uses the median rather than the average of the neighbourhood pixels of a given window.
• The median of a set of numbers is the value such that 50% of the numbers are above it and 50% are below it.
• It is considered superior to the mean filter primarily for two reasons.
• First, the median of a set of n numbers (where n is odd) is always equal to one of the values present in the data set. Second, the median is less sensitive to errors or to extreme data values.
• Although conceptually simple, a median filter is inefficient to implement because the pixels must be sorted in ascending or descending order.
• However, it is one of the best noise-removing and edge-preserving smoothing filters, although thin lines narrower than the dimension of the filter window may be removed.
• This filter has no user-defined parameters.
MODE FILTER

• In this technique, the central pixel is replaced by its most common neighbour.
• This is particularly useful in coded images such as classification maps, in which the pixel values represent object labels.
• Averaging labels makes no sense, but mode filters may clean up isolated points.
• It produces irregular shifts in edges, which make them appear ragged.
OLYMPIC FILTER
• The Olympic filter is a variant of a mean filter.
• It is named after the system of scoring used in
certain Olympic events, where the highest and
lowest scores are dropped and the remaining
ones averaged.
• The Olympic filter ranks the values within the
filter window (number of values = N), and
discards high and low values before
calculating the mean of those remaining.
• The output of the Olympic filter is less
influenced by outlier values than the mean
filter, but the averaging process blurs edges
and removes fine detail from the images.
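A minimal Python sketch of the Olympic filter applied to the values of one window is shown below. The number of values discarded at each end (`drop`) is an assumption, since only "high and low values" is specified; the function name and sample window are illustrative.

```python
def olympic_mean(values, drop=1):
    """Rank the window values, discard `drop` lowest and `drop` highest,
    and return the mean of the remainder."""
    ranked = sorted(values)
    kept = ranked[drop:len(ranked) - drop]
    return sum(kept) / len(kept)

window = [4, 4, 4, 4, 90, 4, 4, 4, 4]  # one outlier among nine cells
print(sum(window) / len(window))        # plain mean, pulled up by the outlier
print(olympic_mean(window))             # Olympic mean ignores the outlier
```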
FILTERING

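[Figure: 6 × 6 raster with rows and columns labelled a to f; the 3 × 3 filter is centred on the target cell dc (row d, column c).]

     a  b  c  d  e  f
a    4  4  4  4  4  4
b    2  2  3  3  3  4
c    3  3  4  4  3  3
d    2  2  4  4  2  2
e    1  1  1  1  1  2
f    1  1  1  1  1  2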
FILTERING
• The recalculated value of the target cell dc depends upon the criterion used, as given in the table below.
• Filtering may be required in raster data obtained from classified satellite images to smooth the noise present in the data due to high spatial variability in a particular class of feature, such as vegetation cover, or due to problems with the data collection devices.

Criteria         Target cell   Original value   New value
Minimum filter   dc            4                1
Maximum          dc            4                4
Mean             dc            4                2.67
Mode             dc            4                4
Diversity        dc            4                4
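The table values can be reproduced with the following Python sketch, which extracts the 3 × 3 window around the target cell dc (row d, column c, i.e. zero-based indices 3 and 2) and applies each criterion. The helper names are illustrative, and diversity is taken to mean the number of distinct values in the window, which is consistent with the tabulated result.

```python
from collections import Counter

grid = [
    [4, 4, 4, 4, 4, 4],
    [2, 2, 3, 3, 3, 4],
    [3, 3, 4, 4, 3, 3],
    [2, 2, 4, 4, 2, 2],
    [1, 1, 1, 1, 1, 2],
    [1, 1, 1, 1, 1, 2],
]

def window_values(grid, row, col):
    """Values of the 3 x 3 window centred on (row, col)."""
    return [grid[r][c] for r in range(row - 1, row + 2) for c in range(col - 1, col + 2)]

def filter_results(values):
    """New value of the target cell under each filtering criterion."""
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": round(sum(values) / len(values), 2),
        "mode": Counter(values).most_common(1)[0][0],
        "diversity": len(set(values)),
    }

result = filter_results(window_values(grid, 3, 2))
print(result)
```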
TEXTURE TRANSFORMATIONS
• Texture transformations are another set of tools that are used to identify the spatial pattern in data.
• Texture filters are designed to enhance the heterogeneity of an area.
• A common algorithm for a texture filter is to calculate the standard deviation of the cell values in a 3 × 3 neighbourhood.
• If the attribute values in this neighbourhood are all similar, the standard deviation is small, and the neighbourhood is said to have low texture or low variability. If there are many different attribute values in a neighbourhood, it has high texture.
• Texture filters are also used to find the boundaries between delimited areas, since texture within a homogeneous area must be zero.
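A texture value for a single cell can be sketched in Python as the standard deviation of its 3 × 3 neighbourhood. The population standard deviation is used here as an assumption, since the text does not say whether the sample or population form is intended; the function name and sample grids are illustrative.

```python
import statistics

def texture(grid, row, col):
    """Population standard deviation of the 3 x 3 window centred on (row, col)."""
    values = [grid[r][c] for r in range(row - 1, row + 2) for c in range(col - 1, col + 2)]
    return statistics.pstdev(values)

homogeneous = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
boundary = [[1, 1, 9], [1, 1, 9], [1, 1, 9]]
print(texture(homogeneous, 1, 1))  # zero texture inside a homogeneous area
print(texture(boundary, 1, 1))     # non-zero texture across a boundary
```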
SLOPE TRANSFORMATION
• A slope transformation turns a data layer of elevation into one of slope by calculating the local first derivative. A companion to slope is aspect. An aspect calculation is used to determine the direction that a slope faces.
• A mathematical way of explaining aspect is to calculate the horizontal component of the vector perpendicular to the surface. Aspect is usually classified into bins of fixed size, so that the resulting data layer is not continuous, but ordinal.
• A frequent choice is to classify aspect into eight categories, each representing an eighth of a circle (a 45° range in aspect).
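Slope and aspect for one cell can be sketched in Python using central differences. This is only one of several possible schemes and rests on stated assumptions: row 0 of the grid is the northern edge, columns increase eastward, aspect is the downslope compass direction, and the eight 45° sectors are centred on N, NE, E and so on. All names are illustrative.

```python
import math

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def slope_and_aspect(dem, row, col, cell_size=1.0):
    """Return (slope in degrees, aspect sector) for an interior cell."""
    # First derivatives: eastward and northward rates of change of elevation.
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2 * cell_size)
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None  # flat cell: aspect is undefined
    # Downslope direction as a compass bearing, clockwise from north.
    bearing = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360
    sector = SECTORS[int(((bearing + 22.5) % 360) // 45)]
    return slope, sector

rising_east = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
print(slope_and_aspect(rising_east, 1, 1))
```

For the sample surface, which rises eastward by one unit per cell, the slope is 45° and the slope faces west.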
Other Neighbourhood Function
• The characteristics of a data set along a
specified line can also be determined by the
use of another neighbourhood function.
• Such a process is commonly presented in the
form of graphs as profiles or cross-sections of
data set.
• Profiles are used to determine the slope along a line between two points for the construction of a highway or canal, or to determine the intervisibility between the two points.
MAP OVERLAY

• Integrating two or more different thematic map layers of the same geographic area is a common operation in GIS analysis.
• The technique of map overlay has many applications, such as visual comparison by overlaying a map showing only hospitals on a road network map to answer the query ‘where are the hospitals located’.
• In this case no new data are produced. This technique is also used for the overlay of vector data on a raster background image, such as a scanned topographic map.
VECTOR OVERLAY
• In a vector-based system, the analysis is based on a
polygon intersection algorithm in which new polygons
are created as needed, and redundant boundaries are
eliminated (Fig.).
• Vector map overlay relies heavily on the two associated
disciplines of geometry and topology.
• The data layers being overlaid need to be topologically correct, so that lines meet at nodes and polygon boundaries are closed.
• To create topology for a new data layer produced as a result of the overlay process, the intersections of lines and polygons from the input layers need to be calculated using geometry.
Merging and Redefinition of Features

[Figure: Class-A residences and Class-B residences, each a residential area, are merged into a single Residential area; the Business centre is redefined as a Commercial area.]
VECTOR OVERLAY
• There are three main types of vector
overlay (Fig.), which are:

(i) Point-in-polygon,
(ii) Line-in-polygon, and
(iii) Polygon-on-polygon.
POINT-IN-POLYGON
• When it is desired to locate a point with respect to polygons, i.e., to find in which polygon a point lies, point-in-polygon overlay is used.
• Fig. shows two data layers containing wells as points and soil types as polygons.
• The point-in-polygon overlay on these layers produces an output layer indicating the soil category in which each well lies.
• This information is given as an output map of a new set of wells with additional attributes describing the soil type.
POINT-IN-POLYGON
[Figure (a) Point-in-polygon: a point map of wells 1, 2 and 3 overlaid on a polygon map of soil types (Sand, Clay) gives an output point map.]

Point attribute table

Point id   Soil type
1          Sand
2          Sand
3          Clay
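A point-in-polygon overlay can be sketched in Python with the ray-casting test, which is one common way to decide whether a point lies inside a polygon (the source does not name a specific algorithm). The well and soil coordinates below are hypothetical, chosen only so that the output matches the point attribute table.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def overlay_points(points, polygons):
    """Attach to each point id the name of the polygon that contains it."""
    result = {}
    for point_id, point in points.items():
        for name, polygon in polygons.items():
            if point_in_polygon(point, polygon):
                result[point_id] = name
                break
    return result

# Hypothetical coordinates for illustration only.
soils = {
    "Sand": [(0, 5), (10, 5), (10, 10), (0, 10)],
    "Clay": [(0, 0), (10, 0), (10, 5), (0, 5)],
}
wells = {1: (2, 8), 2: (7, 6), 3: (4, 2)}
print(overlay_points(wells, soils))
```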
LINE-IN-POLYGON

• To answer queries such as whether a road lies on sandy or clay soil, line-in-polygon overlay is used.
• Fig. shows data layers of roads and soil types.
• When the two layers are integrated, the roads are split into smaller segments, depending on which part of a road falls in which soil category.
• A database record for each new road segment is created in the output map. The output layer is more complicated than the two input layers because topological information must be retained; therefore, line-in-polygon overlay is more complex.
LINE-IN-POLYGON
[Figure (b) Line-in-polygon: a line map of roads 1, 2 and 3 overlaid on a polygon map of soil types (Sand, Clay) gives an output line map with road segments 1 to 5.]

Line attribute table

Old id   New id   Soil type
1        1        Sand
1        2        Clay
2        3        Sand
2        4        Clay
3        5        Clay
POLYGON-ON-POLYGON
• This overlay, as shown in Fig., is used to answer queries such as:
(i) Where are the different types of soil or areas lying within the urban area?
(ii) Where is the urban area boundary, and where are the clay soils within the urban area boundary?
(iii) Where are the sandy soils within the urban area?
A query of type (i) involves the Boolean OR operator; type (ii) involves NOT, and the operation is referred to as cookie cutting; type (iii) involves AND.
In mathematical terms these operations are respectively referred to as UNION, IDENTITY, and INTERSECT.
VECTOR OVERLAY OPERATIONS
[Figure (c) Polygon-on-polygon: a soil polygon map (Sand, Clay) is overlaid with an urban polygon map to give the output polygon map under three operations: Union, Erase, and Intersect (AND), the last yielding the urban area on sandy soil.]
RASTER OVERLAY
• In a raster-based system, features in the input data are represented by cells of the raster data structure.
• A single cell represents a point, a string of cells represents a line, and a group of cells represents an area.
• Raster overlays employ the mathematical operations of addition, subtraction, multiplication or division on the individual cell values of the input layers to produce output data.
• This requires appropriate coding of the features represented by points, lines, and areas in the input data layers.
RASTER OVERLAY
• For example, wells are represented as ‘1’ in
the well map layer while sewer lines are
expressed as ‘2’ in the sewer line map layer.
• In the land use map layer, the coding may be ‘3’ for wheat field, ‘4’ for forest, ‘5’ for clay soil and ‘6’ for urban areas, while all cells containing features of no interest are coded as 0 in all data layers.
• If the codes assigned to different features in different layers are the same, the interpretation of the results becomes ambiguous.
Operation Equivalent to Vector Point-in-Polygon

Well map:     (i) Well station = 1; (ii) Feature of no interest = 0
Wheat map:    (i) Wheat = 3; (ii) Feature of no interest = 0
Output map:   (i) Wheat field with well = 4; (ii) Wheat field = 3; (iii) Well = 1; (iv) Neither well nor wheat field = 0

Well map        Wheat field map      Output map
0 1 0 0         3 3 3 0              3 4 3 0
0 0 0 0    +    3 3 0 0       =      3 3 0 0
0 0 1 0         3 3 0 0              3 3 1 0
0 0 0 0         0 0 0 0              0 0 0 0
(a)

In the output map, the cell with value 4 is a well within the wheat field and the cell with value 1 is a well outside the wheat field.
Operation Equivalent to Vector Line-in-Polygon

Sewer line map:   (i) Sewer line = 2; (ii) Feature of no interest = 0
Urban map:        (i) Urban = 6; (ii) Feature of no interest = 0
Output map:       (i) Sewer line in urban area = 8; (ii) Sewer line not in urban area = 2; (iii) Urban area with no sewer line = 6; (iv) Neither urban area nor sewer line = 0

Sewer line map   Urban map        Output map
0 0 0 0          6 6 6 0          6 6 6 0
0 2 2 2     +    6 6 0 0     =    6 8 2 2
2 0 0 0          6 6 0 0          8 6 0 0
0 0 0 0          0 0 0 0          0 0 0 0
(b)
Operation Equivalent to Vector Polygon-on-Polygon

Soil map:     (i) Clay soil = 5; (ii) Feature of no interest = 0
Forest map:   (i) Forest = 4; (ii) Feature of no interest = 0
Output map:   (i) Forest on clay soil = 9; (ii) Forest not on clay soil = 4; (iii) Clay soil not in forest = 5; (iv) Neither forest nor clay soil = 0

Soil map        Forest map       Output map
0 0 5 5         4 4 4 0          4 4 9 5
0 0 0 0    +    4 4 0 0     =    4 4 0 0
5 5 5 0         4 4 0 0          9 9 5 0
5 0 0 0         0 0 0 0          5 0 0 0
(c)
Operation: Addition

Soil map:     (i) Clay soil = 1; (ii) Feature of no interest = 0
Forest map:   (i) Forest = 1; (ii) Feature of no interest = 0
Output map:   (i) Forest on clay soil = 2; (ii) Forest or clay soil = 1; (iii) Neither forest nor clay soil = 0

Soil map        Forest map       Output map
0 0 1 1         1 1 1 0          1 1 2 1
0 0 0 0    +    1 1 0 0     =    1 1 0 0
1 1 1 0         1 1 0 0          2 2 1 0
1 0 0 0         0 0 0 0          1 0 0 0
(d)

Operation: Multiplication

Soil map:     (i) Clay soil = 1; (ii) Feature of no interest = 0
Forest map:   (i) Forest = 1; (ii) Feature of no interest = 0
Output map:   (i) Forest on clay soil = 1; (ii) Other areas = 0

Soil map        Forest map       Output map
0 0 1 1         1 1 1 0          0 0 1 0
0 0 0 0    ×    1 1 0 0     =    0 0 0 0
1 1 1 0         1 1 0 0          1 1 0 0
1 0 0 0         0 0 0 0          0 0 0 0
(e)
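Raster overlay by addition and multiplication is a cell-by-cell operation, sketched below in Python with the clay soil and forest layers coded as 1 and 0. The `overlay` helper is a hypothetical name; the grids are the ones from examples (d) and (e).

```python
import operator

def overlay(layer_a, layer_b, operation):
    """Combine two equally sized raster layers cell by cell."""
    return [
        [operation(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]

soil = [[0, 0, 1, 1], [0, 0, 0, 0], [1, 1, 1, 0], [1, 0, 0, 0]]    # clay soil = 1
forest = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]  # forest = 1

added = overlay(soil, forest, operator.add)       # 2 = forest on clay, 1 = either
multiplied = overlay(soil, forest, operator.mul)  # 1 = forest on clay only

for row in added:
    print(row)
for row in multiplied:
    print(row)
```

With 0/1 coding, multiplication behaves like the Boolean AND, while addition keeps cells where either feature is present and marks the overlap with 2.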
Analytical Hierarchy Process
(AHP)
• Developed by Saaty (1980); known as Saaty's AHP (S-AHP).
• It is a system-based analysis technique for solving multi-criteria decision problems.
• Multi-criteria evaluation is primarily concerned with how to combine information from several criteria to form a single index of evaluation.
• The methodology of the AHP is to decompose a complex decision problem into elemental issues (the so-called clustering) to create a hierarchical model.
A hierarchy has at least three levels:

• focus or overall goal
• multiple criteria (attributes) that define alternatives
• competing alternatives

[Figure: hierarchy linking the Goal, Criteria 1 and Criteria 2, and Alternative 1 and Alternative 2.]
• Widely used in decision-making.
• It is introduced with the basic assumption that the comparison of two elements is derived from their relative importance.
• Comparisons are used to establish both weights for criteria and preference scores for classes on different criteria.
• This approach provides a scale of nine verbal ratings, such as
– equally important,
– moderately important, or
– strongly important,
and so on.
Ranking of qualitative judgments

Intensity of Importance    Qualitative Definition
1                          Equal importance
2                          Equal to moderate
3                          Moderate importance
4                          Moderate to strong
5                          Strong importance
6                          Strong to very strong
7                          Very strong
8                          Very to extremely strong importance
9                          Extreme importance
Compare the Sizes of these Circles

[Figure: five circles labelled A to E]

     A      B      C      D      E
A    1      7      9      5      3
B    1/7    1      3      1/5    1/7
C    1/9    1/3    1      1/7    1/9
D    1/5    5      7      1      1/3
E    1/3    7      9      3      1
Computation of criterion weights
• Step I
– Compute the sum of each column of the pairwise comparison matrix.
• Step II
– Divide each entry in the matrix by its column sum
(Normalization).
• Step III
– Compute the average of the elements in each row of the
normalized matrix, i.e., divide the sum of normalized scores for
each row by the number of criteria. These averages are the
criterion weights.
• Step IV
– Check that the decision maker has been consistent in making
the comparisons.
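Consistency measurement
• Step I
Determine the weighted sum vector by
multiplying each column of the original
pairwise comparison matrix by the weight of
the corresponding criterion and summing the
products across each row.

• Step II
Determine the consistency vector by
dividing each element of the weighted sum
vector by the corresponding criterion weight
determined previously.
• Consistency Index
$CI = \dfrac{\lambda - N}{N - 1}$
where λ = average of the consistency vector and
N = number of criteria
• Consistency Ratio
$CR = \dfrac{CI}{RI}$
where RI = random index for a matrix of the same size
• CR < 0.1 indicates acceptable consistency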
STEP I: Compute the sum of each column

CRITERIA   Rainfall   SW     NDVI   Tmax    RH
Rainfall   1.00       2.00   3.00   5.00    6.00
SW         0.50       1.00   2.00   3.00    5.00
NDVI       0.33       0.50   1.00   2.00    3.00
Tmax       0.20       0.33   0.50   1.00    2.00
RH         0.17       0.20   0.33   0.50    1.00
Sum        2.20       4.03   6.83   11.50   17.00

STEP II: Divide each entry in the matrix by its column sum
(Normalization)
STEP III: Compute the average of the elements in each row
of the normalized matrix, i.e. divide the sum of
normalized scores for each row by the number of
criteria. The result is the criterion weight (WT).

CRITERIA   Rainfall   SW     NDVI   Tmax   RH     WT
Rainfall   0.45       0.50   0.44   0.43   0.35   0.44
SW         0.23       0.25   0.29   0.26   0.29   0.26
NDVI       0.15       0.12   0.15   0.17   0.18   0.15
Tmax       0.09       0.08   0.07   0.09   0.12   0.09
RH         0.08       0.05   0.05   0.04   0.06   0.06
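Consistency Check
STEP I: Determine the weighted sum vector by multiplying the
corresponding weight times the original pairwise
comparison matrix. Each entry below is the original matrix
entry multiplied by the weight of its column criterion; the
row totals form the weighted sum vector.

CRITERIA   Rainfall   SW     NDVI   Tmax   RH
Rainfall   0.44       0.52   0.45   0.45   0.36
SW         0.22       0.26   0.30   0.27   0.28
NDVI       0.14       0.13   0.15   0.18   0.18
Tmax       0.09       0.09   0.08   0.09   0.12
RH         0.07       0.05   0.05   0.05   0.06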
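Consistency Check
STEP II: Divide the weighted sum vector by the criterion weights
to obtain the consistency vector.

CRITERIA      Weighted sum   Consistency vector
Rainfall      2.22           5.045
SW            1.33           5.115
NDVI          0.78           5.200
Tmax          0.47           5.220
RH            0.28           4.667
Average (λ)                  5.049

Example: 2.22/wt = 2.22/0.44 = 5.045

CI = 0.012, RI = 1.120, CR = 0.010

Random index (RI) for a matrix of m criteria:
m    2   3      4      5      6      7      8      9      10
RI   0   0.58   0.90   1.12   1.24   1.32   1.41   1.45   1.51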
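The weight computation and the consistency check above can be sketched in a few lines. This is a minimal illustration, assuming the five-criterion comparison matrix from the worked example and the RI table as given; the function name `ahp_weights` and the use of NumPy are assumptions, not part of the lecture. Because the slides round intermediate values to two decimals, the unrounded results differ slightly from the tabulated ones.

```python
import numpy as np

# Random index values as tabulated above (m = number of criteria).
RANDOM_INDEX = {2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.51}

def ahp_weights(matrix):
    """Return (weights, lambda, CI, CR) for a pairwise comparison matrix."""
    a = np.asarray(matrix, dtype=float)
    n = a.shape[0]
    normalized = a / a.sum(axis=0)        # Steps I and II: divide by column sums
    weights = normalized.mean(axis=1)     # Step III: row averages
    weighted_sum = a @ weights            # consistency Step I
    consistency_vector = weighted_sum / weights   # consistency Step II
    lam = consistency_vector.mean()
    ci = (lam - n) / (n - 1)
    cr = ci / RANDOM_INDEX[n]             # RI is 0 for m = 2, so CR is undefined there
    return weights, lam, ci, cr

# Pairwise comparison matrix from the worked example
# (rows and columns: Rainfall, SW, NDVI, Tmax, RH).
comparisons = [[1.00, 2.00, 3.00, 5.00, 6.00],
               [0.50, 1.00, 2.00, 3.00, 5.00],
               [0.33, 0.50, 1.00, 2.00, 3.00],
               [0.20, 0.33, 0.50, 1.00, 2.00],
               [0.17, 0.20, 0.33, 0.50, 1.00]]

weights, lam, ci, cr = ahp_weights(comparisons)
print(np.round(weights, 2), round(lam, 3), round(ci, 3), round(cr, 3))
```

A CR below 0.1 means the judgments in this matrix are acceptably consistent.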
SPATIAL AUTOCORRELATION
• Spatial autocorrelation may be defined as the relationship
among values of a single variable that comes from the
geographic arrangement of the areas in which these values
occur.
• It measures
– the similarity of objects within an area,
– the degree to which a spatial phenomenon is correlated to itself in
space,
– the level of interdependence between the variables, and
– the nature and strength of the interdependence.
• In other words, spatial autocorrelation is an assessment of the
correlation of a variable with reference to its spatial location.
• It assesses whether the values are interrelated and, if so, whether
there is a spatial pattern to the correlation, i.e. whether there is
spatial autocorrelation.
• Spatial autocorrelation tools test whether the observed
value of a variable at one locality is independent of
values of the variable at neighboring localities.
• Spatial autocorrelation may be classified as either
positive or negative.
– Positive spatial autocorrelation has similar values
appearing together.
– Negative spatial autocorrelation has dissimilar values
appearing in close association.
• A positive spatial autocorrelation refers to a map
pattern where geographic features of similar value tend
to cluster on a map, whereas a negative spatial
autocorrelation indicates a map pattern in which
geographic units of similar values scatter throughout
the map.
• When no statistically significant spatial autocorrelation
exists, the pattern of spatial distribution is considered
random.
DESCRIBING SPATIAL AUTOCORRELATION
• The general method of describing autocorrelation in a
variable is to compute some index of covariance for a
series of lag distances (or distance classes) from each
point.
• The resulting correlogram illustrates autocorrelation at each
lag distance.
• Membership in a given distance class is defined by
assigning a weight to each pair of points in the analysis;
typically this weight is a simple indicator function, taking
a value of 1 if the pair falls within the distance class and 0
otherwise.
• Weights may also be defined in other ways, in which case
they take on non-integer values.
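The indicator weighting described above can be sketched as follows. This is an illustration only: the function name `distance_class_weights`, the sample coordinates, and the class limits are hypothetical, and Euclidean distance between points is assumed.

```python
import numpy as np

def distance_class_weights(coords, lower, upper):
    """Binary weights: 1 if lower < distance(i, j) <= upper, otherwise 0."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))       # pairwise Euclidean distances
    w = ((dist > lower) & (dist <= upper)).astype(int)
    np.fill_diagonal(w, 0)                        # a point is never its own neighbour
    return w

# Four hypothetical sample points and two distance classes.
points = [(0, 0), (1, 0), (2, 0), (0, 3)]
w_near = distance_class_weights(points, 0.0, 1.5)
w_far = distance_class_weights(points, 1.5, 3.5)
print(w_near)
print(w_far)
```

Computing the chosen index once per weight matrix, one matrix per distance class, gives the values plotted in a correlogram.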
• The spatial pattern of a distribution is defined by the
arrangement of individual entities in space and the
geographic relationships among them.
• The capability of evaluating spatial patterns is a prerequisite
to understanding the complicated spatial processes
underlying the distribution of a phenomenon.
• Spatial autocorrelation indicates the extent to which the
occurrence of one feature is influenced by similar features in
the adjacent area.
• As such, statistics of spatial autocorrelation provide a useful
indicator of spatial patterns.
Measuring spatial autocorrelation
• There are three common methods for investigating
spatial autocorrelation.
• The statistical coefficients Moran's I and Geary's C
provide an indication of the type and degree of spatial
autocorrelation present in a data set.
• A graphical representation of spatial autocorrelation
can also be created by means of the semi-variogram.
• To obtain the spatial autocorrelation coefficient of a
variable we have to correlate the values of that variable
for pairs of localities.
• However, not all pairs of localities are correlated,
only those that are considered neighbors.
• Moran's index and Geary's ratio (Moran's I and
Geary's C) are both extended from spatial
autocorrelation coefficients for areal objects.
• Both indices allow a single value to describe the
spatial distribution of features.
• They can be used to determine the degree of
adjustment necessary when modeling the phenomena.
• Moran's I is produced by standardizing the spatial
autocovariance by the variance of the data, using a measure
of the connectivity of the data.
• Geary's C uses the sum of squared differences
between pairs of data values as its measure of
covariation.
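Geary's C
• This statistic was developed by Roy C. Geary.
• Computation of Geary's C results in a value within the
range of 0 to +2.
• A value of 0 indicates strong positive spatial
autocorrelation, a value of 1 indicates no spatial
autocorrelation, and a value of 2 indicates strong
negative spatial autocorrelation.
• Geary's C can be defined as

$C = \dfrac{(N - 1)\sum_{i}\sum_{j} w_{ij}\,(x_i - x_j)^2}{2W\sum_{i}(x_i - \bar{x})^2}$

Where:
N = the number of observations
w_ij = the spatial weight between locations i and j (for
example, 1 if they fall within the chosen lag distance,
otherwise 0)
W = the sum of all w_ij
x_i = the value at location i
x_j = the value at location j
x̄ = the mean of the values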
MORAN'S I
• Moran's I measures the correlation (simultaneous change in
value of two numerically valued random variables)
among neighboring observations in a pattern.
• Computation of Moran's I is achieved by division of
the spatial covariation by the total variation.
• The general formula for computing Moran's I is

$I = \dfrac{N}{W}\cdot\dfrac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i}(x_i - \bar{x})^2}$

where
N is the number of spatial units indexed by i and j;
x is the variable of interest;
x̄ is the mean of x;
w_ij is an element of the matrix of spatial weights; and
W is the sum of all w_ij.
• Resultant values are in the range from
approximately -1 to 1.
• A positive value indicates positive spatial
autocorrelation, and a negative value indicates
negative spatial autocorrelation.
• A value near zero indicates no spatial
autocorrelation.
• Moran's I is inversely related to Geary's C, but
it is not identical.
• Moran's I is a measure of global spatial
autocorrelation, while Geary's C is more
sensitive to local spatial autocorrelation.
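Both coefficients can be computed directly from a value vector and a weight matrix. The sketch below is an illustration under stated assumptions: it uses binary rook (edge-sharing) neighbours on a small raster, and the helper names `rook_weights`, `morans_i`, and `gearys_c` are hypothetical rather than functions of any GIS package.

```python
import numpy as np

def rook_weights(rows, cols):
    """Binary weights: 1 for cells sharing an edge on a rows x cols grid."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                      # neighbour to the right
                w[i, i + 1] = w[i + 1, i] = 1
            if r + 1 < rows:                      # neighbour below
                w[i, i + cols] = w[i + cols, i] = 1
    return w

def morans_i(x, w):
    x = np.asarray(x, dtype=float).ravel()
    z = x - x.mean()                              # deviations from the mean
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

def gearys_c(x, w):
    x = np.asarray(x, dtype=float).ravel()
    z = x - x.mean()
    sq_diff = (x[:, None] - x[None, :]) ** 2      # squared pairwise differences
    return (x.size - 1) * (w * sq_diff).sum() / (2 * w.sum() * (z @ z))

clustered = np.array([[1, 1, 0, 0]] * 4)          # similar values side by side
checker = np.array([[1, 0, 1, 0], [0, 1, 0, 1]] * 2)   # dissimilar neighbours
w = rook_weights(4, 4)

for name, grid in (("clustered", clustered), ("checkerboard", checker)):
    print(name, round(morans_i(grid, w), 3), round(gearys_c(grid, w), 3))
```

The clustered grid gives a positive Moran's I with Geary's C below 1, while the checkerboard gives a negative Moran's I with Geary's C above 1, which illustrates the inverse relationship between the two measures.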
NETWORK ANALYSIS
• In the context of GIS, a network is defined as
a set of interconnected linear features through
which resources can flow.
• Common examples of networks include
– highways,
– railways,
– city streets,
– canals,
– rivers,
– transportation routes, such as garbage collection,
mail delivery, and school buses, and
– utility distribution systems, for example,
telephone, electricity, water supply, and sewage.
• There are many spatial problems that require
the use of network analysis for their solution.
Use of Network Analysis
Network analysis is used
(i) to find the shortest path (in terms of physical
distance or least cost) that can be followed to visit
a series of features in a network, known as
path-finding,
(ii) to assign one or more portions of a network to be
served by a facility or business location, called
allocation,
(iii) to find all portions of the network that are
connected with the movement of a particular
feature (e.g., city transport), called tracing,
(iv) to depict the accessibility of a location and the
interactions that occur between different
locations, known as spatial interaction. This is
based on a technique known as gravity modeling
and is widely used in economics, geography,
engineering, and urban planning,
(v) to generate a distance matrix between different
pairs of locations in the network, known as
distance matrix calculation, and
(vi) to determine simultaneously the locations of
existing and planned facilities, as well as the
allocation of demand to these facilities, known as
location-allocation modeling.
SHORTEST PATH PROBLEM
• In a raster GIS, the shortest path, i.e. the
shortest-distance (or least-time) path between two
points on a network, is determined by proximity analysis.
• Impediments to travel are added to a raster grid by
increasing the value of cells that are barriers to
travel. The result is then obtained as a least-cost
route.
• In a vector GIS, shortest-path network analysis is
more flexible and allows a more thorough treatment
of impediments, such as restrictions and
congestion on traffic routes.
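A least-cost route on a vector network can be found with Dijkstra's algorithm, which the slides do not name; it is used here as one common choice. The sketch assumes a small hypothetical street network stored as a dictionary of link impedances; the node names and values are invented for illustration.

```python
import heapq

def shortest_path(network, start, goal):
    """Dijkstra's algorithm; network maps node -> {neighbour: impedance}."""
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue                              # stale queue entry
        for neighbour, impedance in network.get(node, {}).items():
            new_cost = cost + impedance
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if goal not in best:
        return None, float("inf")                 # goal cannot be reached
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]

# Hypothetical streets; C -> B is one-way, which models a travel restriction.
streets = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}
path, cost = shortest_path(streets, "A", "D")
print(path, cost)
```

Congestion can be represented by raising the impedance of the affected links, and a restriction by omitting the link in the prohibited direction.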
ROUTE TRACING
• Route tracing through network analysis is
required to identify the routes for
unidirectional flow of resources, with special
reference to stream networks and services
such as sewerage systems and cable TV
networks.
• Other applications are the determination of
streams contributing to a reservoir, customers
serviced by a particular sewer main, or clients
affected by a broken cable.
• The key concept in route tracing is the
connectivity of network links at network nodes.
Knowledge of the direction of flow is an important
factor for route tracing.
• Therefore, each link in the network must be
associated with a direction of flow, which can be
defined during digitizing by keeping the
direction of digitization the same as the direction
of flow.
• Tracing the links downstream or upstream of a
point on the network is then performed by
moving in the direction of flow or against it.
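Tracing can be sketched as a traversal of directed links. The example below assumes each link is stored as a (from node, to node) pair in the direction of flow; the function name `trace` and the small stream network are hypothetical.

```python
from collections import deque

def trace(links, start, upstream=False):
    """Return the nodes reached from start, with or against the flow direction."""
    adjacency = {}
    for from_node, to_node in links:
        # Reverse every link when tracing against the direction of flow.
        a, b = (to_node, from_node) if upstream else (from_node, to_node)
        adjacency.setdefault(a, []).append(b)
    reached, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                reached.append(nxt)
                queue.append(nxt)
    return reached

# Hypothetical stream links, digitized in the direction of flow.
streams = [("s1", "j1"), ("s2", "j1"), ("j1", "j2"),
           ("s3", "j2"), ("j2", "reservoir")]

print(trace(streams, "s1"))                        # downstream of s1
print(trace(streams, "reservoir", upstream=True))  # everything feeding the reservoir
```

The upstream trace corresponds to finding the streams that contribute to a reservoir, and the downstream trace to finding what is affected below a given point.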
Travelling Salesman Problem
• A salesman may be required to visit a specific
set of clients in a day, for which he would like
to know the best (usually the quickest) route
that he can follow to finish his job.
• Similar problems are the collection of garbage and
the delivery of mail.
• Such problems are solved by analyzing the
order of stops and the paths between them.
• Solving such problems is a complex task. To
simplify it, the ordering of the stops can be
determined by first calculating the minimum path
between each stop and every other stop in the
list, based on the impedance met in the network.
• A trial and error method (also referred to as a
heuristic) is then applied to order the visits so
as to minimize the total impedance from the first
stop to the last.
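One simple heuristic for ordering the stops is the nearest-neighbour rule: from the current stop, always go to the closest stop not yet visited. The slides do not specify which heuristic is used, so this is only one possible choice; the impedance table below is hypothetical and stands in for the minimum-path impedances between stops.

```python
def order_stops(impedance, start):
    """Nearest-neighbour heuristic; impedance[a][b] is the minimum-path cost a -> b."""
    unvisited = set(impedance) - {start}
    route, total = [start], 0.0
    while unvisited:
        here = route[-1]
        # Pick the closest stop that has not been visited yet.
        nxt = min(unvisited, key=lambda stop: impedance[here][stop])
        total += impedance[here][nxt]
        route.append(nxt)
        unvisited.remove(nxt)
    return route, total

# Hypothetical minimum-path impedances between a depot and three clients.
impedance = {
    "depot": {"depot": 0, "a": 4, "b": 9, "c": 7},
    "a":     {"depot": 4, "a": 0, "b": 3, "c": 8},
    "b":     {"depot": 9, "a": 3, "b": 0, "c": 2},
    "c":     {"depot": 7, "a": 8, "b": 2, "c": 0},
}
route, total = order_stops(impedance, "depot")
print(route, total)
```

The rule is fast but does not guarantee the best possible order, which is why it is described as a heuristic rather than an exact solution.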
Location-Allocation Modeling
• An important application of network analysis is
the allocation of resources.
• This is done by modeling supply and demand:
goods, people, information, or services must move
through the network so that the supply matches
the demand.
• Allocation of resources is usually done by
allocating links in the network to the nearest
center of supply, taking into account impedance
values.
• The maximum catchment area of a particular
supply centre can also be determined on the
basis of the demand located along adjacent links
in the network.
• When limits are imposed on supply and demand,
some parts of the network may not be serviced
even though demand is present there.
• Such problems can be solved by reducing the
supply to some parts of the network or by
identifying the optimum location for a new centre
to meet the shortfall in supply relative to demand.
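Allocation under a supply limit can be sketched with a simple greedy rule: each demand link goes to the nearest centre that still has capacity. This is an illustration only, not an optimal location-allocation solver; the function name `allocate`, the centre capacities, the demands, and the impedances are all hypothetical.

```python
def allocate(demand, centres, impedance):
    """Assign each demand link to the nearest centre with spare capacity."""
    remaining = dict(centres)                     # capacity left at each centre
    assignment, unserved = {}, []
    for link, amount in demand.items():
        # Try centres from lowest to highest impedance for this link.
        for centre in sorted(remaining, key=lambda c: impedance[c][link]):
            if remaining[centre] >= amount:
                assignment[link] = centre
                remaining[centre] -= amount
                break
        else:
            unserved.append(link)                 # demand exists but no supply is left
    return assignment, unserved

demand = {"L1": 30, "L2": 40, "L3": 50, "L4": 20}      # demand along each link
centres = {"north": 70, "south": 60}                   # supply available at each centre
impedance = {
    "north": {"L1": 2, "L2": 3, "L3": 6, "L4": 9},
    "south": {"L1": 7, "L2": 5, "L3": 4, "L4": 1},
}
assignment, unserved = allocate(demand, centres, impedance)
print(assignment, unserved)
```

Here link L4 remains unserved although it has demand, which is the shortfall situation described above; adding a new centre or redistributing supply would be needed to cover it. The result of this greedy rule depends on the order in which links are processed.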
THANK
YOU
