Detecting Useful Features in Satellite Images
Using Convolutional Networks
One of the central tasks of map making is
extracting useful features from satellite
images. Today this still requires substantial
human involvement.
We approach this task as per-pixel
classification. In this project we attempt to
classify pixels into Road, Water and
Landscape using a convolutional network.
We then used the weights trained for
single-pixel classification to create a
network for labelling entire images.
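As a rough sketch of what that weight transfer implies for output shapes (the 64×64 patch size comes from the network description elsewhere on the poster; the function below is illustrative, not the authors' code): a classifier trained on 64×64 patches, applied densely across a larger image, yields one prediction per valid window.

```python
# Hypothetical shape arithmetic for dense ("sliding-window") application
# of a patch classifier: one label wherever the full patch fits.
def dense_output_shape(h, w, patch=64):
    return h - patch + 1, w - patch + 1

print(dense_output_shape(256, 256))  # (193, 193)
print(dense_output_shape(64, 64))    # (1, 1): a single patch
```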
“Landscape” is not well defined:
One central source of error is the
category Landscape. Due to limitations of the
API used to obtain the labelled images, very
different physical objects all map to
Landscape, including buildings,
forests, stadiums, parks and parking lots.
This issue can be addressed by switching to a
different API and splitting Landscape into
“Man Made” and “Natural”.
Adjacent pixels are not really independent:
The current structure of the network
classifies each pixel independently, resulting
in the irregular and discontinuous shapes we
see in the results. This issue can be alleviated
in two ways:
• Training a network on larger images,
where the loss layer is evaluated over a
window of pixels. This approach is limited
by GPU memory constraints.
• Training a separate post-processor layer
that aggregates predictions over adjacent
pixels.
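The second option above could start from something as simple as neighbourhood majority voting. A minimal sketch (hypothetical, not the poster's implementation): smooth a per-pixel label map by taking a majority vote over each pixel's 3×3 neighbourhood.

```python
from collections import Counter

# Hypothetical post-processing sketch: replace each pixel's label with
# the most common label in its 3x3 neighbourhood. A trained
# post-processor layer could learn a weighted version of this.
def majority_smooth(labels):
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for i in range(h):
        for j in range(w):
            votes = Counter(
                labels[ni][nj]
                for ni in range(max(0, i - 1), min(h, i + 2))
                for nj in range(max(0, j - 1), min(w, j + 2))
            )
            out[i][j] = votes.most_common(1)[0][0]
    return out

noisy = [
    ["Road", "Road", "Road"],
    ["Road", "Water", "Road"],  # isolated "Water" pixel
    ["Road", "Road", "Road"],
]
print(majority_smooth(noisy)[1][1])  # "Road"
```

This removes isolated single-pixel misclassifications but would also erase genuinely thin features (e.g. narrow roads), which is why a learned aggregator is the more principled choice.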
Task

[Figure: an input satellite image and the corresponding ideal feature extraction]
We train the following network. The
non-linearities are ReLU units. No
pooling is used, to simplify the transfer of
weights when labelling larger images. This
does not hurt performance but slows training.
Training

[Network diagram: 64×64×3 input → 62×62×6 → 60×60×12 → 58×58×24 → 56×56×48, followed by fully connected layers of 96 and 3 units]
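The spatial sizes above shrink by 2 per layer, which is consistent with unpadded 3×3 convolutions (an inference, not stated explicitly on the poster). A short sketch of the shape arithmetic:

```python
# Hypothetical reconstruction of the conv-stack shapes: each "valid"
# (unpadded) 3x3 convolution shrinks the spatial extent by 2.
def conv_out(size, kernel=3):
    return size - kernel + 1

channels = [3, 6, 12, 24, 48]  # input channels, then per-layer widths
size = 64
shapes = [(size, size, channels[0])]
for c in channels[1:]:
    size = conv_out(size)
    shapes.append((size, size, c))

print(shapes)
# [(64, 64, 3), (62, 62, 6), (60, 60, 12), (58, 58, 24), (56, 56, 48)]
```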
Performance

Confusion matrix (rows: true label, columns: predicted label):

              Landscape   Road   Water
Landscape           70%    28%      2%
Road                32%    65%      3%
Water                5%     2%     93%
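A small sketch deriving per-class recall and a macro average from the confusion matrix above (this assumes the published percentages are row-normalised and exact; the class balance of the test set is not stated, so an overall accuracy cannot be recovered):

```python
# Confusion matrix from the poster: rows are true labels, columns are
# predicted labels, entries are row percentages.
classes = ["Landscape", "Road", "Water"]
matrix = {
    "Landscape": {"Landscape": 70, "Road": 28, "Water": 2},
    "Road":      {"Landscape": 32, "Road": 65, "Water": 3},
    "Water":     {"Landscape": 5,  "Road": 2,  "Water": 93},
}

# Per-class recall is the diagonal of the row-normalised matrix.
recall = {c: matrix[c][c] / 100 for c in classes}

# Macro average: unweighted mean over classes (ignores class frequencies).
macro_recall = sum(recall.values()) / len(classes)
print(recall, round(macro_recall, 2))  # macro recall = 0.76
```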
[Figure: labelling results for Road, Water and Landscape]
Sources of Error
Flat-roofed building
classified as “Road”
Dark patch of forest
classified as “Water”
Additional source of error - noisy training labels: due to its
elevation and the image angle, satellite images show the Golden
Gate Bridge east of its true location. The network classifies this
image as “Road”, but its label is “Water”
CS231n
Convolutional Neural
Networks for Visual
Recognition
