
Hydroinformatics
• The synergetic use of modelling tools and ICT within a single methodological approach, dealing with the physical, social and economic aspects of sustainable water resources engineering.

[Figure: Hydroinformatics draws on hydrology and hydraulic engineering, environmental engineering, computer modeling, satellite imagery, radar, instrumentation, ANN, and management and ICT.]

Branches of Hydroinformatics
• Big Data Management (gathering, processing, transferring, archiving)
• Computational Hydraulics (classical numerical methods: FDM, FEM, BEM, …)
• Remote Sensing (RS) and Geographic Information Systems (GIS)
• Information communication (via the internet)
• Soft Computing / Computational Intelligence (see next slide)

Soft Computing
• Unlike hard computing schemes, which strive for exactness and full truth, soft computing techniques exploit the given tolerance for imprecision, partial truth, uncertainty and approximation in a particular problem.
• Inductive reasoning plays a larger role in soft computing than in hard computing. In effect, the role model for soft computing is the human mind.

Components of Soft Computing
• Machine Learning & Artificial Intelligence methods, e.g. Artificial Neural Networks (ANNs), Support Vector Machines (SVM), …
• Evolutionary & metaheuristic algorithms (nature-inspired methods), e.g. Genetic Algorithm (GA), Ant Colony Optimization
• Fuzzy Logic (FL)
• Hybrid methods, e.g. ANFIS, Genetic Programming

Data
[Diagram: data are either Discrete or Continuous; discrete attributes include Binary and Nominal, and binary attributes can be Symmetrical or Asymmetrical.]
Data mining
Knowledge tree:
1. Data Cleaning
2. Data Integration
3. Data Selection
4. Data Transformation
5. Data Mining
6. Pattern Evaluation
7. Knowledge Presentation

What is Data mining?
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information.

Data processing

What is Data Processing?
The collection and manipulation of items of data to produce meaningful information.

[Diagram: data processing divides into Pre-Processing and Post-Processing.]

Data processing continued

Data Preprocessing
• Why preprocess the data?
• Data cleaning
• Data integration and transformation
• Data reduction

Data processing continued

Why Data Preprocessing?
• Data in the real world is dirty
– Incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data
– Noisy: containing errors or outliers
– Inconsistent: containing discrepancies in codes or names
• No quality data, no quality mining results!
– Quality decisions must be based on quality data
– A data warehouse needs consistent integration of quality data

Data processing continued

Data Cleaning
Data cleaning tasks
– Fill in missing values
– Identify outliers and smooth out noisy data
– Correct inconsistent data

Data processing continued

Missing Data
• Data is not always available
– E.g., many tuples have no recorded value for several attributes, such as customer income in sales data
• Missing data may be due to
– Equipment malfunction
– Inconsistency with other recorded data (and hence deletion)
– Data not entered due to misunderstanding
– Certain data not being considered important at the time of entry
– History or changes of the data not being registered
• Missing data may need to be inferred.
Data processing continued

How to Handle Noisy Data?
• Binning method:
– First sort the data and partition them into (equi-depth) bins
– Then smooth by bin means, bin medians, bin boundaries, etc.
• Clustering
– Detect and remove outliers
• Combined computer and human inspection
– Detect suspicious values and have a human check them
• Regression
– Smooth by fitting the data to regression functions
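As a sketch of the binning method, the following MATLAB fragment smooths a sorted vector by equi-depth bin means (the data vector and bin count are illustrative, not from the slides):

% Smooth a data vector by equi-depth (equal-frequency) bin means.
x = [4 8 9 15 21 21 24 25 26 28 29 34];   % example data
nbins = 3;                                 % number of equi-depth bins
xs = sort(x);                              % the method sorts first
n = numel(xs);
binsize = floor(n / nbins);                % points per bin (assumes n divisible by nbins)
smoothed = zeros(size(xs));
for k = 1:nbins
    idx = (k-1)*binsize + 1 : k*binsize;   % indices of the k-th bin
    smoothed(idx) = mean(xs(idx));         % replace each value by its bin mean
end
disp(smoothed)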
Data processing continued

Data Integration
• Data integration:
– Combines data from multiple sources into a coherent store
• Schema integration:
– Integrate metadata from different sources
• Detecting and resolving data value conflicts:
– For the same real world entity, attribute values from different
sources are different
– Possible reasons: different representations, different scales, e.g.,
metric vs. British units

Data processing continued

Data Transformation
• Smoothing: remove noise from the data
• Aggregation: summarization, data cube construction
• Generalization: concept hierarchy climbing
• Normalization: scale values to fall within a small, specified range
– Min-max normalization: $X_{normal} = \dfrac{X - X_{min}}{X_{max} - X_{min}}\,(a - b) + b$
– Z-score normalization: $z = \dfrac{X - \bar{X}}{\sigma}$
– Normalization by decimal scaling
• When the data contain zeros or negative values, the series can be shifted before transforming: $X_T = T(x_t + 1)$
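A minimal MATLAB sketch of the two scalings (illustrative data; note this version maps X_min to a and X_max to b):

% Min-max normalization to [a, b] and z-score standardization.
x = [2.1 4.7 3.3 9.8 5.5];                              % example data
a = 0; b = 1;                                            % target range
xmm = (x - min(x)) ./ (max(x) - min(x)) * (b - a) + a;   % min-max scaling
xz  = (x - mean(x)) ./ std(x);                           % z-score: zero mean, unit std
disp(xmm); disp(xz)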
Data processing continued

Data Reduction Strategies
• A warehouse may store terabytes of data: complex data analysis/mining may take a very long time to run on the complete data set
• Data reduction
– Obtains a reduced representation of the data set that is much smaller in volume yet produces the same (or almost the same) analytical results
• Data reduction strategies
– Data cube aggregation
– Dimensionality reduction
– Numerosity reduction
– Discretization and concept hierarchy generation
Probability and Statistics

Recommended references:
• Benjamin, J.R. and Cornell, C.A. (1970). Probability, Statistics, and Decisions for Civil Engineers.
• Kottegoda, N.T. Applied Statistics for Civil and Environmental Engineers.
Probability and Statistics

Why should we learn statistics and probability?
From satellites continuously orbiting the globe to common social network sites, data are being collected everywhere and all the time. Knowledge of statistics provides you with the necessary tools to extract information intelligently from this sea of data.

What is a MODEL, and what are its types?
[Diagram: a model can be Physical or Formal (Mathematical). Formal models are either Deterministic or Stochastic (probability and statistics). Deterministic models range from process-based (white box) through conceptual to black box, e.g. ANN.]
Probability and Statistics continued

What is probability?
• The quality or state of being probable; the extent to which something is likely to happen or be the case, measured by the ratio of favourable cases to the whole number of cases possible:
$$P(A) = \frac{\text{Number of successful outcomes}}{\text{Number of possible outcomes}}$$
Probability and Statistics continued

Statistical Parameters
• Arithmetic Mean: the average of a set of numerical values, calculated by adding them together and dividing by the number of terms in the set.
• Weighted Arithmetic Mean: similar to the ordinary arithmetic mean, except that instead of each data point contributing equally to the final average, some data points contribute more than others.
• Median: the value separating the higher half of a data sample from the lower half.
• Mode: the number that appears most often in a set of numbers.
• Variance: the expectation of the squared deviation of a random variable from its mean.
• Standard Deviation: a measure used to quantify the amount of variation or dispersion of a set of data values.
• Coefficient of Variation: a standardized measure of dispersion of a probability distribution or frequency distribution.
• Skewness: a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean.
• Covariance: a measure of the joint variability of two random variables.
• Correlation Coefficient: a number that quantifies the degree of statistical relationship (correlation and dependence) between two or more values.
Probability and Statistics continued

• Arithmetic Mean: $\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$

• Weighted Arithmetic Mean: $\bar{x}_w = \dfrac{\sum_i w_i x_i}{\sum_i w_i}$

• Covariance: $\operatorname{cov}(x, y) = \sigma_{x,y} = \dfrac{1}{n}\sum_i (x_i - \bar{x})(y_i - \bar{y})$
Probability and Statistics continued

• Variance: $\sigma^2 = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$

• Standard Deviation: $\sigma = \sqrt{\sigma^2}$

(Hey! Remember the general moment formula: $M_r^a = \int_{-\infty}^{\infty} (x - a)^r f(x)\,dx$)

• Coefficient of Variation: $CV = \dfrac{\sigma}{\bar{x}}$

• Skewness (third central moment): $\dfrac{1}{n}\sum_i (x_i - \bar{x})^3$

• Correlation Coefficient: $\rho = \dfrac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$
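These parameters map directly onto built-in MATLAB functions (a minimal sketch; skewness requires the Statistics and Machine Learning Toolbox, and the sample values reuse the monthly discharges of a later example):

% Descriptive statistics of a sample in MATLAB.
x  = [1.5 4 2.5 1.5 2.5 4 1.5 2 2.5 1.5 2.5 1.5];  % monthly discharges (m^3/s)
m  = mean(x);         % arithmetic mean
v  = var(x, 1);       % population variance (1/n normalization)
s  = std(x, 1);       % population standard deviation
cv = s / m;           % coefficient of variation
sk = skewness(x);     % skewness
fprintf('mean=%.3f var=%.3f std=%.3f CV=%.3f skew=%.3f\n', m, v, s, cv, sk)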
Probability and Statistics continued

Probability density function
• The PDF is a function whose value at any given sample (point) in the sample space (the set of possible values taken by the random variable) can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample.

• Some properties of a probability density function:
1. $f(x) \geq 0$ for any $x$
2. The total area under $f(x)$ is 1: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
Probability and Statistics continued

Cumulative Distribution Function
• A function whose value is the probability that a corresponding continuous random variable has a value less than or equal to the argument of the function:
$$F(x) = P(X \leq x)$$
• In other words, the cumulative distribution function $F(x)$ is given by the shaded area under $f(x)$ to the left of $x$, and
$$P(a < X \leq b) = F(b) - F(a) = \int_a^b f(u)\,du$$
Probability and Statistics continued

Moments in statistics
In our field:
• The first moment (r = 1) is commonly used to determine the centroid of an area
• The second central moment (r = 2) is the variance
• The third central moment (r = 3) gives the skewness
• The fourth central moment (r = 4) gives the kurtosis

$$M_r^a = \frac{\int_{-\infty}^{\infty} (x - a)^r f(x)\,dx}{\int_{-\infty}^{\infty} f(x)\,dx}$$

When $f(x)$ is a normalized density ($\int f(x)\,dx = 1$), this reduces to
$$M_r^a = \int_{-\infty}^{\infty} (x - a)^r f(x)\,dx$$
Probability and Statistics continued

Example: monthly discharges

Month:     1    2    3    4    5    6    7    8    9   10   11   12
Q (m³/s): 1.5  4.0  2.5  1.5  2.5  4.0  1.5  2.0  2.5  1.5  2.5  1.5

$$\bar{x} = \frac{\sum x}{n} = \frac{27.5}{12} \approx 2.29$$

Equivalently, using relative frequencies $f(x)$:
$$\bar{x} = \sum x f(x) = \tfrac{5}{12}(1.5) + \tfrac{1}{12}(2) + \tfrac{4}{12}(2.5) + \tfrac{2}{12}(4) \approx 2.29$$
Probability and Statistics continued

Correlation coefficient
• Let X and Y be jointly distributed random variables. The correlation between X and Y is
$$\rho = \operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}$$
• It measures the relative strength of the linear relationship between the two variables.
• It is unit-less and ranges between -1 and 1:
– The closer to -1, the stronger the negative linear relationship
– The closer to 1, the stronger the positive linear relationship
– The closer to 0, the weaker any linear relationship
Probability and Statistics continued

Scatter plots of data with various correlation coefficients
[Figure: scatter plots of computed discharge $\hat{Q}$ against measured discharge $Q$, illustrating correlation coefficients from strongly negative through zero to strongly positive, plus one non-linear pattern.]

When the relationship is not linear (as in the last plot), the correlation coefficient is not informative; instead we use the determination coefficient
$$DC = 1 - \frac{\sum_i (Q_i - \hat{Q}_i)^2}{\sum_i (Q_i - \bar{Q})^2}$$
where $\hat{Q}$ is the computed and $Q$ the measured discharge.
Probability and Statistics continued

• Normalization: in database theory, normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity. A different approach for probability distributions is quantile normalization, where the quantiles of the different measures are brought into alignment. In data preprocessing, normalization rescales values into a specified range $[a, b]$:
$$X_{normal} = \frac{X - X_{min}}{X_{max} - X_{min}}\,(a - b) + b$$

• Standardization: a standardized variable (sometimes called a z-score or a standard score) is a variable that has been rescaled to have a mean of zero and a standard deviation of one:
$$z = \frac{X - \bar{X}}{\sigma}$$
Densities associated with multiple variables

1) $\sum_{i=1}^{n}\sum_{j=1}^{m} P(x_i, y_j) = 1$
2) $F(x_k, y_l) = \sum_{i=1}^{k}\sum_{j=1}^{l} P(x_i, y_j)$
3) $P(x_i) = \sum_{j=1}^{m} P(x_i, y_j)$
4) $P(y_j) = \sum_{i=1}^{n} P(x_i, y_j)$
5) $F(x_k) = \sum_{i=1}^{k}\sum_{j=1}^{m} P(x_i, y_j)$
6) $F(y_l) = \sum_{i=1}^{n}\sum_{j=1}^{l} P(x_i, y_j)$
7) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
8) $F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,du\,dv$
9) $f(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$
10) $F(x) = \int_{-\infty}^{x} f(u)\,du$
11) $\mu_{(r,s)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^r y^s f(x, y)\,dy\,dx$
12) $\sigma_{(x,y)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_x)(y - \mu_y) f(x, y)\,dy\,dx = \operatorname{cov}(x, y)$
13) $\rho = \dfrac{\sigma_{xy}}{\sigma_x \sigma_y}$
14) $DC = 1 - \dfrac{\sum (Q_i - \hat{Q}_i)^2}{\sum (Q_i - \bar{Q}_i)^2}$
Probability and Statistics continued

Example (joint distribution of discrete X and Y):
$$P(x = 1.5,\ y = 10) = 0.1$$
$$P(x = 0.5) = 0.05 + 0.05 = 0.1$$
$$P(y = 15) = 0.05 + 0.15 + 0.05 = 0.25$$
$$\bar{x} = \sum x f(x) = (0.5)(0.1) + (1)(0.35) + (1.5)(0.4) + (2)(0.15) = 1.3$$
$$\bar{y} = \sum y f(y) = (5)(0.2) + (10)(0.35) + (15)(0.25) + (20)(0.2) = 12.25$$
$$S_x^2 = \sum (x - \bar{x})^2 f(x) = (0.5 - 1.3)^2(0.1) + (1 - 1.3)^2(0.35) + (1.5 - 1.3)^2(0.4) + (2 - 1.3)^2(0.15) = 0.185$$
$$S_y^2 = \sum (y - \bar{y})^2 f(y) = (5 - 12.25)^2(0.2) + (10 - 12.25)^2(0.35) + (15 - 12.25)^2(0.25) + (20 - 12.25)^2(0.2) = 26.19$$
$$\sigma_{xy} = \sum (x - \bar{x})(y - \bar{y}) f(x, y) = (0.5 - 1.3)(5 - 12.25)(0.05) + (0.5 - 1.3)(10 - 12.25)(0.05) + (1 - 1.3)(5 - 12.25)(0.1) + (1 - 1.3)(10 - 12.25)(0.2) + \ldots = 1.45$$
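This bookkeeping is easy to script. In the MATLAB sketch below, the joint matrix P is a reconstruction chosen to be consistent with the marginals, the quoted entries and the covariance above (the original table image was lost), so treat it as illustrative:

% Marginals, means, variances and covariance of a discrete joint distribution.
xv = [0.5 1 1.5 2];                 % values of X
yv = [5 10 15 20];                  % values of Y
P  = [0.05 0.05 0    0   ;          % reconstructed joint probabilities P(x_i, y_j)
      0.10 0.20 0.05 0   ;          % rows: x-values, columns: y-values
      0.05 0.10 0.15 0.10;
      0    0    0.05 0.10];         % entries sum to 1
fx = sum(P, 2)';                    % marginal f(x) -> [0.1 0.35 0.4 0.15]
fy = sum(P, 1);                     % marginal f(y) -> [0.2 0.35 0.25 0.2]
mx = sum(xv .* fx);                 % mean of X  -> 1.3
my = sum(yv .* fy);                 % mean of Y  -> 12.25
vx = sum((xv - mx).^2 .* fx);       % variance of X -> 0.185
vy = sum((yv - my).^2 .* fy);       % variance of Y -> 26.19
sxy = sum(sum((xv' - mx) .* (yv - my) .* P));  % covariance -> 1.45
rho = sxy / sqrt(vx * vy)           % correlation coefficient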
Probability and Statistics continued

Linear regression
• In regression, one variable is considered the independent (predictor) variable X and the other the dependent (outcome) variable Y.
• Estimating the intercept and slope by least squares, for $\hat{y} = \beta x + \alpha$:
$$\beta = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad \alpha = \frac{1}{n}\left(\sum y_i - \beta \sum x_i\right)$$
• For two predictors, $y = \alpha + \beta_1 x_1 + \beta_2 x_2$, the normal equations are:
$$\sum y = n\alpha + \beta_1 \sum x_1 + \beta_2 \sum x_2$$
$$\sum y x_1 = \alpha \sum x_1 + \beta_1 \sum x_1^2 + \beta_2 \sum x_1 x_2$$
$$\sum y x_2 = \alpha \sum x_2 + \beta_1 \sum x_1 x_2 + \beta_2 \sum x_2^2$$
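A minimal MATLAB sketch of least-squares estimation (made-up data); the backslash operator solves the normal equations directly, and regress(Y, X) from the Statistics Toolbox returns the same coefficients:

% Fit y = alpha + beta1*x1 + beta2*x2 by least squares.
x1 = [1 2 3 4 5]';  x2 = [2 1 4 3 6]';   % example predictors
y  = [3.1 3.9 7.2 7.8 11.5]';            % example response
X  = [ones(size(x1)) x1 x2];             % design matrix with an intercept column
coef = X \ y;                            % [alpha; beta1; beta2]
yhat = X * coef;                         % fitted values
DC = 1 - sum((y - yhat).^2) / sum((y - mean(y)).^2);   % determination coefficient
fprintf('alpha=%.3f beta1=%.3f beta2=%.3f DC=%.3f\n', coef(1), coef(2), coef(3), DC)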
Probability and Statistics continued

Example
• 1. According to studies on different basins, the mean outflow discharge of each sub-basin (Q) is related to its area (A) and its number of rainy days per year (N):
$$10Q = \alpha A^{\beta_1} N^{\beta_2}$$
Taking base-10 logarithms (so $\log 10 = 1$) yields a linear equation:
$$1 + \log Q = \log\alpha + \beta_1 \log A + \beta_2 \log N$$
If $x_1 = 1 + \log Q$, $x_2 = \log A$, $x_3 = \log N$ and $M = \log\alpha$, this becomes $x_1 = M + \beta_1 x_2 + \beta_2 x_3$. Determine $M$, $\beta_1$ and $\beta_2$ (and the corresponding correlation coefficient) from the sub-basin data given in Table 1, using linear regression.
• 2. The maximum instantaneous discharge of a river from 1926 to 1951 is given in Table 2. First, determine the average, variance, skewness, PDF and CDF of this time series. Second, fit exponential, normal and Pearson probability distributions, find which one fits best, and then calculate the flood discharge for the next 5, 10 and 50 years.
Some common MATLAB commands:
• Operators and constants: + - * / \ ^, pi (% begins a comment)
• help, doc (documentation for a function)
• format long / format short
• factorial(x), sqrt(x), exp(x)
• sin, cos, tan, cot, asin, acos, atan, acot
• X = [1 2 3; 4 5 6], X = 1:10
• X = zeros(n,m), X = ones(n,m), X = eye(n), X = linspace(a,b,n)
• X = rand(n,m), X = randn(n,m), X = normrnd(M,S,m,n)
• X = R' (transpose), Y = reshape(X,n,m), sort(X)
• size(X), length(X), numel(X), who
• D = det(X), X = A.*B (element-wise product), sum(X), max(X), min(X)
• mean(X), geomean(X), median(X), mode(X)
• var(X), std(X), skewness(X), kurtosis(X), corrcoef(X,Y), regress(Y,X)
Artificial Neural Networks (ANN)

Contents
• Introduction
• History of Artificial Neural Networks (ANN)
• Overview of ANN
• Applications of ANN
Introduction
• The Artificial Neural Network (ANN), or Neural Network (NN), provides an exciting alternative method for solving a variety of problems in different fields of science and engineering.
• This presentation covers:
– The whole idea behind ANN
– The origin of ANN
– The mathematical concepts of ANN
– An outline of some applications of ANN in water resources engineering
Origin of ANN
• The human brain has many incredible characteristics, such as massive parallelism, distributed representation and computation, learning ability, generalization ability and adaptivity, which seem simple but are really complicated.
• It has been a long dream of computer scientists to build a computer that can solve complex perceptual problems this fast.
• ANN models were the result of such efforts to apply the same method the human brain uses.
What are Neural Networks?
In machine learning and cognitive science:
• Models inspired by biological neural networks
• A way of estimating unknown functions that depend on a large number of inputs
What is a model?

• Mathematical model: e.g., the Bernoulli equation, the continuity equation
• Physical model: e.g., a surcharge modeled in the laboratory
Models
• Mathematical
– Distributed (White Box): based on physics
– Conceptual
– Lumped (Black Box): e.g., linear regression
• Physical
As a mathematical model, an ANN is:
• A non-linear regression
• A black-box model
Machine Learning
• An AI technique with the ability to learn implicitly
• Changes when exposed to a new dataset
• Searches through data to find a pattern
Basis for learning in the brain
• Neural networks exhibit plasticity:
– Long-term changes in the strength of their connections in response to the stimulation pattern
– The capability of forming new connections with other neurons
Learning
• What does it mean?
• What is its source?
• How does this process happen?
Two aspects are involved: the learning paradigm and the learning algorithm.
Learning Paradigm
• Supervised Learning
– The correct answer is provided to the network for every input pattern.
– Weights are adjusted with regard to the correct answer.
– Example: the Feed-Forward Neural Network (FFNN)
• Unsupervised Learning
– Does not need the correct output.
– The system itself recognizes correlations and organizes patterns into categories accordingly.
– Example: clustering
Learning Algorithm
• Error correction rules
• Boltzmann learning
• Hebbian learning
• Competitive learning
Structure of a neuron
[Figure: a biological neuron with labels for the data input (the dendrites), the data processor (the cell body), the part transferring the input signal to the output (the axon), and the joint points between neurons (the synapses).]
History of Artificial Neural Networks (ANN)
1943: McCulloch and Pitts, a simple artificial model of the neuron (Threshold Logic)
[Figure: inputs x1, x2, …, xn with weights W1, W2, …, Wn feed a summing neuron with threshold b, producing output Y.]
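In equation form, the weighted inputs are summed and compared with the threshold (a sketch of the threshold-logic rule; the boundary convention varies):
$$Y = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} W_i x_i \geq b \\ 0 & \text{otherwise} \end{cases}$$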
History of Artificial Neural Networks (ANN)
1943: McCulloch and Pitts, simple artificial model of the neuron (Threshold Logic)
1958: Rosenblatt, the Perceptron algorithm
1975: Werbos, Back-Propagation
Connection Patterns
• Feed-Forward
• Recurrent

Overview of ANN
[Figures: example feed-forward and recurrent network topologies.]
Feed-Forward: the Multi-Layered Perceptron (MLP)
[Figures: MLP architecture.]
Step-by-step construction of an MLP (FFNN)
[Figures: the network is built up layer by layer.]
Activation functions

Function             Formula
Hard Limiter         $f(x) = 0$ for $x < 0$; $f(x) = 1$ for $x \geq 0$
Sigmoidal            $f(x) = \dfrac{1}{1 + e^{-x}}$
Hyperbolic Tangent   $f(x) = \dfrac{e^{2x} - 1}{e^{2x} + 1}$
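All three are one-liners in MATLAB (an illustrative sketch; the hyperbolic-tangent form equals MATLAB's built-in tanh):

% Common ANN activation functions as anonymous functions.
hardlimiter = @(x) double(x >= 0);                    % 0 for x < 0, 1 for x >= 0
sigmoid     = @(x) 1 ./ (1 + exp(-x));                % logistic sigmoid
tanhact     = @(x) (exp(2*x) - 1) ./ (exp(2*x) + 1);  % equals tanh(x)
x = -5:0.1:5;
plot(x, hardlimiter(x), x, sigmoid(x), x, tanhact(x))
legend('hard limiter', 'sigmoidal', 'hyperbolic tangent')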
What must be done to build a model?
• Step 1: Which parameters are related to our target? E.g., in modelling runoff we might consider precipitation, temperature and evaporation.
• Step 2: Data gathering. We need to consider both quality and quantity.
• Step 3: Data preprocessing, with processes like normalizing the data: $\dfrac{x_i - x_{min}}{x_{max} - x_{min}}$
• Step 4: Construction, training and verifying the model. The model must have an acceptable DC in both calibration and verification; a sensitivity analysis can be done at this stage.
More notes worth mentioning on "Data gathering"
• Quantity: the more data, the better, but it must be economical! Most of the time we don't gather the data ourselves, but if we do, there must be a trade-off between quantity and price.
• Quality: the data must be heterogeneous; the sample must cover the whole data domain.
Overfitting (overlearning)
What does overlearning mean?
• The model fits the training data very precisely and loses its general nature.
• The model has a high DC at the training stage but a pretty low DC in verification.

In what conditions does it happen?
• Low diversity of the data
• An unsuitable number of hidden-layer neurons or training epochs
We need to determine: the weights and the biases.

[Figure: training combines Gradient Descent with Back-Propagation.]
Steps to modeling an ANN
• Defining the input and output data: they must be defined as matrices.
• Defining the percentages of training, validation and test data: usually 70% for training, 15% for validation and 15% for testing.
• Defining the initial number of hidden-layer neurons: it should be a little more than the number of input-layer neurons (an "egg-shaped" network), and it is chosen by trial and error.
• Choosing the training algorithm: mostly we use Levenberg-Marquardt, which contains GD (gradient descent), AL, and BP (back-propagation).

A minimal code sketch of these steps follows.
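A minimal MATLAB sketch of these four steps, assuming the Neural Network (Deep Learning) Toolbox and illustrative input/output matrices in which columns are samples:

% Build, train and verify a small MLP.
X = rand(3, 100);                    % inputs: 3 parameters x 100 samples (toy data)
Y = sum(X).^2;                       % outputs: 1 x 100 (toy target)
net = feedforwardnet(5);             % one hidden layer; neuron count set by trial and error
net.divideParam.trainRatio = 0.70;   % 70% training
net.divideParam.valRatio   = 0.15;   % 15% validation
net.divideParam.testRatio  = 0.15;   % 15% test
net.trainFcn = 'trainlm';            % Levenberg-Marquardt training algorithm
[net, tr] = train(net, X, Y);        % train the network
Yhat = net(X);                       % simulate
DC = 1 - sum((Y - Yhat).^2) / sum((Y - mean(Y)).^2)   % determination coefficient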
Advantages
• Can implicitly detect complex non-linear relationships between independent and dependent variables
• Can detect all possible interactions between predictor variables
• Can be developed using different training algorithms

Disadvantages
• Neural networks are black boxes and have a limited ability to explicitly identify possible causal relationships
• Require large data sets
• Prone to overfitting
Using ANN in water resources engineering
[Figures: application examples.]