IEEE_Conference_Template (3)
Abstract—Least squares (LS) channel estimation is a common technique used in many communication systems. Due to noise enhancement during the estimation process, LS channel estimation tends to perform poorly in low signal-to-noise ratio (SNR) regions. The minimum mean square error (MMSE) channel estimation technique has higher complexity than LS, but it achieves more accurate estimates by taking the noise into consideration. The purpose of this paper is to improve the LS channel estimation scheme by correcting its error using deep learning (DL) methods. The proposed DL-based channel estimation schemes outperform both the LS and MMSE channel estimation schemes in terms of estimation accuracy, while also being less complex than the MMSE channel estimation method.

Index Terms—Channel Estimation, LS, MMSE, Deep Learning

I. INTRODUCTION

Channel estimation is the process of determining the characteristics of a communication channel that is used to transmit information from a sender to a receiver. It involves analyzing the received signal to estimate the channel's parameters, such as attenuation, phase shift, and delay spread, which can then be used to compensate for channel distortions. Channel estimation is important because it allows the receiver to compensate for the effects of the channel and optimize system performance, which leads to better SNR and improved error rates. There are various techniques for channel estimation, including preamble-based estimation, LS estimation, MMSE estimation, and deep learning-based estimation.

In general, the MMSE channel estimation method gives better performance than preamble-based LS estimation, particularly in the presence of noise and other unknown effects. This is because MMSE channel estimation takes the statistical properties of both the channel and the noise into consideration, and so provides a more accurate estimate of the true channel response. In comparison, LS estimation assumes that the channel response is deterministic and ignores the presence of noise. As a result, LS estimation can be less accurate in real-world scenarios, particularly when the SNR is low or the channel response is time-varying. Ultimately, the choice of channel estimation technique depends on the specific application requirements and the available resources. While MMSE channel estimation can provide better performance in certain scenarios, LS estimation may be more practical in others, particularly when computational complexity is a concern.

Deep learning (DL) has gained significant attention in recent years due to its impressive performance in a wide range of applications, such as speech recognition, natural language processing, computer vision, and robotics. In addition, DL algorithms can achieve high accuracy with low computational complexity at inference time, making them well suited for physical layer applications in communications, such as channel estimation.

In this paper, we propose DL-based channel estimation schemes that achieve better performance than accurate MMSE with less complexity by using neural networks to correct LS
channel estimation errors.

II. SYSTEM MODEL

This section introduces the system model used in our research and then discusses orthogonal frequency division multiplexing (OFDM), the OFDM transmitter and receiver, and the LS and MMSE channel estimation techniques.

A. Orthogonal Frequency Division Multiplexing (OFDM)

OFDM is a data transmission technique in which a large data stream is divided across several narrowband subchannels instead of being sent over a single wideband channel. Rather than transmitting the whole data stream serially, OFDM transmits the data in parallel over these subchannels. The subcarriers are orthogonal to each other: the integral of the product of two distinct subcarriers over a given time interval is zero. This property makes it easy to transmit and receive data on different subcarriers.

In OFDM systems, preamble-based channel estimation is a common technique used to estimate the channel response. The preamble is a known sequence of symbols that is transmitted before the data symbols. The receiver estimates the channel response by performing a correlation operation between the received preamble and the known preamble. The estimated channel response can then be used to equalize the received data symbols and compensate for the channel distortion.

B. OFDM Transmitter and Receiver

Consider a frame that consists of $J$ OFDM symbols. With $N$ subcarriers, data symbols $s_j$, and the $N$-point DFT matrix $F_N$, the $j$th transmitted OFDM symbol is given by

    $x_j = \frac{1}{N} F_N^H s_j \in \mathbb{C}^{N \times 1}$,    (1)

Using a cyclic prefix (CP) of length $N_{cp}$, the $j$th received OFDM symbol is given by

    $y_j = H_j x_j + z_j \in \mathbb{C}^{N \times 1}$,    (2)

where $z_j \in \mathbb{C}^{N \times 1}$ is the additive Gaussian noise with zero mean and variance $\sigma_0$, and $H_j \in \mathbb{C}^{N \times N}$ is the circulant channel matrix generated from the channel impulse response $h_j \in \mathbb{C}^{N \times 1}$.

Using the discrete Fourier transform (DFT), the frequency domain channel gain $\tilde{H}[n, j]$ is obtained from the time domain channel impulse response $h_j[n]$.

In the frequency domain, the received signal $Y[n, j]$ is given by

    $Y[n, j] = \tilde{H}[n, j] S[n, j] + Z[n, j]$,    (3)

where $S[n, j]$ is the transmitted data symbol on the $n$th subcarrier of the $j$th OFDM symbol, and $Z[n, j]$ is the noise sample with variance $N \sigma_0$.

From (3) we can observe that the received signal on each subcarrier is affected by the corresponding frequency domain channel gain, as well as the transmitted data symbol and the noise. The frequency domain channel gain $\tilde{H}[n, j]$ is given by

    $\tilde{H}[n, j] = (F_N h_j)[n]$    (4)

A preamble of one or more OFDM symbols at the beginning of the frame can be used to estimate the channel. Let $s_p$ be the data symbols transmitted by one preamble OFDM symbol. The received preamble can be expressed as

    $Y[n, p] = \tilde{H}[n, p] S[n, p] + Z[n, p]$,    (5)

Hence, the channel gain can be estimated from (5), which in vector form is given as

    $\tilde{y}_p = X_p \tilde{h}_p + \tilde{z}_p \in \mathbb{C}^{|N_{on}| \times 1}$,    (6)

The received signal $\tilde{y}_p$ is modeled as the product of the transmitted signal $X_p$ and the channel response $\tilde{h}_p$, with the addition of noise $\tilde{z}_p$. The noise $\tilde{z}_p$ is a complex-valued vector of length $|N_{on}|$ that represents the noise and interference in the received signal.

Here $N_{on}$ is the set of allocated subcarriers, with $|N_{on}|$ active subcarriers, and $\tilde{y}_p = Y[N_{on}, p]$. $X_p = \mathrm{diag}\{s_p[N_{on}]\}$ is a diagonal matrix of size $|N_{on}| \times |N_{on}|$ whose diagonal entries are the transmitted data symbols $s_p$ on the active subcarriers, and $\tilde{h}_p = \tilde{H}[N_{on}, p]$ is the channel's frequency domain response at the preamble and allocated subcarriers, which is subject to estimation.

The goal of channel estimation is to estimate the channel response $\tilde{h}_p$ so that it can be used to compensate for the distortion introduced by the channel. This allows for more accurate decoding of the transmitted data symbols and improved system performance.

C. LS Channel Estimation Scheme

Considering that $X_p$ is invertible, the LS channel estimation using (6) can be simply expressed as

    $\hat{\tilde{h}}_{p,LS} = X_p^{-1} \tilde{y}_p$    (7)

Assuming that the channel remains static during the transmission of $N_p$ preambles, averaging the estimates obtained from these preambles improves the accuracy of the channel estimation by reducing the impact of noise on the estimated channel. The improved LS estimate is obtained by averaging over $N_p$ preambles as follows:

    $\sum_{q=1}^{N_p} \tilde{y}_q = \sum_{q=1}^{N_p} X_p \tilde{h}_p + \sum_{q=1}^{N_p} \tilde{z}_q \in \mathbb{C}^{|N_{on}| \times 1}$    (8)

assuming that the same preamble sequence $S[n, q]$ is used for all $N_p$ preambles, i.e., $S[n, q] = S[n, p]$, $q = 1, \ldots, N_p$. As a result, the channel gain at the $n$th subcarrier can be written as

    $\hat{\tilde{H}}_{LS}[n, p] = \frac{\sum_{q=1}^{N_p} Y[n, q]}{N_p S[n, p]}$.    (9)
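The averaged LS estimate in (9) can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names, the QPSK preamble, the random static channel, and the noise level are all assumptions chosen for illustration. Only the estimator itself follows (7) and (9).

```python
import cmath
import random


def ls_estimate(received, preamble):
    """Averaged LS estimate per (9): sum_q Y[n, q] / (Np * S[n, p]).

    received: list of Np lists, each holding Y[n, q] on the active subcarriers.
    preamble: known preamble symbols S[n, p], assumed non-zero.
    """
    num_preambles = len(received)
    estimate = []
    for n, symbol in enumerate(preamble):
        total = sum(received[q][n] for q in range(num_preambles))
        estimate.append(total / (num_preambles * symbol))
    return estimate


def simulate(num_subcarriers=52, num_preambles=4, noise_std=0.1, seed=0):
    """Toy setup (assumed): static random channel, QPSK preamble, Gaussian noise."""
    rng = random.Random(seed)
    channel = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(num_subcarriers)]
    qpsk = [cmath.exp(1j * cmath.pi * (2 * k + 1) / 4) for k in range(4)]
    preamble = [rng.choice(qpsk) for _ in range(num_subcarriers)]
    received = [
        [h * s + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
         for h, s in zip(channel, preamble)]
        for _ in range(num_preambles)
    ]
    return channel, preamble, received


def mean_squared_error(estimate, truth):
    """Average squared magnitude of the estimation error."""
    return sum(abs(e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


channel, preamble, received = simulate()
single = ls_estimate(received[:1], preamble)  # one preamble, as in (7)
averaged = ls_estimate(received, preamble)    # Np = 4 preambles, as in (9)
print(mean_squared_error(single, channel), mean_squared_error(averaged, channel))
```

With a unit-energy preamble, the error of the averaged estimate is expected to shrink roughly in proportion to the number of preambles, in line with the $1/N_p$ factor in (9).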
In (9), $Y[n, q]$ is the received signal at the $n$th subcarrier of the $q$th preamble.

The estimation error variance is expressed as

    $E\left[ \left| \hat{\tilde{H}}_{LS}[n, p] - \tilde{H}[n, p] \right|^2 \right] = \frac{N_p N \sigma_0}{N_p^2 |S[n, p]|^2} = \frac{N \sigma_0}{N_p E_p}$    (10)

where $E_p = |S[n, p]|^2$ represents the power per preamble symbol. The accuracy of channel estimation can therefore be improved by averaging over multiple preambles and by increasing the power per preamble symbol.

LS is an efficient method for channel estimation that does not require any prior knowledge of the channel or noise statistics. Conversely, it cannot benefit from such knowledge when it is available. The LS estimator requires only a single multiplication per subcarrier, making it computationally efficient.

D. MMSE Channel Estimation Scheme

The MMSE channel estimation [10] can be performed using

    $W_{MMSE}^H = R_{h_p} X_p^H \left( X_p R_{h_p} X_p^H + R_{\tilde{z}_p} \right)^{-1}$
    $\phantom{W_{MMSE}^H} = R_{h_p} \left( R_{h_p} + X_p^{-1} R_{\tilde{z}_p} X_p^{-H} \right)^{-1} X_p^{-1}$    (11)

Here, $R_{h_p} = E[\tilde{h}_p \tilde{h}_p^H]$ and $R_{\tilde{z}_p} = E[\tilde{z}_p \tilde{z}_p^H] \in \mathbb{C}^{|N_{on}| \times |N_{on}|}$ are the channel and the noise autocorrelation matrices, respectively.

For an AWGN channel, where $R_{\tilde{z}_p} = N \sigma_0 I$, and preamble symbols that are uncorrelated and have equal energy, i.e., $X_p^H X_p = E_p I$, we can simplify the MMSE channel estimation to:

    $W_{MMSE}^H = R_{h_p} \left( R_{h_p} + \frac{N \sigma_0}{E_p} I \right)^{-1} X_p^{-1}$    (12)

Here, $E_p$ is the energy per preamble symbol, $I$ is the identity matrix, and $|N_{on}|$ is the number of subcarriers in the OFDM symbol used for channel estimation.

Thus, the MMSE channel estimation is given by

    $\hat{\tilde{h}}_{p,MMSE} = R_{h_p} \left( R_{h_p} + \frac{N \sigma_0}{E_p} I \right)^{-1} X_p^{-1} \tilde{y}_p$    (13)

When using $N_p$ preambles, the effective power changes to $N_p E_p$. It should be noted that the MMSE can be viewed as a correction to the LS estimation $\hat{\tilde{h}}_{p,LS} = X_p^{-1} \tilde{y}_p$. The MMSE estimation of the channel $\hat{\tilde{h}}_{p,MMSE}$ involves the eigenvalue decomposition of the channel autocorrelation matrix $R_{h_p}$, i.e., $R_{h_p} = G^H \Sigma_p G$, and therefore,

    $\hat{\tilde{h}}_{p,MMSE} = G^H \Sigma_p \left( \Sigma_p + \frac{N \sigma_0}{E_p} I \right)^{-1} G \left( \tilde{h}_p + \epsilon_{LS} \right)$,    (14)

where $\epsilon_{LS}$ denotes the LS error and $\Sigma_p$ is a diagonal matrix containing the eigenvalues of $R_{h_p}$. The MMSE estimation error $\epsilon_{MMSE} = \hat{\tilde{h}}_{p,MMSE} - \tilde{h}_p$ is given by

    $\epsilon_{MMSE} = G^H \left[ \Sigma_p \left( \Sigma_p + \frac{N \sigma_0}{E_p} I \right)^{-1} - I \right] G \tilde{h}_p + G^H \Sigma_p \left( \Sigma_p + \frac{N \sigma_0}{E_p} I \right)^{-1} G \epsilon_{LS}$    (15)

Accordingly, the bias is given by

    $E[\epsilon_{MMSE}] = G^H \left[ \Sigma_p \left( \Sigma_p + \frac{N \sigma_0}{E_p} I \right)^{-1} - I \right] G \tilde{h}_p$    (16)

and the expression for the error autocorrelation matrix is

    $E[\epsilon_{MMSE} \epsilon_{MMSE}^H] = G^H \left[ \Sigma_p \left( \Sigma_p + \frac{N \sigma_0}{E_p} I \right)^{-1} - I \right]^2 \Sigma_p G + \frac{N \sigma_0}{E_p} G^H \Sigma_p^2 \left( \Sigma_p + \frac{N \sigma_0}{E_p} I \right)^{-2} G$    (17)

The estimation error of the MMSE is less than the LS estimation error, and the difference depends on the channel's non-zero singular values. MMSE is a biased estimator, which means that its expected value may deviate from the true channel response. However, at high SNR, the bias of MMSE estimation approaches zero, and the estimator becomes asymptotically unbiased. LS and MMSE have different trade-offs between complexity and performance. LS is simple but performs poorly at low SNRs. From (15), we can observe that MMSE performs additional processing on top of LS to reduce the average estimation error. The MMSE is more computationally complex than LS, with a complexity of order $O(|N_{on}|^3)$. However, if the channel autocorrelation matrix $R_{h_p} = G^H \Sigma_p G$ is known, the algorithm can be simplified and its complexity reduced to $2|N_{on}|^2 + 2|N_{on}|$ complex multiplications, which is still higher than the LS estimator's complexity of $O(|N_{on}|)$ but is more feasible. The high computational complexity of MMSE motivates exploring machine learning approaches that enhance channel estimation with lower computational complexity than MMSE.

It should be noted that several MMSE approximations are available with lower performance and lower complexity requirements. Here we consider only the accurate MMSE, and we show that our proposed DNN method performs better with less complexity.

III. PROPOSED DEEP LEARNING METHOD

This section begins with a concise explanation of the DNN concept. We then describe the design and learning process of the proposed DNN schemes for channel estimation.
A. Overview of DNN

A DNN is a mathematical structure that consists of multiple layers of interconnected nodes or neurons. The basic structure of a DNN includes an input layer, one or more hidden layers, and an output layer. In supervised learning, the DNN repeatedly passes input feature data from one layer of neurons to the next, with information being gathered and transferred at each stage. Each neuron in the DNN receives weighted inputs from multiple other neurons and passes the aggregated inputs through an activation function; the performance of the model is further tuned by selecting appropriate hyperparameters. The output produced by the DNN is compared to the actual label value, and the model is trained by adjusting the weights of the connections between neurons to minimize the difference between the predicted output value and the actual label value.

In a DNN with $L$ layers (the hidden layers followed by the output layer), each layer $l$ ($1 \le l \le L$) contains $M_l$ nodes or neurons. The input to each neuron in layer $l$ is a weighted sum of the outputs of all the neurons in the previous layer $l - 1$, denoted as $y^{(l-1)} \in \mathbb{R}^{M_{l-1} \times 1}$, where $M_{l-1}$ is the number of nodes in layer $l - 1$. The weights associated with the connections between the neurons in layers $l - 1$ and $l$ are represented by a weight matrix $\omega^{(l)} \in \mathbb{R}^{M_{l-1} \times M_l}$, where $\omega^{(l,m)} \in \mathbb{R}^{M_{l-1} \times 1}$ is the weight vector associated with the $m$th neuron in layer $l$. Each neuron in layer $l$ also has a bias term $b^{(l,m)}$ associated with it. The output of the $m$th neuron in layer $l$, denoted as $y^{(l,m)}$, is computed as follows:

    $y^{(l,m)} = f_a^{(l,m)} \left( b^{(l,m)} + (\omega^{(l,m)})^T y^{(l-1)} \right)$    (18)

where $f_a^{(l,m)}$ is the activation function applied to the weighted sum of inputs. The choice of activation function depends on the problem being solved and can include functions such as sigmoid, ReLU, or tanh.

First, the architecture of the network, including the number of layers and the number of neurons in each layer, is fixed. The weights are randomly initialized before the training process begins. The input data is fed through the network to produce an output prediction $y^{(L)}$, where $L$ is the last layer in the network. The difference between the predicted output and the actual output in the training set is measured using a suitable loss function $J_{\Omega,b}$, which quantifies the accuracy of the network's predictions. The goal is to minimize the error between the predicted and actual output. This is achieved through an optimization method such as gradient descent with backpropagation. In backpropagation, the error is propagated backwards through the network, and the gradients of the loss function with respect to the weights and biases are computed. These gradients are then used to update the weights and biases in the opposite direction of the gradient, iteratively improving the network's performance. The optimization process continues for a fixed number of iterations or until convergence is achieved, at which point the network weights are considered optimized for the given task.

B. Proposed DNN Schemes for Channel Estimation

The proposed DNN aims to improve the accuracy of the LS estimated channel by learning the error in the estimation process. Specifically, the DNN is trained to minimize a cost function $J_{\Omega,b}(\tilde{H}, y^{(L)}_{\hat{\tilde{H}}_{LS}})$, where $\tilde{H}$ is the true or ideal channel and $y^{(L)}_{\hat{\tilde{H}}_{LS}}$ is the output of the DNN when the input is the LS estimated channel $\hat{\tilde{H}}_{LS}$. The cost function depends on two sets of parameters, the weights $\Omega$ and the biases $b$, which are used to tune the DNN. The objective of the cost function is to reduce the difference between the estimated channel and the true channel. By minimizing the cost function, the DNN learns to correct the errors in the LS estimation and thus improves the accuracy of the channel estimation process.

In the proposed method, LS channel estimation is first applied to the received preamble to estimate the wireless channel. The next step processes the LS estimate to separate the real and imaginary parts of the estimated channel. This results in $2|N_{on}|$ inputs for the DNN, where $|N_{on}|$ is the number of active subcarriers. Separating the real and imaginary parts lets the DNN process the channel information with real-valued weights only, and it fixes the number of input features at $2|N_{on}|$.

Here the input data is the LS estimated channel, which is a vector of complex numbers with real and imaginary parts. To normalize this input data, each component of the vector is first converted to a real number by taking the absolute value, resulting in a vector of real values. The real-valued vector is then normalized to have zero mean and unit variance by subtracting the mean and dividing by the vector's standard deviation. Normalizing the input data in this way ensures that the DNN can learn the underlying patterns and relationships in the data without being affected by differences in the scales or magnitudes of the individual components. This can lead to faster convergence during training and better generalization performance on unseen data.

Once the inputs are processed, the DNN can be trained to learn the errors in the LS estimation and correct them to improve the accuracy of the estimated wireless channel. During training, the weights are updated iteratively over multiple epochs to minimize the cost function. At the end of the training phase, the final set of weights is used to make predictions on new data. The proposed method uses early stopping to prevent overfitting, saving only the weights from the epoch with the highest training and validation accuracy, which improves generalization performance on new data.

The loss function used in our proposed DNN is the mean squared error (MSE). It measures the average squared difference between the predicted and actual output values, and the goal is to minimize this difference. The activation function used is the rectified linear unit (ReLU). It is defined as $f(x) = \max(0, x)$ and has a simple, computationally efficient form
that allows for fast training. ReLU is also known to reduce the vanishing gradient problem, which can occur in deeper networks when gradients become very small. It is used in all hidden layers; the output layer uses no activation function. This is because the output layer is typically designed to produce raw, unbounded values that can then be post-processed or transformed into the desired output format.

Table II summarizes the four proposed DNN architectures with different numbers of hidden layers and nodes per hidden layer. The choice of architecture depends on the specific problem and dataset, and it is often determined through experimentation and hyperparameter tuning.

TABLE I
HYPERPARAMETERS

Parameter              Value
Number of epochs       500
Loss function          Mean square error
Optimizer              Adam
Activation function    ReLU
Batch size             32

TABLE II
DIFFERENT DNN ARCHITECTURES

DNN architecture    Hidden layers    Neurons per hidden layer
DNN1                1                $|N_{on}|$
DNN2                1                $2|N_{on}|$
DNN3                2                $|N_{on}|$
DNN4                2                $2|N_{on}|$

At the end of the training, the DNN output, $y^{(L)}_{\hat{\tilde{H}}_{LS}}$, provides the corrected LS channel estimate for the received preamble. This corrected channel estimate is obtained by processing the LS estimate with the DNN, which has learned to correct the errors in the LS estimation process. The output of the DNN is a vector of length $2|N_{on}|$ containing both the real and imaginary parts of the corrected channel estimates. To obtain the final corrected LS channel estimate, these values are combined into $|N_{on}|$ complex-valued entries, one per active subcarrier. These corrected channel estimates can then be used for further signal processing tasks, such as channel equalization, to improve the quality and reliability of wireless communication. The accuracy of the corrected LS channel estimate depends on the quality and quantity of the training data, as well as on the complexity of the DNN architecture and the optimization techniques used during training.

IV. SIMULATION RESULTS AND ANALYSIS

In this section, normalized mean square error (NMSE) simulations, followed by a computational complexity analysis, are presented to evaluate the performance of the proposed DL-based channel estimation schemes.

A. NMSE Performance

The channel is assumed to be an OFDM channel with 64 subcarriers per OFDM symbol, where only 52 subcarriers are activated. The performance of the proposed DL-based channel estimation schemes is compared to two benchmark methods: LS and MMSE. The LS method estimates the channel coefficients by minimizing the squared error between the received pilot signals and the signal predicted by the estimated channel. The accuracy of the LS method is limited by the noise in the received signals. The accurate MMSE method considers the noise statistics and the channel statistics to obtain an optimal estimate of the channel coefficients. This method is more accurate than the LS method, but it also requires more computation. Comparing the proposed DL-based channel estimation schemes against these two methods makes it possible to evaluate their relative accuracy and computational efficiency, and provides insight into the potential benefits of DL-based channel estimation in practical communication systems.

The performance of the proposed DL-based channel estimation schemes is highly dependent on the SNR used during training. Specifically, training the DNN at a high SNR value (e.g., 30 dB or higher) yields the best performance, because the channel then has a more significant impact on the received signals than the noise. The DNN can still estimate the channel at lower SNR values, although with slightly reduced accuracy.

If the same DNN is trained on a mixture of different SNR values, it will not learn meaningful behavior. Therefore, to achieve good performance, the SNR value needs to be fixed for each trained DNN. The DNN is sensitive to variations in the SNR, and the behavior it learns may not generalize well to different SNR values. It is therefore crucial to select the SNR value carefully during training and testing to ensure accurate channel estimation.

The results show that the DNN outperformed both the classical LS and MMSE channel estimation schemes. The DNN was able to learn higher-order statistics of the channel, which are not captured by the second-order statistics used in the MMSE approach. By learning higher-order statistics, the DNN can capture more complex patterns and correlations in the channel, leading to better estimation results. This result implies that the DNN has the potential to improve channel estimation accuracy and that it is more robust to channel variations and noise than the classical LS and MMSE approaches.

B. Computational complexity analysis

The number of multiplications required to compute the activation of all neurons in all layers of the network can be used to determine the computational complexity of a DNN. The linear transformation from the $(l-1)$th to the $l$th layer
requires $M_{l-1} M_l$ multiplications to compute the product of the weight matrix and the activation from the previous layer. As a result, the number of real-valued multiplications in the entire DNN can be determined as follows:

    $M_{mul} = \sum_{l=1}^{L} M_{l-1} M_l, \quad M_0 = 2|N_{on}|, \quad M_L = 2|N_{on}|$.    (19)

where $L$ represents the number of DNN layers, $M_{l-1}$ represents the number of neurons in the $(l-1)$th layer, and $M_l$ represents the number of neurons in the $l$th layer.
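The multiplication count in (19) can be related to an actual forward pass with a small sketch. The code below is only illustrative: the weights are untrained random placeholders, the ordering of the inputs (all real parts followed by all imaginary parts) is an assumption, and the helper names are hypothetical. It applies (18) with ReLU on the hidden layers and no activation on the output layer, and counts the multiplications performed along the way.

```python
import random


def relu(values):
    """Element-wise ReLU, f(x) = max(0, x)."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Weighted sums of (18) before activation; weights[m] is neuron m's weight vector.

    Returns the pre-activations and the number of multiplications used.
    """
    outputs = [b + sum(w * x for w, x in zip(row, inputs))
               for row, b in zip(weights, biases)]
    return outputs, len(weights) * len(inputs)


def forward(ls_estimate, layers):
    """Pass a complex LS estimate through ReLU hidden layers and a linear output layer."""
    # Assumed input layout: real parts first, then imaginary parts (2 * |N_on| values).
    activations = [h.real for h in ls_estimate] + [h.imag for h in ls_estimate]
    total_multiplications = 0
    for index, (weights, biases) in enumerate(layers):
        activations, used = dense(activations, weights, biases)
        total_multiplications += used
        if index < len(layers) - 1:  # no activation on the output layer
            activations = relu(activations)
    half = len(activations) // 2
    corrected = [complex(re, im)
                 for re, im in zip(activations[:half], activations[half:])]
    return corrected, total_multiplications


def random_layers(sizes, seed=0):
    """Untrained placeholder weights; a trained network would supply real values."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


num_active = 4  # small |N_on| chosen only to keep the example readable
sizes = [2 * num_active, num_active, 2 * num_active]  # one hidden layer, DNN1-style
ls = [complex(1, -1), complex(0.5, 0.2), complex(-0.3, 0.8), complex(0, 1)]
corrected, count = forward(ls, random_layers(sizes))
print(len(corrected), count)  # count matches the sum of M_{l-1} * M_l over the layers
```

For this layer layout the counted multiplications are $8 \cdot 4 + 4 \cdot 8 = 64$, which is the value (19) gives for $M_0 = M_2 = 8$ and $M_1 = 4$.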
Finally, the input layer has $2|N_{on}|$ neurons because the real and imaginary parts of the $|N_{on}|$ complex LS estimates are fed to the DNN as separate real-valued inputs. Similarly, the output layer has $2|N_{on}|$ neurons, which carry the real and imaginary parts of the corrected channel estimates.

The number of real-valued multiplications required for the various DNN architectures is as follows: DNN1 requires $4|N_{on}|^2$ multiplications, DNN2 requires $8|N_{on}|^2$ multiplications, DNN3 requires $5|N_{on}|^2$ multiplications, and DNN4 requires $12|N_{on}|^2$ multiplications. The accurate MMSE approach has a computational complexity of order $|N_{on}|^3$ complex multiplications, which is also of order $|N_{on}|^3$ in real-valued multiplications, since each complex multiplication costs a small constant number (typically four) of real multiplications. Therefore, the DNN approaches have a computational complexity between that of the MMSE and the simple LS methods.
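As a quick check of these figures, (19) can be evaluated directly for the architectures in Table II. The sketch below assumes $|N_{on}| = 52$, the number of active subcarriers in the simulation setup; the function and variable names are hypothetical, and the MMSE figure is only the order-of-magnitude term $|N_{on}|^3$, not an exact operation count.

```python
def dnn_multiplications(sizes):
    """Real-valued multiplications from (19): sum over l of M_{l-1} * M_l."""
    return sum(m_prev * m_next for m_prev, m_next in zip(sizes, sizes[1:]))


def layer_sizes(num_active, hidden_layers, neurons_per_hidden):
    """[M_0, ..., M_L] with M_0 = M_L = 2 * |N_on| and equal-width hidden layers."""
    return [2 * num_active] + [neurons_per_hidden] * hidden_layers + [2 * num_active]


NUM_ACTIVE = 52  # |N_on|: active subcarriers in the simulation setup

# (hidden layers, neurons per hidden layer), as listed in Table II
ARCHITECTURES = {
    "DNN1": (1, NUM_ACTIVE),
    "DNN2": (1, 2 * NUM_ACTIVE),
    "DNN3": (2, NUM_ACTIVE),
    "DNN4": (2, 2 * NUM_ACTIVE),
}

counts = {
    name: dnn_multiplications(layer_sizes(NUM_ACTIVE, hidden, width))
    for name, (hidden, width) in ARCHITECTURES.items()
}
mmse_order = NUM_ACTIVE ** 3  # order-of-magnitude term for the accurate MMSE

for name, count in counts.items():
    print(f"{name}: {count} multiplications = {count // NUM_ACTIVE ** 2} * |N_on|^2")
print(f"MMSE order: {mmse_order} complex multiplications")
```

For 52 active subcarriers the largest network, DNN4, needs 32448 real multiplications, against a cubic term of 140608 for the accurate MMSE.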
With a lower computational complexity than the MMSE approach, the DNN architectures thus provide a good trade-off between computational complexity and accuracy for channel estimation tasks.
V. CONCLUSION

This paper presents an approach to channel estimation using DL and demonstrates its potential to improve the accuracy and reduce the computational complexity of channel estimation, which can lead to significant benefits in communication systems. We proposed a DL-based channel estimation scheme that corrects the errors in the LS channel estimation caused by noise enhancement on transmitted OFDM frames. First, the classical channel estimation schemes were compared. The proposed DL-based channel estimation methods were then described and evaluated for different SNR values.

One of the advantages of the proposed scheme is that it does not require specific knowledge of channel statistics, making it more suitable for real-world scenarios where the channel conditions may vary. Simulation results show that the proposed channel estimation schemes outperform the accurate MMSE channel estimation with less computational complexity. This suggests that the proposed scheme has the potential to be applied to various communication systems and scenarios and can lead to significant improvements in performance.