Semi-Blind Strategies for MMSE Channel Estimation Utilizing Generative Priors

Franz Weißer,  Nurettin Turan, 
Dominik Semmler,  Fares Ben Jazia, and Wolfgang Utschick
This work was supported by the Federal Ministry of Education and Research of Germany in the programme of “Souverän. Digital. Vernetzt.”. Joint project 6G-life, project identification number: 16KISK002. An earlier version of this work was presented at ICASSP’24[1]. The authors are with the TUM School of Computation, Information and Technology, Technical University of Munich, 80333 Munich, Germany (e-mail: [email protected]).
Abstract

This paper investigates semi-blind channel estimation for massive multiple-input multiple-output (MIMO) systems. To this end, we first estimate a subspace based on all received symbols (pilot and payload) to provide additional information for subsequent channel estimation. We show how this additional information enhances minimum mean square error (MMSE) channel estimation. Two variants of the linear MMSE (LMMSE) estimator are formulated, where the first one solves the estimation within the subspace, and the second one uses a subspace projection as a preprocessing step. Theoretical derivations show the superior estimation performance of the latter method in terms of mean square error for uncorrelated Rayleigh fading. Subsequently, we introduce parameterizations of this semi-blind LMMSE estimator based on two different conditional Gaussian latent models, i.e., the Gaussian mixture model and the variational autoencoder. Both models learn the underlying channel distribution of the propagation environment based on training data and serve as generative priors for semi-blind channel estimation. Extensive simulations for real-world measurement data and spatial channel models show the superior performance of the proposed methods compared to state-of-the-art semi-blind channel estimators with respect to the MSE.

Index Terms:
Semi-blind channel estimation, Gaussian mixture model, variational autoencoder, measurement data.

I Introduction

Accurate channel state information (CSI) is crucial for achieving the high data rates promised by multiple-input multiple-output (MIMO) systems [2, 3, 4]. The CSI describes the communication link between transmitter and receiver. This link is time-varying and frequency-selective and can change rapidly, which makes channel estimation a complex task [5]. As accurate channel estimates are essential for the successful transmission of data, channel estimation is at the center of several research efforts [6, 7].

The most widely adopted methods in wireless communication utilize known training or pilot symbols transmitted across the channel using some of the radio resource blocks [8]. Afterward, the receiver uses the observed signals to determine a reliable CSI estimate. As the number of pilots scales with the number of users, the spectral efficiency decreases for a larger number of users because fewer symbols are available for transmitting data. To enhance channel estimation without increasing the number of pilot symbols, various methods have been developed that leverage the information embedded in the observed data symbols at the receiver to infer channel characteristics [9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]. These methods exploit structure and redundancy within the transmitted data and yield more accurate CSI estimates.

The benefit of semi-blind channel estimation was first studied in [9], where Cramér-Rao bounds (CRBs) for blind, semi-blind, and training-based channel estimation were investigated in the context of single-input multiple-output (SIMO) systems. In [10, 11], semi-blind channel estimation schemes based on maximum likelihood (ML) estimation were introduced. The asymptotic performance of the respective estimators was also studied in [10] for infinitely long data sequences. Another asymptotic regime, where the number of antennas grows to infinity, was studied in [12]. Here, the authors identified two interference components in semi-blind channel estimation, which do not vanish even for large numbers of antennas. Early work on improving the least squares (LS) estimator using a semi-blind algorithm was conducted in [13]. The authors in [14] propose to use partially decoded data for channel estimation. Furthermore, in [15], semi-blind and blind channel estimation were studied to enhance the maximum a-posteriori (MAP) channel estimates in massive MIMO systems. These findings are based on favorable propagation, which only holds for large antenna arrays deployed at the base station (BS). In [16], two semi-blind channel estimators based on the expectation maximization (EM) algorithm are studied. The assumption of a Gaussian distribution for the data was verified, leading to a closed-form solution for the E-step. Another iterative framework optimizing the likelihood based on message passing (MP) is used in [17, 18] and references therein. In [19], a data-aided iterative scheme is proposed for orthogonal time frequency space (OTFS) systems by employing affine-precoded superimposed pilots. The performance improvement achieved with these iterative approaches generally requires computationally costly updates. A low-complexity iterative LS channel estimation algorithm is proposed in [20] for a massive MIMO turbo-receiver.
In [21], the concept of semi-blind channel estimation was adapted for time domain synchronous-orthogonal frequency division multiplexing (OFDM) systems, where in addition to the pseudo noise sequence in the guard interval, the sent OFDM data symbols are exploited for the channel estimation. In [22], a framework was introduced for choosing reliable decoded data symbols, which can be interpreted as additional pilots. Similarly, reliably detected symbols are used for a semi-data-aided channel estimation in [23]. In [24], peak-power carriers in an OFDM system are selected to eliminate the need to determine reliable data symbols at the receiver. Recently, a diffusion model-based approach for joint channel estimation and detection was proposed in [25], where a diffusion process is constructed that models the joint distribution of the channels and symbols given noisy observations.

In this article, we focus on pilot-based estimators which minimize the MSE and investigate how these estimators can be extended to the semi-blind case. Notably, in [26] a subspace formulation of the MMSE estimator was used to mitigate pilot contamination in massive MIMO systems. The MMSE estimator is known to be the conditional mean estimator (CME) [27, Ch. 10], which, in general, is intractable and cannot be computed in closed form. Recently, powerful approximations based on machine learning were presented in [28, 29, 30, 31, 32]. The benefit of machine learning is to enhance the task at hand by using prior information captured during the learning stage. For a given BS cell environment, the probability density function (PDF) representing potential user channels can be considered valuable prior information. Since this true underlying distribution is unknown, machine learning methods rely on a representative data set, which is assumed to be available at the BS. Based on this data set, the PDF of the user channels can be learned. A Gaussian mixture model (GMM) was first used to formulate an estimator in [28] for image processing. The approaches in [30, 31, 32] build on that by constructing a conditionally Gaussian latent model (CGLM) for the PDF of a BS cell environment. The learned CGLM not only enables MMSE channel estimation in [30, 31, 32] but can also be used for, e.g., a limited feedback scheme as in [33]. In this work, we propose to utilize CGLMs to parameterize the CME in the semi-blind setting.

The contributions of this work are summarized as follows:

  • We introduce two variants of the linear MMSE (LMMSE) estimator incorporating subspace knowledge provided by the payload data symbols. First, we show how the LMMSE channel estimator can solve a subspace estimation problem [26]. As an alternative, we propose a projection method that is computationally more efficient since it allows for the pre-calculation of LMMSE filters.

  • With theoretical derivations we show the superior MSE performance of the proposed projection method in the case of uncorrelated Rayleigh fading and perfect subspace knowledge.

  • We show how the GMM [30] and variational autoencoder (VAE) [32], instances of the class of CGLMs, can be used to parameterize the semi-blind LMMSE estimator.

  • Extensive simulations on different datasets, consisting of typical massive MIMO systems with multiple users and including real-world measurement data, show the superior performance of our proposed methods compared to state-of-the-art semi-blind channel estimation algorithms with respect to the MSE.

Preliminary results were presented in [1] and extended to the multi-user MIMO case in [34], which we extend further in the following aspects. The theoretical analyses in Section III enhance the foundation of the proposed semi-blind channel estimation strategies and provide analytic insights into the superior performance of the proposed projection method. We extend our concept of semi-blind MMSE channel estimation to the whole class of CGLMs, providing a more general framework to parameterize the semi-blind LMMSE estimator. Finally, we provide more comprehensive simulation results to show the strengths of our proposed strategies.

Notations: Matrices and vectors are denoted with boldface symbols. $\bm{0}$ and $\mathbf{I}_N$ denote the zero vector of appropriate size and the identity matrix of size $N \times N$, respectively. $\mathbb{E}[\cdot]$, $\mathrm{tr}(\cdot)$, $\mathrm{range}(\cdot)$, and $\mathrm{rank}(\cdot)$ denote the expectation, trace, range, and rank operators, respectively. We use $(\cdot)^{\mathrm{T}}$, $(\cdot)^{\mathrm{H}}$, and $(\cdot)^{-1}$ to denote the transpose, conjugate transpose, and inverse, respectively. $\|\cdot\|$ denotes the $2$-norm of a vector. $\mathcal{N}_{\mathbb{C}}(\bm{\mu}, \bm{C})$ denotes the circularly symmetric complex Gaussian distribution with mean $\bm{\mu}$ and covariance matrix $\bm{C}$.

II System and Channel Model

We consider a multi-user uplink system with $J$ single-antenna users and a BS equipped with $M$ receive antennas. The received signal vector at time instance $n$ is then

$$\bm{y}(n) = \bm{H}\bm{x}(n) + \bm{n}(n), \quad n = 1, \dots, N, \tag{1}$$

where $\bm{x}(n) = [x_1(n), \dots, x_J(n)]^{\mathrm{T}} \in \mathbb{C}^{J}$ and $\bm{n}(n) \in \mathbb{C}^{M}$ denote the signal sent by each of the $J$ users and the additive noise, respectively, whereas $\bm{H} = [\bm{h}_1, \dots, \bm{h}_J]$ contains the individual channels of the users $\bm{h}_j \in \mathbb{C}^{M}$. The case of multiple antennas at the users can be transformed into (1) by viewing each active stream as a different user. The corresponding channel $\bm{h}_j$ can then be seen as the effective channel. For further details on semi-blind channel estimation in multi-user MIMO, we refer the reader to [34]. For the task of channel estimation, we consider a channel coherence interval larger than the number of snapshots $N$, i.e., the channels are constant over all snapshots.
We assume that the noise is Gaussian with $\bm{n}(n) \sim \mathcal{N}_{\mathbb{C}}(\bm{0}, \bm{C}_{\bm{n}} = \sigma^2 \mathbf{I}_M)$.

In conventional channel estimation schemes, each user's signal includes $N_p$ uplink pilots, which are known to the BS. Hence, the received observations at the BS are

$$\bm{Y} = \left[\bm{Y}'_p, \bm{Y}_d\right] = \bm{H}\left[\bm{P}, \bm{D}\right] + \bm{N} = \bm{H}\bm{X} + \bm{N}, \tag{2}$$

where $\bm{Y} \in \mathbb{C}^{M \times N}$, $\bm{Y}'_p \in \mathbb{C}^{M \times N_p}$, $\bm{Y}_d \in \mathbb{C}^{M \times (N - N_p)}$, $\bm{P} \in \mathbb{C}^{J \times N_p}$, and $\bm{D} \in \mathbb{C}^{J \times (N - N_p)}$ denote all received observations, the received pilot observations, the received payload data observations, the sent pilots, and the sent payload data symbols, respectively. To fully illuminate the channels, the number of pilots must be at least the number of users, i.e., $N_p \geq J$, and orthogonal pilots are used. We set $N_p = J$ and utilize discrete Fourier transform (DFT) pilot sequences. After decorrelating the orthogonal pilot sequences, the received pilot observations simplify to

$$\bm{Y}_p = \bm{Y}'_p \bm{P}^{\mathrm{H}} = \bm{H}\bm{P}\bm{P}^{\mathrm{H}} + \bm{N}\bm{P}^{\mathrm{H}} = \bm{H} + \bm{N}_p, \tag{3}$$

where $\bm{N}_p$ has the same statistics as $\bm{N}$ and, hence, we can omit the subscript. This allows us to consider channel estimation from a per-user perspective in the subsequent discussions. For better readability, the user index is therefore dropped in the following. Consequently, we denote the pilot observation of a user as

$$\bm{y}_p = \bm{h} + \bm{n}, \tag{4}$$

with $\bm{n} \sim \mathcal{N}_{\mathbb{C}}(\bm{0}, \bm{C}_{\bm{n}} = \sigma^2 \mathbf{I}_M)$.
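The pilot decorrelation in (3) can be illustrated with a minimal NumPy sketch. The dimensions, the noise variance, the i.i.d. placeholder channels, and the helper `crandn` are assumptions chosen for illustration only; the pilot matrix is taken as the unitary DFT matrix so that $\bm{P}\bm{P}^{\mathrm{H}} = \mathbf{I}_J$ holds.

```python
import numpy as np

def crandn(*shape, rng):
    """i.i.d. circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(0)
M, J, sigma2 = 64, 4, 0.1  # illustrative values (assumptions)

# Unitary DFT pilot matrix P of size J x N_p with N_p = J, so that P @ P^H = I_J.
P = np.fft.fft(np.eye(J)) / np.sqrt(J)

H = crandn(M, J, rng=rng)                           # placeholder channels (i.i.d. Rayleigh)
N_raw = np.sqrt(sigma2) * crandn(M, J, rng=rng)     # noise during the pilot phase
Y_p_raw = H @ P + N_raw                             # received pilots Y'_p = H P + N

Y_p = Y_p_raw @ P.conj().T                          # decorrelation as in (3)
N_p = N_raw @ P.conj().T                            # effective noise after decorrelation

y_p = Y_p[:, 0]                                     # per-user observation as in (4)
```

Because the pilot matrix is unitary, the effective noise keeps its white Gaussian statistics, which is why each column of `Y_p` can be treated as an independent per-user observation.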

II-A Spatial Channel Model

We consider a spatial channel model based on [35], where the channel vectors are considered as conditionally Gaussian distributed [29]

$$\bm{h} \mid \bm{\delta} \sim \mathcal{N}_{\mathbb{C}}\left(\bm{0}, \bm{C}_{\bm{\delta}}\right), \tag{5}$$

based on a set of parameters $\bm{\delta}$, which describe the directions and properties of the multi-path propagation clusters. The main angles are drawn independently from a uniform distribution in $[0, 2\pi]$ and the path gains are independent zero-mean Gaussians. The spatial covariance matrix is given by

$$\bm{C}_{\bm{\delta}} = \int_{-\pi}^{\pi} g(\vartheta, \bm{\delta}) \, \bm{a}(\vartheta) \bm{a}^{\mathrm{H}}(\vartheta) \, \mathrm{d}\vartheta, \tag{6}$$

where $g(\vartheta, \bm{\delta})$ is the power density consisting of a weighted sum of Laplace densities, which have standard deviations $\sigma_{\text{AS}}$ corresponding to the angular spread of the propagation clusters. The BS employs a uniform linear array (ULA) with $M = 64$ antennas and $\lambda/2$ spacing. The steering vector is then given as

$$\bm{a}(\vartheta) = \frac{1}{\sqrt{M}} \left[1, \mathrm{e}^{-\mathrm{j}\pi\sin(\vartheta)}, \dots, \mathrm{e}^{-\mathrm{j}\pi(M-1)\sin(\vartheta)}\right]^{\mathrm{T}}. \tag{7}$$

For every channel sample, we consider a new $\bm{\delta}$ and draw the sample according to $\bm{h} \sim \mathcal{N}_{\mathbb{C}}\left(\bm{0}, \bm{C}_{\bm{\delta}}\right)$.
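As a rough illustration of (5) to (7), the following sketch approximates the integral in (6) by a Riemann sum and then draws one channel sample. The cluster angles, cluster weights, angular spread, and grid size are hypothetical values, and the wrap-around of the Laplace tails at $\pm\pi$ is ignored; none of these choices are taken from [35].

```python
import numpy as np

def steering_matrix(theta, M):
    """Stack the ULA steering vectors a(theta) of (7) as columns (M x K)."""
    theta = np.atleast_1d(theta)
    phase = np.outer(np.arange(M), np.sin(theta))
    return np.exp(-1j * np.pi * phase) / np.sqrt(M)

def laplace_density(theta, center, std):
    """Laplace density with the given standard deviation (scale b = std / sqrt(2))."""
    b = std / np.sqrt(2.0)
    return np.exp(-np.abs(theta - center) / b) / (2.0 * b)

def spatial_covariance(centers, weights, sigma_as, M, n_grid=4096):
    """Riemann-sum approximation of the integral in (6)."""
    theta = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    d_theta = 2.0 * np.pi / n_grid
    g = np.zeros(n_grid)
    for c, w in zip(centers, weights):
        g += w * laplace_density(theta, c, sigma_as)
    A = steering_matrix(theta, M)            # M x n_grid
    return (A * g) @ A.conj().T * d_theta    # sum_k g_k a_k a_k^H d_theta

def sample_channel(C, rng):
    """Draw h ~ N_C(0, C) via an eigendecomposition (C may be rank deficient)."""
    eigval, eigvec = np.linalg.eigh(C)
    eigval = np.clip(eigval, 0.0, None)      # remove tiny negative round-off values
    n = C.shape[0]
    w = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    return eigvec @ (np.sqrt(eigval) * w)

rng = np.random.default_rng(0)
M = 64
centers = np.array([-0.6, 0.9])   # hypothetical main angles in radians
weights = np.array([0.7, 0.3])    # hypothetical cluster powers, summing to one
sigma_as = np.deg2rad(2.0)        # hypothetical angular spread
C_delta = spatial_covariance(centers, weights, sigma_as, M)
h = sample_channel(C_delta, rng)
```

The eigendecomposition is used instead of a Cholesky factor because a covariance built from a few narrow clusters is typically close to rank deficient.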

II-B Measurement Campaign

Since synthetic data capture real-world CSI characteristics only up to some extent, we utilize real-world data from a measurement campaign conducted at the Nokia campus in Stuttgart, Germany, in October/November 2017, cf. [36]. The BS antenna with a uniform rectangular array (URA) comprises $N_v = 4$ vertical ($\lambda$ spacing) and $N_h = 16$ horizontal ($\lambda/2$ spacing) single polarized patch antennas. The operating carrier frequency is $2.18$ GHz and the antenna was mounted on a rooftop approximately $20$ meters above the ground. For further details, we refer the reader to [36].

III Semi-Blind Channel Estimation using Perfect Statistical Knowledge

In this section, we introduce two variants of the LMMSE estimator incorporating information provided by the payload data symbols. To do so, we first restrict ourselves to the case where perfect statistical knowledge is available at the receiver.

Commonly, only the pilot observation $\bm{y}_p$ is considered for channel estimation. The MSE-optimal estimator given the pilot observation $\bm{y}_p$ is the CME

$$\hat{\bm{h}}_{\text{CME}} = \mathbb{E}\left[\bm{h} \mid \bm{y}_p\right]. \tag{8}$$

If the channel sample $\bm{h}$ is drawn from a Gaussian distribution according to

$$\bm{h} \sim \mathcal{N}_{\mathbb{C}}(\bm{0}, \bm{C}), \tag{9}$$

and if, further, this statistic is known at the receiver, the genie-aided CME can be formulated as [27, Ch. 10]

$$\hat{\bm{h}}_{\text{CME}} = \bm{C}\left(\bm{C} + \bm{C}_{\bm{n}}\right)^{-1} \bm{y}_p. \tag{10}$$

This LMMSE estimator achieves the MSE of

$$\mathrm{MSE}^{\mathrm{plain}} = \mathrm{tr}\left[\bm{C} - \bm{C}(\bm{C} + \bm{C}_{\bm{n}})^{-1}\bm{C}\right]. \tag{11}$$

For $\bm{C}_{\bm{n}} = \sigma^2 \mathbf{I}_M$, the MSE can be expressed using the Woodbury identity as

$$\mathrm{MSE}^{\mathrm{plain}} = \sum_{i=1}^{M} \frac{\rho_i \sigma^2}{\rho_i + \sigma^2}, \tag{12}$$

where $\rho_i$ are the eigenvalues of $\bm{C}$.
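The equivalence of the trace form (11) and the eigenvalue form (12) can be checked numerically. The sketch below uses an arbitrary randomly generated covariance matrix and a hypothetical helper `lmmse_filter`; both are assumptions for illustration and not part of the original work.

```python
import numpy as np

def lmmse_filter(C, sigma2):
    """LMMSE filter W = C (C + sigma2 I)^{-1} from (10) for white noise."""
    M = C.shape[0]
    # C and (C + sigma2 I)^{-1} commute, so W equals (C + sigma2 I)^{-1} C,
    # which is obtained with a linear solve instead of an explicit inverse.
    return np.linalg.solve(C + sigma2 * np.eye(M), C)

rng = np.random.default_rng(1)
M, sigma2 = 16, 0.5  # illustrative values (assumptions)

# Arbitrary Hermitian positive semidefinite test covariance.
A = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2)
C = A @ A.conj().T / M

W = lmmse_filter(C, sigma2)

mse_trace = np.trace(C - W @ C).real             # trace form (11)
rho = np.linalg.eigvalsh(C)                      # eigenvalues of C
mse_eig = np.sum(rho * sigma2 / (rho + sigma2))  # eigenvalue form (12)

print(mse_trace, mse_eig)  # the two values agree up to round-off
```

Each eigen-direction contributes $\rho_i \sigma^2 / (\rho_i + \sigma^2)$, which is always smaller than both $\rho_i$ and $\sigma^2$, so the estimator never does worse than either ignoring the observation or using it unfiltered.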

For our considerations concerning semi-blind channel estimation, we assume knowledge about the subspace defined by $\mathrm{range}(\bm{H}) = \mathrm{range}(\bm{V})$, where we denote with $\bm{V}$ the left singular vectors of $\bm{H}$, which span the same subspace as the columns of $\bm{H}$. Generally, we can formulate the ML estimate of $\bm{H}$ in view of (2) as [16]

$$\min_{\bm{H}, \bm{X}} \sum_{n=1}^{J} \left\|\bm{y}(n) - \bm{h}_n\right\|^2 + \sum_{n=J+1}^{N} \left\|\bm{y}(n) - \bm{H}\bm{x}(n)\right\|^2, \tag{13}$$

where the first term belongs to the pilot observations and the second term refers to the observations obtained from the payload symbols. In [16], the EM algorithm is introduced to solve this channel estimation problem in terms of maximum likelihood, whereas in [11] a semi-blind method is derived based on utilizing the subspace $\mathrm{range}(\bm{H}) = \mathrm{range}(\bm{V})$. Now, given the subspace $\mathrm{range}(\bm{V})$, we can reformulate (13) as [11]

$$\min_{\bm{S}, \bm{X}} \sum_{n=1}^{J} \left\|\bm{y}(n) - \bm{V}\bm{s}_n\right\|^2 + \sum_{n=J+1}^{N} \left\|\bm{y}(n) - \bm{V}\bm{S}\bm{x}(n)\right\|^2, \tag{14}$$

with $\bm{H} = \bm{V}\bm{S}$ and $\bm{S} = [\bm{s}_1, \dots, \bm{s}_J] \in \mathbb{C}^{J \times J}$. The optimal solution for $\bm{X}$ is given as

\begin{align}
\bm{x}^{*}(n) &= (\bm{S}^{\mathrm{H}}\bm{V}^{\mathrm{H}}\bm{V}\bm{S})^{-1}\bm{S}^{\mathrm{H}}\bm{V}^{\mathrm{H}}\bm{y}(n) \tag{15} \\
&= (\bm{S}^{\mathrm{H}}\bm{S})^{-1}\bm{S}^{\mathrm{H}}\bm{V}^{\mathrm{H}}\bm{y}(n). \tag{16}
\end{align}

Reinserting this solution into (14), the second sum no longer depends on $\bm{S}$ and the objective simplifies to

$$\min_{\bm{S}} \sum_{n=1}^{J} \left\|\bm{y}(n) - \bm{V}\bm{s}_n\right\|^2 + \sum_{n=J+1}^{N} \left\|\bm{y}(n) - \bm{V}\bm{V}^{\mathrm{H}}\bm{y}(n)\right\|^2, \tag{17}$$

where only the first sum depends on $\bm{S}$. Thus, the ML problem for the user of interest results in [11]

$$\min_{\bm{s}} \|\bm{y}_p - \bm{V}\bm{s}\|^2, \tag{18}$$

with the closed-form solution $\hat{\bm{h}}_{\text{ML}} = \bm{V}\bm{V}^{\mathrm{H}}\bm{y}_p$. The MSE of this estimator is

$$\mathrm{MSE}^{\mathrm{ML}} = \mathbb{E}\left[\|\bm{h} - \bm{V}\bm{V}^{\mathrm{H}}\bm{y}_p\|^2\right] = J\sigma^2. \tag{19}$$
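Two steps of this derivation can be checked numerically: the projector $\bm{V}\bm{S}(\bm{S}^{\mathrm{H}}\bm{S})^{-1}\bm{S}^{\mathrm{H}}\bm{V}^{\mathrm{H}}$ reduces to $\bm{V}\bm{V}^{\mathrm{H}}$ for invertible $\bm{S}$, and the projection estimator attains an MSE of $J\sigma^2$ under perfect subspace knowledge. The following Monte Carlo sketch uses illustrative dimensions and i.i.d. placeholder channels, which are assumptions and not the setup of the paper.

```python
import numpy as np

def crandn(shape, rng):
    """i.i.d. circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(2)
M, J, sigma2, trials = 32, 4, 0.2, 20000  # illustrative values (assumptions)

H = crandn((M, J), rng)                              # placeholder channel matrix
V, _, _ = np.linalg.svd(H, full_matrices=False)      # M x J left singular vectors
Pi = V @ V.conj().T                                  # orthogonal projector onto range(H)

# With H = V S and invertible S, the least squares projector equals V V^H.
S = V.conj().T @ H
Pi_S = V @ S @ np.linalg.inv(S.conj().T @ S) @ S.conj().T @ V.conj().T

# Monte Carlo estimate of the MSE in (19) for the first user.
h = H[:, 0]
y_p = h + np.sqrt(sigma2) * crandn((trials, M), rng)  # one pilot observation per row
h_ml = y_p @ Pi.T                                     # row-wise application of Pi
mse_ml = np.mean(np.sum(np.abs(h_ml - h) ** 2, axis=1))

print(mse_ml, J * sigma2)  # empirical and analytic MSE
```

Since $\bm{h}$ lies in $\mathrm{range}(\bm{V})$, the projection leaves the channel untouched and only keeps the $J$-dimensional part of the noise, which explains the factor $J$ instead of $M$.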

In the following, we introduce two channel estimation strategies that combine the information provided by $\bm{C}$ and $\mathrm{range}(\bm{V})$.

III-A Subspace Channel Estimator

Using the information in $\mathrm{range}(\bm{V})$, we can solve the estimation within the subspace as previously proposed in [26]. For this, the pilot system model in (4) is projected into the $J$-dimensional subspace as

$$\bm{y}' = \bm{V}^{\mathrm{H}}\bm{y}_p = \bm{V}^{\mathrm{H}}\bm{h} + \bm{V}^{\mathrm{H}}\bm{n} = \bm{h}' + \bm{n}'. \tag{20}$$

Under the assumption that $\bm{V}$ is chosen independently of $\bm{h}$, the distribution $\bm{h}' \sim \mathcal{N}_{\mathbb{C}}(\bm{0}, \bm{V}^{\mathrm{H}}\bm{C}\bm{V})$ can be used to formulate the estimate [26]

$$\hat{\bm{h}}' = \bm{V}^{\mathrm{H}}\bm{C}\bm{V}\left(\bm{V}^{\mathrm{H}}\bm{C}\bm{V} + \sigma^2\mathbf{I}_J\right)^{-1}\bm{V}^{\mathrm{H}}\bm{y}_p. \tag{21}$$

One should note that, by design, $\bm{V}$ actually depends on $\bm{h}$ and, hence, (21) is a suboptimal but feasible estimate for $\bm{h}'$. After solving the estimation in the subspace for $\bm{h}'$, the solution can be transformed back using [26]

$$\hat{\bm{h}}_{\text{sub}} = \bm{V}\hat{\bm{h}}'. \tag{22}$$
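The three steps in (20) to (22) can be sketched in a few lines of Python (NumPy). The function name `subspace_lmmse` and the toy dimensions are assumptions made for illustration. For $\bm{C} = \mathbf{I}_M$, the result should coincide with a scaled projection of the pilot observation, which the sketch uses as a reference.

```python
import numpy as np

def subspace_lmmse(V, C, y_p, sigma2):
    """Subspace LMMSE estimate following (20)-(22); names are illustrative."""
    J = V.shape[1]
    C_sub = V.conj().T @ C @ V                 # J x J covariance of h' = V^H h
    y_sub = V.conj().T @ y_p                   # projected observation y' in (20)
    h_sub = C_sub @ np.linalg.solve(C_sub + sigma2 * np.eye(J), y_sub)  # (21)
    return V @ h_sub                           # back-transformation (22)

rng = np.random.default_rng(1)
M, J, sigma2 = 16, 4, 0.5                      # toy values, not from the paper
A = rng.standard_normal((M, J)) + 1j * rng.standard_normal((M, J))
V, _ = np.linalg.qr(A)                         # some orthonormal basis (M x J)
y_p = rng.standard_normal(M) + 1j * rng.standard_normal(M)

h_hat = subspace_lmmse(V, np.eye(M), y_p, sigma2)
h_ref = V @ (V.conj().T @ y_p) / (1 + sigma2)  # expected result for C = I
print(np.allclose(h_hat, h_ref))
```

Solving the $J \times J$ system with `np.linalg.solve` avoids forming an explicit inverse, which is one possible design choice rather than a requirement of the method.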

III-B Projected Channel Estimator

As an alternative approach, we propose the use of the orthogonal subspace projection $\bm{P}_{\bm{H}} = \bm{V}\bm{V}^{\mathrm{H}}$ as a preprocessing filter. Since $\bm{h}$ lies in $\mathrm{range}(\bm{V})$, the projector leaves it unchanged, i.e., $\bm{P}_{\bm{H}}\bm{h} = \bm{h}$, and the resulting projected observation is given by

$$\tilde{\bm{y}} = \bm{P}_{\bm{H}}\bm{y}_p = \bm{h} + \bm{P}_{\bm{H}}\bm{n} = \bm{h} + \tilde{\bm{n}}. \tag{23}$$

To formulate the CME $\hat{\bm{h}} = \mathbb{E}\left[\bm{h} \mid \tilde{\bm{y}}\right]$, we need the statistics of the noise $\tilde{\bm{n}}$, whose covariance matrix is

$$\bm{C}_{\tilde{\bm{n}}} = \mathbb{E}\left[\tilde{\bm{n}}\tilde{\bm{n}}^{\mathrm{H}}\right] = \mathbb{E}\left[\sigma^2\bm{P}_{\bm{H}}\right]. \tag{24}$$

To get an intuitive understanding of (24), let us consider a scenario involving spatially uncorrelated channels, meaning that path gains and channel directions are uncorrelated. This is the case when users are uniformly distributed over the directions, e.g., in the spatial channel model of Section II-A, which results in a channel covariance matrix of the scenario that is a scaled identity [37, Def. 2.3]. In this case, the eigenvector matrices of the sample covariance matrix of such channels are Haar distributed [38, Chap. 1], i.e., uniformly distributed on the manifold of unitary matrices. Assuming spatially uncorrelated channels, (24) results in

$$\bm{C}_{\tilde{\bm{n}}} = \sigma^2\frac{J}{M}\mathbf{I}_M, \tag{25}$$

which we assume to hold for the remainder of this section. We can then formulate the projected LMMSE estimator as

$$\hat{\bm{h}}_{\text{proj}} = \bm{C}\left(\bm{C} + \bm{C}_{\tilde{\bm{n}}}\right)^{-1}\tilde{\bm{y}}. \tag{26}$$
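The following Python (NumPy) sketch illustrates both (25) and (26) under stated assumptions. It draws subspaces from the QR decomposition of i.i.d. complex Gaussian matrices, which is assumed here to yield uniformly distributed subspaces, averages the projector to compare against $\frac{J}{M}\mathbf{I}_M$, and then applies the projected LMMSE filter. The name `projected_lmmse` and all parameter values are hypothetical.

```python
import numpy as np

def projected_lmmse(V, C, y_p, sigma2):
    """Projected LMMSE estimate (26) with the noise covariance model (25)."""
    M, J = V.shape
    y_tilde = V @ (V.conj().T @ y_p)            # projected observation (23)
    C_noise = sigma2 * J / M * np.eye(M)        # assumption (25)
    return C @ np.linalg.solve(C + C_noise, y_tilde)

rng = np.random.default_rng(2)
M, J, sigma2, draws = 16, 4, 0.5, 500           # toy values, not from the paper

# Monte Carlo average of the projector over randomly drawn subspaces
P_mean = np.zeros((M, M), dtype=complex)
for _ in range(draws):
    A = rng.standard_normal((M, J)) + 1j * rng.standard_normal((M, J))
    V, _ = np.linalg.qr(A)
    P_mean += V @ V.conj().T / draws
max_dev = np.abs(P_mean - J / M * np.eye(M)).max()
print(f"largest deviation of E[P_H] from (J/M) I: {max_dev:.3f}")

# Apply the estimator; for C = I it reduces to a scaled projection
y_p = rng.standard_normal(M) + 1j * rng.standard_normal(M)
h_hat = projected_lmmse(V, np.eye(M), y_p, sigma2)
h_ref = V @ (V.conj().T @ y_p) / (1 + sigma2 * J / M)
print(np.allclose(h_hat, h_ref))
```

Note that the filter in (26) operates in the full $M$-dimensional space, in contrast to the $J$-dimensional system solved by the subspace variant.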

III-C Performance Analysis

If (25) holds, the MSE of the proposed projected channel estimator can directly be written as (cf. Appendix A)

\begin{align}
\mathrm{MSE}^{\mathrm{proj}} &= \mathrm{tr}\left(\bm{C} - \bm{C}\left(\bm{C} + \sigma^2\frac{J}{M}\mathbf{I}_M\right)^{-1}\bm{C}\right) \tag{27} \\
&= \sum_{i=1}^{M}\frac{\rho_i\sigma^2}{\frac{M}{J}\rho_i + \sigma^2}. \tag{28}
\end{align}

Comparing the performance to the plain LMMSE we see that

$$\frac{\rho_i\sigma^2}{\frac{M}{J}\rho_i + \sigma^2} \leq \frac{\rho_i\sigma^2}{\rho_i + \sigma^2}, \tag{29}$$

holds for every $i = 1, \dots, M$, resulting in $\mathrm{MSE}^{\mathrm{proj}} \leq \mathrm{MSE}^{\mathrm{plain}}$, cf. (12). For $\rho_i > 0$, (29) holds with equality only if $J = M$. Additionally, we can compare the MSE of the projected LMMSE to (19) by reformulating (28) as

$$\mathrm{MSE}^{\mathrm{proj}} = \frac{J}{M}\sigma^2\sum_{i=1}^{M}\frac{\rho_i}{\rho_i + \frac{J}{M}\sigma^2} \leq J\sigma^2 = \mathrm{MSE}^{\mathrm{ML}}. \tag{30}$$

To compare the projected variant to the subspace LMMSE, let us consider uncorrelated Rayleigh fading such that the channel covariance matrix is given as $\bm{C} = \mathbf{I}_M$. The MSE of the projected variant results in

$$\mathrm{MSE}^{\mathrm{proj}}_{\mathrm{iid}} = \frac{JM\sigma^2}{M + J\sigma^2}. \tag{31}$$

For the case of $\bm{C} = \mathbf{I}_M$, the subspace LMMSE estimator reduces to

$$\hat{\bm{h}}_{\mathrm{sub}} = \frac{1}{1+\sigma^2}\bm{V}\bm{V}^{\mathrm{H}}\bm{y}_p, \tag{32}$$

with the corresponding MSE (cf. Appendix B)

$$\mathrm{MSE}^{\mathrm{sub}}_{\mathrm{iid}} = \frac{\sigma^2(M\sigma^2 + J)}{(1+\sigma^2)^2} \geq \mathrm{MSE}^{\mathrm{proj}}_{\mathrm{iid}}. \tag{33}$$

Fig. 1 shows the performance of the individual channel estimators based on perfect genie knowledge for the case of uncorrelated Rayleigh fading. The projected channel estimator outperforms all other estimators across the whole signal-to-noise ratio (SNR) range. Further, we observe that the subspace LMMSE converges to the ML method from above at high SNR.

Figure 1: MSE over the SNR for the considered channel estimators based on perfect subspace and perfect statistical knowledge in a scenario with $J = 8$ users and $M = 64$ antennas under uncorrelated Rayleigh fading.
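The closed-form expressions for uncorrelated Rayleigh fading can be evaluated directly. The sketch below, in plain Python, tabulates (19), (31), and (33) together with the plain LMMSE value $M\sigma^2/(1+\sigma^2)$ obtained by summing the right-hand side of (29) for $\rho_i = 1$. It assumes the SNR is defined as $1/\sigma^2$, which is an assumption of this sketch, and the function name is illustrative.

```python
def mse_closed_forms(M, J, sigma2):
    """Closed-form MSEs for C = I (uncorrelated Rayleigh fading)."""
    return {
        "plain": M * sigma2 / (1 + sigma2),                    # sum of RHS of (29), rho_i = 1
        "ml": J * sigma2,                                      # (19)
        "sub": sigma2 * (M * sigma2 + J) / (1 + sigma2) ** 2,  # (33)
        "proj": J * M * sigma2 / (M + J * sigma2),             # (31)
    }

M, J = 64, 8                                   # setting of Fig. 1
results = {}
for snr_db in range(-10, 31, 10):
    sigma2 = 10 ** (-snr_db / 10)              # assumes SNR = 1 / sigma^2
    results[snr_db] = mse_closed_forms(M, J, sigma2)
    row = ", ".join(f"{name}={val:.4g}" for name, val in results[snr_db].items())
    print(f"SNR {snr_db:>3} dB: {row}")
```

For every SNR value in this grid, the projected variant attains the smallest closed-form MSE, in line with (29), (30), and (33).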

IV Proposed Semi-Blind Channel Estimation - Utilizing Generative Prior

In practice, (10) cannot be utilized directly, as it requires the channels to be Gaussian distributed with a known distribution. In general, the CME given the pilot observation $\bm{y}_p$ is formulated as

$$\hat{\bm{h}}_{\text{CME}} = \mathbb{E}\left[\bm{h} \mid \bm{y}_p\right] = \int \bm{h}\,\frac{p_{\bm{n}}(\bm{y}_p - \bm{h})\,p(\bm{h})}{p(\bm{y}_p)}\,\mathrm{d}\bm{h}. \tag{34}$$

As can be seen in (34), the CME generally cannot be computed analytically. First, the CME needs access to $p(\bm{h})$, which is generally unavailable in practice. Second, the integral in (34) generally has no closed-form solution.

In order to reformulate the CME, we first use the property that for any arbitrarily distributed random variable $\bm{h}$, we can always find a condition $\bm{c}$ that makes the conditional distribution Gaussian. Second, it has been shown in [39] that, for wireless communication channels, this conditional Gaussian distribution preserves the zero-mean property as

$$\bm{h} \mid \bm{c} \sim \mathcal{N}_{\mathbb{C}}(\bm{0}, \bm{C}_{\bm{h}\mid\bm{c}}). \tag{35}$$

Thus, we can reformulate the CME as

\begin{align}
\mathbb{E}\left[\bm{h} \mid \bm{y}_p\right] &= \mathbb{E}\left[\mathbb{E}\left[\bm{h} \mid \bm{y}_p, \bm{c}\right] \mid \bm{y}_p\right] \tag{36} \\
&= \int \mathbb{E}\left[\bm{h} \mid \bm{y}_p, \bm{c}\right] p(\bm{c} \mid \bm{y}_p)\,\mathrm{d}\bm{c} \tag{37} \\
&\approx \int \hat{\bm{h}}_{\bm{c}}(\bm{y}_p)\,p(\bm{c} \mid \bm{y}_p)\,\mathrm{d}\bm{c}, \tag{38}
\end{align}

where

$$\hat{\bm{h}}_{\bm{c}}(\bm{y}_p) = \bm{C}_{\bm{h}\mid\bm{c}}\left(\bm{C}_{\bm{h}\mid\bm{c}} + \bm{C}_{\bm{n}}\right)^{-1}\bm{y}_p \tag{39}$$

denotes the LMMSE estimate given $\bm{c}$. However, finding a suitable condition $\bm{c}$ can be challenging, in particular because the true distribution of $\bm{h}$ is unknown. To this end, CGLMs were proposed in [30, 31, 32], which approximate the CME based on a GMM, a mixture of factor analyzers (MFA), and a VAE, respectively. All three methods learn a model that provides the conditional Gaussian distribution $\bm{h} \mid \bm{c}$ based on a discrete (GMM, MFA) or continuous (VAE) latent variable $\bm{c}$. In this work, we focus on the GMM and the VAE, which we adapt to semi-blind channel estimation in the following.

IV-A GMM-based Semi-blind Channel Estimation

Based on the universal approximation property of GMMs [40], the PDF of $\bm{h}$ is approximated by

$$f_{\bm{h}}^{(K)}(\bm{h}) = \sum_{k=1}^{K} p(k)\,\mathcal{N}_{\mathbb{C}}(\bm{h}; \bm{\mu}_k, \bm{C}_k), \tag{40}$$

where $p(k)$, $\bm{\mu}_k$, and $\bm{C}_k$ are the mixing coefficient, mean, and covariance matrix of the $k$-th GMM component, respectively. As we are considering wireless channels, the mean of each component is set to $\bm{\mu}_k = \bm{0}$, cf. [39]. The components in (40) are fitted with the well-known EM algorithm [41] based on a set $\mathcal{H} = \{\bm{h}_t\}_{t=1}^{T}$ of $T$ channel samples as training data. Based on the formulation in (40), the conditional PDF given $k$ is

$$\bm{h} \mid k \sim \mathcal{N}_{\mathbb{C}}(\bm{h}; \bm{0}, \bm{C}_k). \tag{41}$$

Thus, in the case of a GMM we have a discrete latent variable, which helps us parameterize the CME. The resulting semi-blind subspace GMM can be formulated as

$$\hat{\bm{h}}_{\text{sub. GMM}} = \bm{V}\hat{\bm{h}}'_{\text{GMM}} = \bm{V}\sum_{k=1}^{K} p(k \mid \bm{y}')\,\hat{\bm{h}}'_{\text{GMM},k}, \tag{42}$$

with

$$\hat{\bm{h}}'_{\text{GMM},k} = \bm{V}^{\mathrm{H}}\bm{C}_k\bm{V}\left(\bm{V}^{\mathrm{H}}\bm{C}_k\bm{V} + \sigma^2\mathbf{I}_J\right)^{-1}\bm{V}^{\mathrm{H}}\bm{y}_p, \tag{43}$$

and the corresponding responsibilities

$$p(k \mid \bm{y}') = \frac{p(k)\,\mathcal{N}_{\mathbb{C}}\left(\bm{y}'; \bm{0}, \bm{V}^{\mathrm{H}}\bm{C}_k\bm{V} + \sigma^2\mathbf{I}_J\right)}{\sum_{i=1}^{K} p(i)\,\mathcal{N}_{\mathbb{C}}\left(\bm{y}'; \bm{0}, \bm{V}^{\mathrm{H}}\bm{C}_i\bm{V} + \sigma^2\mathbf{I}_J\right)}. \tag{44}$$
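A minimal Python (NumPy) sketch of the subspace GMM estimator in (42) to (44) is given below. It assumes that the GMM covariances and mixing coefficients are already available (random toy matrices stand in for a trained model here) and evaluates the responsibilities in the log domain for numerical stability, which is an implementation choice of this sketch. All function names are hypothetical.

```python
import numpy as np

def log_cn_pdf(y, C):
    """log N_C(y; 0, C) for a Hermitian positive definite covariance C."""
    L = np.linalg.cholesky(C)
    z = np.linalg.solve(L, y)                       # z = L^{-1} y
    log_det = 2.0 * np.sum(np.log(np.real(np.diag(L))))
    return -len(y) * np.log(np.pi) - log_det - np.real(np.vdot(z, z))

def subspace_gmm_estimate(V, covs, weights, y_p, sigma2):
    """Subspace GMM estimate (42) with components (43) and responsibilities (44)."""
    J = V.shape[1]
    y_sub = V.conj().T @ y_p
    log_post, comps = [], []
    for p_k, C_k in zip(weights, covs):
        C_sub = V.conj().T @ C_k @ V
        C_y = C_sub + sigma2 * np.eye(J)
        log_post.append(np.log(p_k) + log_cn_pdf(y_sub, C_y))  # log numerator of (44)
        comps.append(C_sub @ np.linalg.solve(C_y, y_sub))      # (43)
    log_post = np.array(log_post)
    resp = np.exp(log_post - log_post.max())        # normalize in the log domain
    resp /= resp.sum()
    h_sub = sum(r * h for r, h in zip(resp, comps))
    return V @ h_sub, resp

rng = np.random.default_rng(3)
M, J, K, sigma2 = 16, 4, 2, 0.5                     # toy values, not from the paper
covs = []
for _ in range(K):                                  # stand-ins for trained covariances
    B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
    covs.append(B @ B.conj().T / M)
weights = np.array([0.3, 0.7])
V, _ = np.linalg.qr(rng.standard_normal((M, J)) + 1j * rng.standard_normal((M, J)))
y_p = rng.standard_normal(M) + 1j * rng.standard_normal(M)

h_hat, resp = subspace_gmm_estimate(V, covs, weights, y_p, sigma2)
print(resp, resp.sum())
```

If all components share the same covariance, the responsibilities reduce to the mixing coefficients and the estimate coincides with the single-component subspace LMMSE estimate.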

The projected GMM is

$$\hat{\bm{h}}_{\text{proj. GMM}} = \sum_{k=1}^{K} p(k \mid \tilde{\bm{y}})\,\hat{\bm{h}}_{\text{proj. GMM},k}, \tag{45}$$

with

$$\hat{\bm{h}}_{\text{proj. GMM},k} = \bm{C}_k\left(\bm{C}_k + \bm{C}_{\tilde{\bm{n}}}\right)^{-1}\tilde{\bm{y}} \tag{46}$$

and the associated responsibilities

$$p(k \mid \tilde{\bm{y}}) = \frac{p(k)\,\mathcal{N}_{\mathbb{C}}\left(\tilde{\bm{y}}; \bm{0}, \bm{C}_k + \bm{C}_{\tilde{\bm{n}}}\right)}{\sum_{i=1}^{K} p(i)\,\mathcal{N}_{\mathbb{C}}\left(\tilde{\bm{y}}; \bm{0}, \bm{C}_i + \bm{C}_{\tilde{\bm{n}}}\right)}. \tag{47}$$
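The projected GMM estimator in (45) to (47) can be sketched analogously in Python (NumPy). The sketch assumes the noise covariance model $\bm{C}_{\tilde{\bm{n}}} = \sigma^2\frac{J}{M}\mathbf{I}_M$ from (25) and uses random toy covariances in place of a trained GMM. The constant factor $\pi^{-M}$ of the Gaussian densities is omitted because it cancels in (47). The function name is illustrative.

```python
import numpy as np

def projected_gmm_estimate(V, covs, weights, y_p, sigma2):
    """Projected GMM estimate (45) with components (46) and responsibilities (47)."""
    M, J = V.shape
    y_tilde = V @ (V.conj().T @ y_p)               # projected observation (23)
    C_noise = sigma2 * J / M * np.eye(M)           # noise model (25), an assumption
    log_post, comps = [], []
    for p_k, C_k in zip(weights, covs):
        C_y = C_k + C_noise
        _, log_det = np.linalg.slogdet(C_y)        # log|det(C_y)|, real-valued
        w = np.linalg.solve(C_y, y_tilde)          # C_y^{-1} y_tilde
        quad = np.real(np.vdot(y_tilde, w))        # y_tilde^H C_y^{-1} y_tilde
        log_post.append(np.log(p_k) - log_det - quad)
        comps.append(C_k @ w)                      # (46)
    log_post = np.array(log_post)
    resp = np.exp(log_post - log_post.max())       # stable normalization of (47)
    resp /= resp.sum()
    h_hat = sum(r * h for r, h in zip(resp, comps))
    return h_hat, resp

rng = np.random.default_rng(4)
M, J, K, sigma2 = 16, 4, 2, 0.5                    # toy values, not from the paper
covs = []
for _ in range(K):                                 # stand-ins for trained covariances
    B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
    covs.append(B @ B.conj().T / M)
weights = np.array([0.4, 0.6])
V, _ = np.linalg.qr(rng.standard_normal((M, J)) + 1j * rng.standard_normal((M, J)))
y_p = rng.standard_normal(M) + 1j * rng.standard_normal(M)

h_hat, resp = projected_gmm_estimate(V, covs, weights, y_p, sigma2)
print(resp, resp.sum())
```

In contrast to the subspace variant, each component requires solving an $M \times M$ system; since $\bm{C}_k + \bm{C}_{\tilde{\bm{n}}}$ does not depend on the observation under (25), these filters could in principle be precomputed for a given $\sigma^2$ and $J$.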

The respective estimators are summarized in Algorithm 1 and Algorithm 2.

Algorithm 1 Subspace GMM Channel Estimator

Offline Training Phase

1:Training dataset $\mathcal{H} = \{\bm{h}_t\}_{t=1}^{T}$
2:Fit the GMM with the EM algorithm, cf. [30]  
3:
4:$\bm{Y} = [\bm{y}(1), \dots, \bm{y}(N)]$, $\bm{P}$, $\sigma^2$
5:$\hat{\bm{C}}_{\bm{y}\mid\bm{H}} \leftarrow \frac{1}{N}\bm{Y}\bm{Y}^{\mathrm{H}}$
6:$\hat{\bm{V}} \leftarrow J$ dominant eigenvectors of $\hat{\bm{C}}_{\bm{y}\mid\bm{H}}$
7:$\bm{Y}_p = [\bm{y}_{p,1}, \dots, \bm{y}_{p,J}] \leftarrow \bm{Y}'_p\bm{P}^{\mathrm{H}}$
8:for $j = 1, \dots, J$ do
9:     $\bm{y}' \leftarrow \bm{V}^{\mathrm{H}}\bm{y}_{p,j}$
10:     for $k = 1, \dots, K$ do
11:         $\hat{\bm{h}}'_k \leftarrow \bm{V}^{\mathrm{H}}\bm{C}_k\bm{V}\left(\bm{V}^{\mathrm{H}}\bm{C}_k\bm{V} + \sigma^2\mathbf{I}_J\right)^{-1}\bm{y}'$
12:     end for
13:     $\hat{\bm{h}}_j \leftarrow \bm{V}\sum_{k=1}^{K} p(k \mid \bm{y}')\,\hat{\bm{h}}'_k$
14:end for
15:return $\hat{\bm{h}}_j, \forall j = 1, \dots, J$
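Steps 5 to 7 of Algorithm 1, i.e., the subspace estimation from all received symbols and the pilot decorrelation, can be sketched in Python (NumPy) as follows. The sketch assumes a unitary $J \times J$ pilot matrix $\bm{P}$ (a normalized DFT matrix is used as a stand-in), pilots in the first $J$ columns of $\bm{Y}$, QPSK payload symbols, and a noiseless toy setting so that the result can be checked exactly. These choices and all names are illustrative assumptions, not the setup of the paper.

```python
import numpy as np

def estimate_subspace(Y, J):
    """Steps 5-6: J dominant eigenvectors of the sample covariance of Y."""
    N = Y.shape[1]
    C_hat = Y @ Y.conj().T / N                     # sample covariance (M x M)
    _, eigvecs = np.linalg.eigh(C_hat)             # eigenvalues in ascending order
    return eigvecs[:, -J:]                         # last J columns are dominant

def decorrelate_pilots(Y_p_raw, P):
    """Step 7: remove the pilot matrix, assuming P is unitary."""
    return Y_p_raw @ P.conj().T

rng = np.random.default_rng(5)
M, J, N = 16, 4, 50                                # toy values, not from the paper
H = rng.standard_normal((M, J)) + 1j * rng.standard_normal((M, J))
P = np.fft.fft(np.eye(J)) / np.sqrt(J)             # unitary DFT pilots (assumption)
S = (rng.choice([-1, 1], size=(J, N - J))
     + 1j * rng.choice([-1, 1], size=(J, N - J))) / np.sqrt(2)   # QPSK payload
Y = H @ np.hstack([P, S])                          # noiseless received symbols

V_hat = estimate_subspace(Y, J)
Y_p = decorrelate_pilots(Y[:, :J], P)

Q, _ = np.linalg.qr(H)                             # reference basis of range(H)
print(np.allclose(V_hat @ V_hat.conj().T, Q @ Q.conj().T), np.allclose(Y_p, H))
```

With noise, the estimated subspace only approximates $\mathrm{range}(\bm{H})$, and the quality of the approximation depends on the number of received symbols $N$ and the SNR.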
Algorithm 2 Projected GMM Channel Estimator

Offline Training Phase

1:Training dataset $\mathcal{H} = \{\bm{h}_t\}_{t=1}^{T}$
2:Fit the GMM with the EM algorithm, cf. [30]  
3:
4:$\bm{Y} = [\bm{y}(1), \dots, \bm{y}(N)]$, $\bm{P}$, $\sigma^2$
5:$\hat{\bm{C}}_{\bm{y} \mid \bm{H}} \leftarrow \frac{1}{N} \bm{Y} \bm{Y}^{\mathrm{H}}$
6:$\hat{\bm{V}} \leftarrow J$ dominant eigenvectors of $\hat{\bm{C}}_{\bm{y} \mid \bm{H}}$
7:$\bm{Y}_p = [\bm{y}_{p,1}, \dots, \bm{y}_{p,J}] \leftarrow \bm{Y}'_p \bm{P}^{\mathrm{H}}$
8:for $j = 1, \dots, J$ do
9:     $\tilde{\bm{y}} \leftarrow \bm{V} \bm{V}^{\mathrm{H}} \bm{y}_{p,j}$
10:     for $k = 1, \dots, K$ do
11:         $\hat{\bm{h}}_k \leftarrow \bm{C}_k \left( \bm{C}_k + \bm{C}_{\tilde{\bm{n}}} \right)^{-1} \tilde{\bm{y}}$
12:     end for
13:     $\hat{\bm{h}}_j \leftarrow \sum_{k=1}^{K} p(k \mid \tilde{\bm{y}}) \hat{\bm{h}}_k$
14:end for
15:return $\hat{\bm{h}}_j, \forall j = 1, \dots, J$
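The per-user loop of Algorithm 2 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the components are taken as zero-mean, the projected noise covariance is assumed to be $\bm{C}_{\tilde{\bm{n}}} = \sigma^2 \bm{V}\bm{V}^{\mathrm{H}}$, and the responsibilities $p(k \mid \tilde{\bm{y}})$ are assumed to follow from the complex Gaussian density with covariance $\bm{C}_k + \bm{C}_{\tilde{\bm{n}}}$. The function names and the `weights` argument (mixing coefficients) are hypothetical.

```python
import numpy as np

def complex_gauss_logpdf(y, C):
    """Log-density of a zero-mean circularly symmetric complex Gaussian N_C(0, C)."""
    M = y.shape[0]
    _, logdet = np.linalg.slogdet(C)
    quad = np.real(y.conj() @ np.linalg.solve(C, y))
    return -M * np.log(np.pi) - logdet - quad

def projected_gmm_estimate(y_p, V, covs, weights, sigma2):
    """One-user sketch of Algorithm 2 (zero-mean components assumed)."""
    P = V @ V.conj().T            # orthogonal projector onto range(V)
    y_t = P @ y_p                 # projected pilot observation (step 9)
    C_n = sigma2 * P              # ASSUMPTION: covariance of the projected noise
    # Responsibilities p(k | y_t), computed in the log domain for stability.
    log_post = np.array([np.log(w) + complex_gauss_logpdf(y_t, C + C_n)
                         for w, C in zip(weights, covs)])
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    # Component-wise LMMSE estimates (step 11) and their convex combination (step 13).
    h_k = [C @ np.linalg.solve(C + C_n, y_t) for C in covs]
    h_hat = sum(p * h for p, h in zip(post, h_k))
    return h_hat, post
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical design choice of this sketch; since $\bm{C}_{\tilde{\bm{n}}}$ depends on the current subspace, the filters are formed online here.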
Figure 2: Structure of a semi-blind VAE. The matrix $\hat{\bm{V}}$ contains the $J$ dominant eigenvectors of (57). The encoder and decoder represent DNNs.

IV-B VAE-based Semi-blind Channel Estimation

To learn the unknown distribution $f_{\bm{h}}(\bm{h})$ using a VAE, we lower bound the parameterized likelihood $p_{\bm{\theta}}(\bm{h})$ using the evidence lower bound (ELBO). To formulate the ELBO, the variational distributions $q_{\bm{\phi}}(\bm{z} \mid \bm{y}')$ and $q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}})$ are introduced, which approximate $p(\bm{z} \mid \bm{y}')$ and $p(\bm{z} \mid \tilde{\bm{y}})$, respectively. In contrast to the GMM, the used subspace $\mathrm{range}(\bm{V})$ is unknown to the encoder of the VAE, making $p(\bm{z} \mid \bm{y}')$ difficult to learn. Additionally, the dimension of the encoder input would depend on the number of users in the system.
Thus, we propose to approximate both posteriors $p(\bm{z} \mid \bm{y}')$ and $p(\bm{z} \mid \tilde{\bm{y}})$ with $q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}})$. An accessible version of the ELBO for this case can be written as [42]

$$\mathcal{L}_{\bm{\theta},\bm{\phi}} = \mathbb{E}_{q_{\bm{\phi}}}\left[\log p_{\bm{\theta}}(\bm{h} \mid \bm{z})\right] - \mathrm{D}_{\mathrm{KL}}\left(q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}}) \,\|\, p(\bm{z})\right), \tag{48}$$

where $\mathbb{E}_{q_{\bm{\phi}}}[\cdot] = \mathbb{E}_{q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}})}[\cdot]$ is the expectation over the variational distribution $q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}})$. The second term in (48) is the Kullback-Leibler (KL) divergence

$$\mathrm{D}_{\mathrm{KL}}\left(q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}}) \,\|\, p(\bm{z})\right) = \mathbb{E}_{q_{\bm{\phi}}}\left[\log\left(\frac{q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}})}{p(\bm{z})}\right)\right]. \tag{49}$$

In the VAE framework, the ELBO is optimized using deep neural networks (DNNs) and the reparameterization trick [42]. In order to do so, the involved distributions are defined as

$$\begin{aligned}
p(\bm{z}) &= \mathcal{N}(\bm{0}, \mathbf{I}_Z), \\
p_{\bm{\theta}}(\bm{h} \mid \bm{z}) &= \mathcal{N}_{\mathbb{C}}(\bm{\mu}_{\bm{\theta}}(\bm{z}), \bm{C}_{\bm{\theta}}(\bm{z})), \\
q_{\bm{\phi}}(\bm{z} \mid \tilde{\bm{y}}) &= \mathcal{N}(\bm{\mu}_{\bm{\phi}}(\tilde{\bm{y}}), \mathrm{diag}(\bm{\sigma}^2_{\bm{\phi}}(\tilde{\bm{y}}))).
\end{aligned} \tag{50}$$
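With a standard normal prior and a diagonal Gaussian variational distribution as in (50), the KL term (49) has the well-known closed form $\frac{1}{2}\sum_i \left(\sigma_i^2 + \mu_i^2 - 1 - \log \sigma_i^2\right)$, and samples of $\bm{z}$ can be drawn via the reparameterization trick. The following NumPy sketch illustrates both for a real-valued latent vector; the function names are hypothetical, and a training implementation would use an automatic differentiation framework instead.

```python
import numpy as np

def kl_diag_gauss_to_std_normal(mu, var):
    """KL( N(mu, diag(var)) || N(0, I) ) for a real-valued latent vector."""
    return 0.5 * np.sum(var + mu**2 - 1.0 - np.log(var))

def reparameterize(mu, var, rng):
    """Draw z = mu + sqrt(var) * eps with eps ~ N(0, I).

    The randomness is isolated in eps, so z is a deterministic,
    differentiable function of the encoder outputs mu and var.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.sqrt(var) * eps
```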

The resulting semi-blind VAE structure is shown in Fig. 2. In the case of a ULA or URA at the BS the channel covariance matrix is either Toeplitz or block-Toeplitz, respectively. As shown in [39], the conditional covariance matrix at the output of the VAE preserves this structure. Thus, we parameterize the output covariance matrix as

$$\bm{C}_{\bm{\theta}}(\bm{z}) = \bm{Q}^{\mathrm{H}} \mathrm{diag}(\bm{c}_{\bm{\theta}}(\bm{z})) \bm{Q}, \tag{51}$$

where $\bm{Q} = \bm{Q}_M$ or $\bm{Q} = \bm{Q}'_{N_v} \otimes \bm{Q}'_{N_h}$, respectively, where $\bm{Q}_M$ is a DFT matrix of size $M$ resulting in a circulant approximation, cf. [32], and $\bm{Q}'_{N_x}$ contains the first $N_x$ columns of the $2N_x \times 2N_x$ DFT matrix resulting in a block-Toeplitz parameterization, cf. [43]. Further, for the (block-)Toeplitz parameterization we can set $\bm{\mu}_{\bm{\theta}}(\bm{z}) = \bm{0}$, cf. [39].
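To make the parameterization (51) concrete, the sketch below builds the three variants of $\bm{Q}$ and the resulting covariance matrix in NumPy. It assumes unitary scaling of the DFT matrix and a real, non-negative vector `c` standing in for the decoder output $\bm{c}_{\bm{\theta}}(\bm{z})$; all function names are hypothetical.

```python
import numpy as np

def dft_matrix(n):
    """Unitary n x n DFT matrix."""
    return np.fft.fft(np.eye(n)) / np.sqrt(n)

def toeplitz_factor(n):
    """First n columns of the 2n x 2n DFT matrix (shape 2n x n)."""
    return dft_matrix(2 * n)[:, :n]

def ura_factor(n_v, n_h):
    """Kronecker factor for a URA with n_v x n_h antennas (block-Toeplitz case)."""
    return np.kron(toeplitz_factor(n_v), toeplitz_factor(n_h))

def structured_cov(Q, c):
    """C = Q^H diag(c) Q for a real, non-negative vector c with len(c) == Q.shape[0]."""
    return Q.conj().T @ (c[:, None] * Q)
```

Note that in the (block-)Toeplitz case the vector `c` has more entries than the covariance matrix has rows, e.g., $2M$ entries for an $M$-antenna ULA, since $\bm{Q}$ is a tall matrix.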

After successfully training the VAE, the output is a local parameterization of $f_{\bm{h}}(\bm{h})$ as conditionally Gaussian

$$\bm{h} \mid \bm{z} \sim p_{\bm{\theta}}(\bm{h} \mid \bm{z}). \tag{52}$$

As analyzed in [32] it is a reasonable approximation to set

$$p(\bm{z} \mid \tilde{\bm{y}}) = \begin{cases} 1 & \text{if } \bm{z} = \bm{\mu}_{\bm{\phi}}(\tilde{\bm{y}}), \\ 0 & \text{otherwise}. \end{cases} \tag{53}$$

Based on this parameterization we can formulate the semi-blind VAE-based estimators as

$$\hat{\bm{h}}_{\text{proj. VAE}} = \bm{\mu}_{\bm{\theta}}(\bm{z}) + \bm{C}_{\bm{\theta}}(\bm{z}) \left( \bm{C}_{\bm{\theta}}(\bm{z}) + \bm{C}_{\tilde{\bm{n}}} \right)^{-1} (\tilde{\bm{y}} - \bm{\mu}_{\bm{\theta}}(\bm{z})), \tag{54}$$

and

$$\begin{aligned}
\hat{\bm{h}}_{\text{sub. VAE}} = \; & \bm{V} \bm{V}^{\mathrm{H}} \bm{C}_{\bm{\theta}}(\bm{z}) \bm{V} \left( \bm{V}^{\mathrm{H}} \bm{C}_{\bm{\theta}}(\bm{z}) \bm{V} + \sigma^2 \mathbf{I}_J \right)^{-1} \\
& \times (\bm{V}^{\mathrm{H}} \bm{y}_p - \bm{V}^{\mathrm{H}} \bm{\mu}_{\bm{\theta}}(\bm{z})) - \bm{V} \bm{V}^{\mathrm{H}} \bm{\mu}_{\bm{\theta}}(\bm{z}).
\end{aligned} \tag{55}$$

The respective estimators are summarized in Algorithm 3 and Algorithm 4. For a more detailed introduction to the VAE framework and its use for parameterizing the CME, we refer the reader to [32].
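As an illustration of (54) and (55), the following NumPy sketch evaluates both estimators for given decoder outputs. It is a sketch under assumptions, not the authors' code: the arguments `mu` and `C` stand in for $\bm{\mu}_{\bm{\theta}}(\bm{z})$ and $\bm{C}_{\bm{\theta}}(\bm{z})$ evaluated at $\bm{z} = \bm{\mu}_{\bm{\phi}}(\tilde{\bm{y}})$, the projected noise covariance is assumed to be $\bm{C}_{\tilde{\bm{n}}} = \sigma^2 \bm{V}\bm{V}^{\mathrm{H}}$, and the subspace variant is written only for the zero-mean case $\bm{\mu}_{\bm{\theta}}(\bm{z}) = \bm{0}$ of the (block-)Toeplitz parameterization. The function names are hypothetical.

```python
import numpy as np

def projected_vae_estimate(y_p, V, mu, C, sigma2):
    """Eq. (54): LMMSE applied to the projected observation V V^H y_p."""
    P = V @ V.conj().T
    y_t = P @ y_p
    C_n = sigma2 * P          # ASSUMPTION: covariance of the projected noise
    return mu + C @ np.linalg.solve(C + C_n, y_t - mu)

def subspace_vae_estimate_zero_mean(y_p, V, C, sigma2):
    """Eq. (55) with mu_theta(z) = 0: LMMSE inside range(V), mapped back by V."""
    J = V.shape[1]
    C_sub = V.conj().T @ C @ V                       # J x J covariance in the subspace
    y_sub = V.conj().T @ y_p                         # J-dimensional observation
    h_sub = C_sub @ np.linalg.solve(C_sub + sigma2 * np.eye(J), y_sub)
    return V @ h_sub
```

The subspace variant only inverts a $J \times J$ matrix, whereas the projected variant works with $M \times M$ matrices; for $J = M$ both reduce to the conventional LMMSE estimator.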

Algorithm 3 Subspace VAE Channel Estimator

Offline Training Phase

1:Training dataset $\mathcal{H} = \{\bm{h}_t\}_{t=1}^{T}$
2:Fit the VAE by optimizing the ELBO, cf. [32]  
3:
4:$\bm{Y} = [\bm{y}(1), \dots, \bm{y}(N)]$, $\bm{P}$, $\sigma^2$
5:$\hat{\bm{C}}_{\bm{y} \mid \bm{H}} \leftarrow \frac{1}{N} \bm{Y} \bm{Y}^{\mathrm{H}}$
6:$\hat{\bm{V}} \leftarrow J$ dominant eigenvectors of $\hat{\bm{C}}_{\bm{y} \mid \bm{H}}$
7:$\bm{Y}_p = [\bm{y}_{p,1}, \dots, \bm{y}_{p,J}] \leftarrow \bm{Y}'_p \bm{P}^{\mathrm{H}}$
8:for $j = 1, \dots, J$ do
9:     $\tilde{\bm{y}} \leftarrow \bm{V} \bm{V}^{\mathrm{H}} \bm{y}_{p,j}$
10:     $\bm{\mu}_{\bm{\theta}}(\bm{z}), \bm{C}_{\bm{\theta}}(\bm{z}) \leftarrow \mathrm{VAE}(\tilde{\bm{y}})$
11:     $\hat{\bm{h}}_j \leftarrow \bm{V} \bm{V}^{\mathrm{H}} \bm{C}_{\bm{\theta}}(\bm{z}) \bm{V} \left( \bm{V}^{\mathrm{H}} \bm{C}_{\bm{\theta}}(\bm{z}) \bm{V} + \sigma^2 \mathbf{I}_J \right)^{-1}$
12:           $\times (\bm{V}^{\mathrm{H}} \bm{y}_p - \bm{V}^{\mathrm{H}} \bm{\mu}_{\bm{\theta}}(\bm{z})) - \bm{V} \bm{V}^{\mathrm{H}} \bm{\mu}_{\bm{\theta}}(\bm{z})$
13:end for
14:return $\hat{\bm{h}}_j, \forall j = 1, \dots, J$
Algorithm 4 Projected VAE Channel Estimator

Offline Training Phase

1:Training dataset $\mathcal{H} = \{\bm{h}_t\}_{t=1}^{T}$
2:Fit the VAE by optimizing the ELBO, cf. [32]  
3:
4:$\bm{Y} = [\bm{y}(1), \dots, \bm{y}(N)]$, $\bm{P}$, $\sigma^2$
5:$\hat{\bm{C}}_{\bm{y} \mid \bm{H}} \leftarrow \frac{1}{N} \bm{Y} \bm{Y}^{\mathrm{H}}$
6:$\hat{\bm{V}} \leftarrow J$ dominant eigenvectors of $\hat{\bm{C}}_{\bm{y} \mid \bm{H}}$
7:$\bm{Y}_p = [\bm{y}_{p,1}, \dots, \bm{y}_{p,J}] \leftarrow \bm{Y}'_p \bm{P}^{\mathrm{H}}$
8:for $j = 1, \dots, J$ do
9:     $\tilde{\bm{y}} \leftarrow \bm{V} \bm{V}^{\mathrm{H}} \bm{y}_{p,j}$
10:     $\bm{\mu}_{\bm{\theta}}(\bm{z}), \bm{C}_{\bm{\theta}}(\bm{z}) \leftarrow \mathrm{VAE}(\tilde{\bm{y}})$
11:     $\hat{\bm{h}}_j \leftarrow \bm{\mu}_{\bm{\theta}}(\bm{z}) + \bm{C}_{\bm{\theta}}(\bm{z}) \left( \bm{C}_{\bm{\theta}}(\bm{z}) + \bm{C}_{\tilde{\bm{n}}} \right)^{-1}$
12:           $\times (\tilde{\bm{y}} - \bm{\mu}_{\bm{\theta}}(\bm{z}))$
13:end for
14:return $\hat{\bm{h}}_j, \forall j = 1, \dots, J$

IV-C Maximum Likelihood Subspace Estimation

After introducing the methods utilizing the additional subspace information provided by $\mathrm{range}(\bm{V})$ to enhance the CSI estimation quality, let us consider the estimation of such a subspace. As the received data symbols are also transmitted over the same channel, they can be used to estimate the subspace containing all user channels.

To this end, let us reconsider the ML estimate of $\bm{H}$ in (13). Instead of directly optimizing this ML formulation as done in [16], which generally does not attain the MMSE, we only take this log-likelihood formulation as an intermediate step to estimate the subspace $\mathrm{range}(\bm{V})$. First, let us consider the right term of the objective function in (13). We can then reformulate the problem by again solving for $\bm{X}$ first and reinserting the solution, resulting in [11]

$$\max_{\bm{H}} \; \mathrm{tr}\left( \bm{P}_{\bm{H}} \hat{\bm{C}}^{(d)}_{\bm{y} \mid \bm{H}} \right), \tag{56}$$

where $\bm{P}_{\bm{H}} = \bm{H} (\bm{H}^{\mathrm{H}} \bm{H})^{-1} \bm{H}^{\mathrm{H}} = \bm{V} \bm{V}^{\mathrm{H}}$ and $\hat{\bm{C}}^{(d)}_{\bm{y} \mid \bm{H}} = \frac{1}{N-J} \bm{Y}_d \bm{Y}_d^{\mathrm{H}}$, with $\bm{Y}_d$ from (2).
The maximization in (56) is solved by setting $\bm{P}_{\bm{H}}$ equal to $\hat{\bm{V}}_d \hat{\bm{V}}_d^{\mathrm{H}}$ with $\hat{\bm{V}}_d$ holding the $J$ dominant eigenvectors of the receive sample covariance matrix $\hat{\bm{C}}^{(d)}_{\bm{y} \mid \bm{H}}$. This result has also been used in [26]. Additionally, it is trivial to see that the first term in (13) is minimized by $\bm{h}_n = \bm{y}(n)$.
The subspace spanned by the solution $\bm{h}_n = \bm{y}(n)$ is the same as the subspace spanned by the $J$ eigenvectors of the sample covariance matrix $\hat{\bm{C}}^{(p)}_{\bm{y} \mid \bm{H}} = \frac{1}{J} \bm{Y}_p \bm{Y}_p^{\mathrm{H}}$, which ignores the additional phase information contained in the pilot observation. Thus, the overall subspace estimate $\hat{\bm{V}} = [\bm{v}_1, \dots, \bm{v}_J]$ is found by taking the $J$ dominant eigenvectors $\bm{v}_j$ of the sample covariance matrix defined as

$$\hat{\bm{C}}_{\bm{y} \mid \bm{H}} = \frac{1}{N} \bm{Y} \bm{Y}^{\mathrm{H}}. \tag{57}$$
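The batch subspace estimate based on (57) can be sketched in a few lines of NumPy. The function name `estimate_subspace` is hypothetical, and the received symbols are assumed to be stacked column-wise in an $M \times N$ matrix.

```python
import numpy as np

def estimate_subspace(Y, J):
    """Return the J dominant eigenvectors of the sample covariance (1/N) Y Y^H.

    Y has shape (M, N): one received symbol (pilot or payload) per column.
    """
    N = Y.shape[1]
    C_hat = Y @ Y.conj().T / N
    # eigh exploits the Hermitian structure and returns eigenvalues in ascending order.
    _, eigvecs = np.linalg.eigh(C_hat)
    return eigvecs[:, -J:][:, ::-1]   # dominant eigenvector first
```

In the noise-free case with at least $J$ linearly independent received symbols, the returned columns span exactly the range of the channel matrix; with noise, the quality of the estimate depends on $N$ and the SNR.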

To utilize information from the previous coherence intervals, one can adaptively update the subspace using efficient tracking algorithms as proposed in, e.g., [44, 45].

IV-D Complexity Analysis

The standalone GMM estimator proposed in [30] precomputes the filters used for the individual components, resulting in a complexity of $\mathcal{O}(KM^{2})$. For the standalone VAE, the complexity is $\mathcal{O}(DM^{2})$ [32], where $D$ denotes the number of layers in the forward pass of the VAE. For our semi-blind methods, the calculation of the subspace requires $\mathcal{O}((N+J)M^{2})$. This results from calculating the sample covariance matrix with $\mathcal{O}(NM^{2})$ and taking the eigenvectors of the $J$ largest eigenvalues for the solution of (56). Using the projection approximation subspace tracking (PAST) algorithm [45], the computational complexity of calculating the subspace reduces to $\mathcal{O}(JM)$ for every update. In the case of the subspace GMM, the $K$ LMMSE estimates cannot be precomputed, which results in a complexity of $\mathcal{O}(K(M^{2}+JM^{2}+J^{3}))$. Similarly, the subspace VAE exhibits a complexity of $\mathcal{O}(DM^{2}+JM^{2}+J^{3})$.
For the projected versions of the GMM and VAE, the complexity becomes $\mathcal{O}(KM^{2}+JM^{2})$ and $\mathcal{O}(DM^{2}+JM^{2})$, respectively. Note that the calculation for each of the $K$ components in the GMM can be parallelized. Similarly, the computations in the convolutional layers of the VAE can be parallelized, which mitigates the computational cost in practice.
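As a rough illustration, the leading-order terms quoted above can be evaluated for a concrete operating point. The following Python sketch is not part of the original analysis: it ignores all constants hidden by the $\mathcal{O}$-notation, the function name is hypothetical, and the VAE depth $D=6$ is an assumed placeholder, whereas $K=64$, $M=64$, $J=8$, and $N=200$ are taken from Section VI.

```python
def leading_order_costs(K, D, M, J, N):
    """Evaluate the leading-order terms of the complexity expressions (constants ignored)."""
    return {
        "subspace (eigendecomposition)": (N + J) * M**2,
        "subspace (PAST, per update)": J * M,
        "standalone GMM": K * M**2,
        "standalone VAE": D * M**2,
        "subspace GMM": K * (M**2 + J * M**2 + J**3),
        "subspace VAE": D * M**2 + J * M**2 + J**3,
        "projected GMM": K * M**2 + J * M**2,
        "projected VAE": D * M**2 + J * M**2,
    }


# D = 6 is a hypothetical value; the remaining values follow the simulation setup.
costs = leading_order_costs(K=64, D=6, M=64, J=8, N=200)
for name, ops in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name:32s} {ops:>12,d}")
```

Under these assumptions, the projected GMM stays close to the standalone GMM in operation count, while the subspace GMM is roughly a factor of $J$ more expensive because the $K$ filters have to be recomputed for every estimated subspace.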

V Baseline Estimators

To compare our methods, the following baseline channel estimators are considered. Based on the found subspace $\mathrm{range}(\bm{V})$, we can formulate the pilot-based ML estimator as $\hat{\bm{h}}_{\text{ML}}=\bm{V}\bm{V}^{\mathrm{H}}\bm{y}_{p}$, which is the closed-form solution to (18). This can be interpreted as the subspace-adjusted version of the conventional least squares (LS) channel estimator given as $\hat{\bm{h}}_{\text{LS}}=\bm{y}_{p}$.

Another estimator is based on the sample covariance matrix, which we can compute from the training data set $\mathcal{H}$ to infer the global statistics of the channels as

$$\bm{C}_{s}=\frac{1}{|\mathcal{H}|}\sum_{\bm{h}\in\mathcal{H}}\bm{h}\bm{h}^{\mathrm{H}}. \tag{58}$$

We can use the matrix $\bm{C}_{s}$ as a statistical prior to parameterize the semi-blind channel estimators outlined in Section III-A and Section III-B as

$$\hat{\bm{h}}_{\text{sub. s-cov}}=\bm{V}\bm{V}^{\mathrm{H}}\bm{C}_{s}\bm{V}\left(\bm{V}^{\mathrm{H}}\bm{C}_{s}\bm{V}+\sigma^{2}\mathbf{I}_{J}\right)^{-1}\bm{V}^{\mathrm{H}}\bm{y}_{p}, \tag{59}$$

and

$$\hat{\bm{h}}_{\text{proj. s-cov}}=\bm{C}_{s}\left(\bm{C}_{s}+\bm{C}_{\tilde{\bm{n}}}\right)^{-1}\bm{P}_{\bm{H}}\bm{y}_{p}. \tag{60}$$

Lastly, we compare our proposed methods to two iterative algorithms that optimize the ML formulation in (13), namely the EM algorithm from [16] and an MP variant similar to [17], both of which we run until convergence or for at most 500 iterations, whichever comes first.
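The closed-form baselines of this section, i.e., the LS and ML estimators and the two sample covariance variants in (58)–(60), can be sketched in a few lines of NumPy. This is a minimal illustration and not the implementation used for the simulations: the function names are hypothetical, the projector is assumed to be $\bm{P}_{\bm{H}}=\bm{V}\bm{V}^{\mathrm{H}}$, the noise covariance $\bm{C}_{\tilde{\bm{n}}}$ is replaced by the scaled identity $\sigma^{2}\frac{J}{M}\mathbf{I}_{M}$, and the toy data are i.i.d. Gaussian rather than drawn from the channel models of Section II.

```python
import numpy as np


def sample_covariance(H_train):
    """Eq. (58): average of h h^H over the rows of H_train (shape T x M)."""
    return H_train.T @ H_train.conj() / H_train.shape[0]


def ml_subspace(V, y_p):
    """Pilot-based ML estimate: orthogonal projection of y_p onto range(V)."""
    return V @ (V.conj().T @ y_p)


def subspace_scov(V, C_s, y_p, sigma2):
    """Eq. (59): LMMSE solved inside the J-dimensional subspace."""
    J = V.shape[1]
    C_sub = V.conj().T @ C_s @ V
    z = np.linalg.solve(C_sub + sigma2 * np.eye(J), V.conj().T @ y_p)
    return V @ (C_sub @ z)


def projected_scov(V, C_s, y_p, sigma2):
    """Eq. (60), assuming P_H = V V^H and C_n = sigma2 * J / M * I."""
    M, J = V.shape
    C_noise = sigma2 * J / M * np.eye(M)
    return C_s @ np.linalg.solve(C_s + C_noise, V @ (V.conj().T @ y_p))


# Toy usage with synthetic data (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
M, J, T, sigma2 = 16, 4, 5000, 0.1
H_train = (rng.standard_normal((T, M)) + 1j * rng.standard_normal((T, M))) / np.sqrt(2)
C_s = sample_covariance(H_train)
V, _ = np.linalg.qr(rng.standard_normal((M, J)) + 1j * rng.standard_normal((M, J)))
h = V @ (rng.standard_normal(J) + 1j * rng.standard_normal(J))  # channel inside range(V)
y_p = h + np.sqrt(sigma2 / 2) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))

estimates = {
    "LS": y_p,
    "ML": ml_subspace(V, y_p),
    "sub. s-cov": subspace_scov(V, C_s, y_p, sigma2),
    "proj. s-cov": projected_scov(V, C_s, y_p, sigma2),
}
for name, h_hat in estimates.items():
    print(f"{name:12s} squared error per antenna: {np.linalg.norm(h - h_hat) ** 2 / M:.4f}")
```

Both LMMSE variants use `np.linalg.solve` instead of an explicit inverse, which is the usual numerically safer choice. The subspace variant only requires a $J\times J$ solve, whereas the projected variant solves an $M\times M$ system.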

VI Numerical Simulations

Figure 3: NMSE over the SNR for channel estimation based on $N=200$ observations including one pilot per user in a $J=8$ user scenario using (a) the spatial channel model (Sec. II-A) and (b) measurement data (Sec. II-B).

To evaluate our proposed methods, we use channel realizations that are normalized such that $\mathbb{E}\left[\|\bm{h}\|^{2}\right]=M$. Thus, we can define the SNR as $\mathrm{SNR}=\frac{1}{\sigma^{2}}$. Further, the normalized MSE (NMSE), defined as

$$\text{NMSE}=\frac{1}{ML}\sum_{\ell=1}^{L}\|\bm{h}_{\ell}-\hat{\bm{h}}_{\ell}\|^{2}, \tag{61}$$

is used to characterize the performance of the estimators based on $L=10^{3}$ unseen channel samples stemming from the channel models detailed in Sections II-A and II-B. The assumption of spatially uncorrelated channels, as used in (25), only holds for the spatial channel model of Section II-A. For the case of the measurement campaign described in Section II-B, we approximate the noise covariance matrix (24) as

$$\bm{C}_{\tilde{\bm{n}}}\approx\sigma^{2}\frac{J}{M}\mathbf{I}_{M}. \tag{62}$$

We use $\mathcal{H}=\{\bm{h}_{t}\}_{t=1}^{T}$ with $T=1.5\cdot 10^{5}$ training samples from the respective channel model to train the GMM and VAE, where we set the number of components to $K=64$ and the latent dimension to $Z=32$, respectively. Further, in the case of the VAE, we allow non-zero values for $\bm{\mu}_{\bm{\theta}}(\bm{z})$, as we use the circulant approximation for the spatial channel model and the block-Toeplitz property is not perfectly fulfilled for the measurement data due to hardware imperfections. The “s-cov” variants (“sub. s-cov” and “proj. s-cov”) utilize the same training samples. The number of BS antennas is set to $M=64$, cf. Sections II-A and II-B, serving $J=8=M/8$ users, a representative operating point [37, Chap. 1.3.3]. Further, the number of snapshots is set to $N=200$, if not stated otherwise, corresponding to a scenario that allows high channel dispersion and high mobility, e.g., up to 135 km/h, cf. [37, Chap. 2.1]. The symbols sent during data transmission generally stem from a discrete constellation, e.g., QPSK or 16-QAM.
For this work, we utilize Gaussian symbols with $x_{j}(n)\sim\mathcal{N}_{\mathbb{C}}(0,P_{j}=1/J)$ such that $\sum_{j=1}^{J}P_{j}=1$. Using a continuous symbol constellation has a negligible effect on the results of the simulations, as also previously observed in [16].
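The interplay of Gaussian symbols, subspace estimation, and the NMSE in (61) can be illustrated with a small Monte Carlo sketch. It rests on several simplifying assumptions that are not taken from the paper: channels are i.i.d. Rayleigh instead of following the models of Section II, all $N$ snapshots are treated as payload for the subspace estimate, the pilot observation of the first user is modeled as $\bm{y}_{p}=\bm{h}+\bm{n}$, and all helper names are hypothetical. Only the LS and subspace ML estimators are compared.

```python
import numpy as np


def crandn(rng, *shape, var=1.0):
    """Circularly symmetric complex Gaussian samples with variance `var`."""
    return np.sqrt(var / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def estimate_subspace(Y, J):
    """J dominant eigenvectors of the sample covariance of the received symbols."""
    C_y = Y @ Y.conj().T / Y.shape[1]
    _, eigvecs = np.linalg.eigh(C_y)  # eigenvalues are returned in ascending order
    return eigvecs[:, -J:]


def nmse(H_true, H_est):
    """Eq. (61) for channels stored as rows of arrays with shape (L, M)."""
    L, M = H_true.shape
    return np.sum(np.abs(H_true - H_est) ** 2) / (M * L)


rng = np.random.default_rng(1)
M, J, N, sigma2, trials = 64, 8, 200, 1.0, 50  # SNR = 1 / sigma2, i.e., 0 dB
h_true, h_ls, h_ml = [], [], []
for _ in range(trials):
    H = crandn(rng, M, J)                       # toy i.i.d. Rayleigh channels (assumption)
    X = crandn(rng, J, N, var=1.0 / J)          # Gaussian symbols with P_j = 1/J
    Y = H @ X + crandn(rng, M, N, var=sigma2)   # received symbols
    V_hat = estimate_subspace(Y, J)
    y_p = H[:, 0] + crandn(rng, M, var=sigma2)  # assumed pilot observation of user 1
    h_true.append(H[:, 0])
    h_ls.append(y_p)
    h_ml.append(V_hat @ (V_hat.conj().T @ y_p))

h_true, h_ls, h_ml = map(np.array, (h_true, h_ls, h_ml))
print("NMSE LS:           ", nmse(h_true, h_ls))
print("NMSE ML (subspace):", nmse(h_true, h_ml))
```

In this toy setting, projecting the pilot observation onto the estimated subspace removes most of the noise outside the $J$-dimensional signal subspace, which is the gain that the semi-blind estimators build on.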

Fig. 3a and Fig. 3b show the performance of the different channel estimation methods with respect to the SNR for the spatial channel model (cf. Section II-A) and measurement data (cf. Section II-B), respectively. One can see that the semi-blind methods utilizing the CGLMs perform best across the whole SNR range. The projected variant slightly outperforms its subspace counterpart for most SNR values, which is in line with the derivations in Section III. Interestingly, the order of the semi-blind GMM and semi-blind VAE depends on the utilized channel model. In Fig. 3a, the projected GMM and VAE both show the best overall result, whereas in Fig. 3b, the projected GMM outperforms all other estimators. Additionally, we see that in Fig. 3a, the subspace VAE outperforms the subspace GMM, whereas in Fig. 3b, the opposite holds. This ordering follows the ordering of the standalone versions, where the plain GMM is better than the plain VAE in the case of the measurement data and worse for the spatial channel model. For high SNR values, all semi-blind variants approach each other except the EM and MP methods. In Fig. 3a, the EM and MP methods improve drastically from 15 dB to 20 dB, showing similar performance at 20 dB as the other semi-blind methods, whereas in Fig. 3b, they show inferior results also for high SNR. The CGLM-based approaches keep a slight advantage even for high SNR values, which can be attributed to the fact that prior information is beneficial even for high SNR. A notable observation is that, in the mid-SNR range, the semi-blind CGLM variants outperform all related estimators by roughly 3 dB.

Figure 4: NMSE over the number of observations for channel estimation based on $N$ observations including one pilot per user at SNR $=0$ dB in a $J=8$ user scenario using (a) the spatial channel model (Sec. II-A) and (b) measurement data (Sec. II-B).

For our proposed strategies, the accuracy of the estimated subspace $\mathrm{range}(\hat{\bm{V}})$ influences the performance and, hence, the NMSE depends on the number of snapshots $N$, as shown in Fig. 4. We see that for an increasing number of snapshots, the NMSE of our proposed methods decreases. In the case of the spatial channel model (Fig. 4a), the projected GMM and VAE perform best for low numbers of snapshots, although the standalone VAE surpasses all other methods for $N=20$. Additionally, we observe in Fig. 4a that for high $N$, the subspace VAE becomes the best of all considered methods. In Fig. 4b, we observe again that for the measurement data, the semi-blind GMM variants perform best, where the subspace GMM outperforms all other methods for high $N$ and the projected GMM does so for low $N$. Again, for fewer than 30 snapshots, the semi-blind methods are outperformed by the standalone GMM and VAE due to inaccuracies in estimating the subspace with a low number of payload data symbols. For both utilized channel models, the subspace variant of the superior CGLM surpasses the projected variant for high numbers of snapshots, converging to a lower error level. Thus, in practice, where uncorrelated Rayleigh fading generally does not hold, there are cases where the subspace CGLM outperforms its projected counterpart.

A critical decrease in performance can be observed for the EM and MP algorithms. Here, the NMSE increases after a certain point when increasing the number of snapshots. Even though the minimum appears at different $N$, the overall behavior of both algorithms is similar. This is because both methods optimize the joint ML formulation in (13), where the optimization of the second term becomes dominant for a high number of snapshots. Hence, the impact of the pilot observations, which are relevant for estimating the phase of the channel, vanishes.

Figure 5: NMSE over the number of users for channel estimation based on $N=200$ observations including one pilot per user at SNR $=0$ dB using (a) the spatial channel model (Sec. II-A) and (b) measurement data (Sec. II-B).

The dimension of the subspace $\mathrm{range}(\hat{\bm{V}})$ directly influences the estimation quality of the proposed methods, as shown in Fig. 5a and Fig. 5b. For example, in the extreme case where the number of users in the system is equal to the number of BS antennas ($J=M$), the solution to (56) becomes $\hat{\bm{V}}\hat{\bm{V}}^{\mathrm{H}}=\mathbf{I}$ and, hence, as the number of users in the system increases, all semi-blind estimators approach their respective purely pilot-based versions. We restrict our simulations to the interval $J\leq M/4=16$, which is said to be the preferred operating regime in massive MIMO [37, Chap. 1.3.3], and set the number of snapshots to $N=200$. In the case of a single user, all semi-blind variants exhibit similar performance, except for the subspace sample covariance estimator, the subspace VAE, and, in the case of the spatial channel model (Fig. 5a), the subspace GMM. For all other considered numbers of users, the proposed projected CGLM methods outperform all other channel estimators. Additionally, for the spatial channel model in Fig. 5a, the subspace GMM also shows inferior results to the other CGLM-based methods for all numbers of users.

Overall, we can conclude that the proposed semi-blind CGLMs show superior channel estimation performance across all considered setups. Depending on the channel model used, either the semi-blind GMMs or the semi-blind VAEs result in a slightly better NMSE, where only in the case of the spatial channel model does the subspace GMM show slightly worse performance than the other proposed methods. Moreover, the projected CGLMs outperform their respective subspace counterparts for most simulated operating points, showing the superiority of the proposed projection method.

VII Conclusion

This work presented a novel semi-blind channel estimation technique based on the class of CGLMs. To this end, we discussed two methods that incorporate subspace knowledge about the channel into the well-known LMMSE estimator. Both methods exploit the estimated subspace derived from the dominant eigenvectors of sample covariance matrices constructed using the received symbols. A theoretical analysis of the methods showed the superior estimation quality of the proposed projection-based estimator for uncorrelated Rayleigh fading channels. Furthermore, we showed how two examples from the class of CGLMs, i.e., the GMM and VAE, can be used to parameterize these estimators. Extensive simulations based on real-world measurement and spatial channel model data demonstrated the superior estimation performance of the proposed methods compared to standard semi-blind channel estimators.

Appendix A MSE of Projected LMMSE

For any linear estimator $\hat{\bm{h}}=\bm{W}\bm{y}_{p}$, the MSE is given as

\begin{align}
\mathrm{MSE}&=\mathbb{E}\left[\|\bm{h}-\hat{\bm{h}}\|^{2}\right]=\mathbb{E}\left[\mathrm{tr}\left(\left(\bm{h}-\hat{\bm{h}}\right)\left(\bm{h}^{\mathrm{H}}-\hat{\bm{h}}^{\mathrm{H}}\right)\right)\right] \tag{63}\\
&=\mathbb{E}\left[\mathrm{tr}(\bm{h}\bm{h}^{\mathrm{H}})-2\,\mathrm{tr}(\bm{h}\bm{y}^{\mathrm{H}}\bm{W}^{\mathrm{H}})+\mathrm{tr}(\bm{W}\bm{y}\bm{y}^{\mathrm{H}}\bm{W}^{\mathrm{H}})\right]. \tag{64}
\end{align}

For the case of the projected LMMSE, the second term in (64) can be rewritten as

\begin{align}
&\mathbb{E}\left[\mathrm{tr}\left(\bm{h}\tilde{\bm{y}}^{\mathrm{H}}\bm{W}^{\mathrm{H}}\right)\right] \tag{65}\\
&\quad\quad=\mathbb{E}\left[\mathrm{tr}\left(\bm{h}\bm{h}^{\mathrm{H}}\bm{W}^{\mathrm{H}}\right)\right] \tag{66}\\
&\quad\quad=\mathbb{E}\left[\mathrm{tr}\left(\bm{h}\bm{h}^{\mathrm{H}}\left(\bm{C}+\sigma^{2}\frac{J}{M}\mathbf{I}_{M}\right)^{-1}\bm{C}\right)\right] \tag{67}\\
&\quad\quad=\mathrm{tr}\left(\bm{C}\left(\bm{C}+\sigma^{2}\frac{J}{M}\mathbf{I}_{M}\right)^{-1}\bm{C}\right). \tag{68}
\end{align}

Similarly, for the third term in (64), we have

\begin{align}
&\mathbb{E}\left[\mathrm{tr}\left(\bm{W}\tilde{\bm{y}}\tilde{\bm{y}}^{\mathrm{H}}\bm{W}^{\mathrm{H}}\right)\right] \tag{69}\\
&\quad\quad=\mathbb{E}\left[\mathrm{tr}\left(\bm{W}\left(\bm{h}\bm{h}^{\mathrm{H}}+\tilde{\bm{n}}\tilde{\bm{n}}^{\mathrm{H}}\right)\bm{W}^{\mathrm{H}}\right)\right] \tag{70}\\
&\quad\quad=\mathbb{E}\bigg[\mathrm{tr}\bigg(\bm{C}\left(\bm{C}+\sigma^{2}\frac{J}{M}\mathbf{I}_{M}\right)^{-1}\left(\bm{h}\bm{h}^{\mathrm{H}}+\tilde{\bm{n}}\tilde{\bm{n}}^{\mathrm{H}}\right) \notag\\
&\quad\quad\quad\times\left(\bm{C}+\sigma^{2}\frac{J}{M}\mathbf{I}_{M}\right)^{-1}\bm{C}\bigg)\bigg] \tag{71}\\
&\quad\quad=\mathrm{tr}\left(\bm{C}\left(\bm{C}+\sigma^{2}\frac{J}{M}\mathbf{I}_{M}\right)^{-1}\bm{C}\right), \tag{72}
\end{align}

where we assume that $\mathbb{E}\left[\tilde{\bm{n}}\tilde{\bm{n}}^{\mathrm{H}}\right]=\sigma^{2}\frac{J}{M}\mathbf{I}_{M}$. From this, the overall MSE in (28) follows directly.
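To make the last step explicit, the three terms of (64) can be collected as follows. This is a sketch under the assumption of zero-mean channels with covariance $\bm{C}$, so that $\mathbb{E}[\mathrm{tr}(\bm{h}\bm{h}^{\mathrm{H}})]=\mathrm{tr}(\bm{C})$. The label $\mathrm{MSE}^{\mathrm{proj}}$ and the abbreviation $\bm{A}$ are shorthand introduced only for this illustration, and (28) itself may be written in a different but equivalent form.

```latex
\begin{align}
\mathrm{MSE}^{\mathrm{proj}}
&= \mathrm{tr}(\bm{C}) - 2\,\mathrm{tr}\!\left(\bm{C}\bm{A}^{-1}\bm{C}\right)
   + \mathrm{tr}\!\left(\bm{C}\bm{A}^{-1}\bm{C}\right),
\qquad \bm{A} = \bm{C} + \sigma^{2}\tfrac{J}{M}\mathbf{I}_{M}, \\
&= \mathrm{tr}\!\left(\bm{C} - \bm{C}\bm{A}^{-1}\bm{C}\right), \\
\left.\mathrm{MSE}^{\mathrm{proj}}\right|_{\bm{C}=\mathbf{I}_{M}}
&= M - \frac{M}{1+\sigma^{2}J/M} = \frac{MJ\sigma^{2}}{M+J\sigma^{2}}.
\end{align}
```

The last line specializes the result to uncorrelated Rayleigh fading, which is the case considered for the subspace variant in the following appendix.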

Appendix B MSE of Subspace LMMSE for Rayleigh Fading

In the case of uncorrelated Rayleigh fading, the subspace LMMSE filter is given as

$$\bm{W}_{\mathrm{sub}}=\frac{1}{1+\sigma^{2}}\bm{V}\bm{V}^{\mathrm{H}}. \tag{73}$$

Using this filter, the second term in (64) can be rewritten as

\begin{align}
&\mathbb{E}\left[\mathrm{tr}\left(\bm{h}\bm{y}^{\mathrm{H}}\bm{W}_{\mathrm{sub}}^{\mathrm{H}}\right)\right] \tag{74}\\
&\quad\quad=\mathbb{E}_{\bm{h}}\left[\mathbb{E}\left[\mathrm{tr}\left(\bm{h}\bm{h}^{\mathrm{H}}\bm{W}_{\mathrm{sub}}^{\mathrm{H}}\right)\mid\bm{h}\right]\right] \tag{75}\\
&\quad\quad=\frac{1}{1+\sigma^{2}}\mathbb{E}_{\bm{h}}\left[\mathbb{E}\left[\mathrm{tr}\left(\bm{h}\bm{h}^{\mathrm{H}}\bm{V}\bm{V}^{\mathrm{H}}\right)\mid\bm{h}\right]\right] \tag{76}\\
&\quad\quad=\frac{1}{1+\sigma^{2}}\mathbb{E}_{\bm{h}}\left[\mathrm{tr}\left(\bm{h}\bm{h}^{\mathrm{H}}\right)\right] \tag{77}\\
&\quad\quad=\frac{1}{1+\sigma^{2}}M. \tag{78}
\end{align}

Similarly, for the third term in (64), we have

\begin{align}
&\mathbb{E}\left[\mathrm{tr}\left(\bm{W}_{\mathrm{sub}}\bm{y}\bm{y}^{\mathrm{H}}\bm{W}_{\mathrm{sub}}^{\mathrm{H}}\right)\right] \tag{79}\\
&\quad\quad=\mathbb{E}_{\bm{h}}\left[\mathbb{E}\left[\mathrm{tr}\left(\bm{W}_{\mathrm{sub}}\bm{h}\bm{h}^{\mathrm{H}}\bm{W}_{\mathrm{sub}}^{\mathrm{H}}\right)\mid\bm{h}\right]\right] \notag\\
&\quad\quad\quad+\mathbb{E}_{\bm{h}}\left[\mathbb{E}\left[\mathrm{tr}\left(\bm{W}_{\mathrm{sub}}\bm{n}\bm{n}^{\mathrm{H}}\bm{W}_{\mathrm{sub}}^{\mathrm{H}}\right)\mid\bm{h}\right]\right] \tag{80}\\
&\quad\quad=\frac{1}{(1+\sigma^{2})^{2}}\Big[\mathbb{E}_{\bm{h}}\left[\mathbb{E}\left[\mathrm{tr}\left(\bm{V}\bm{V}^{\mathrm{H}}\bm{h}\bm{h}^{\mathrm{H}}\bm{V}\bm{V}^{\mathrm{H}}\right)\mid\bm{h}\right]\right] \notag\\
&\quad\quad\quad+\mathbb{E}_{\bm{h}}\left[\mathbb{E}\left[\mathrm{tr}\left(\bm{V}\bm{V}^{\mathrm{H}}\bm{n}\bm{n}^{\mathrm{H}}\bm{V}\bm{V}^{\mathrm{H}}\right)\mid\bm{h}\right]\right]\Big] \tag{81}\\
&\quad\quad=\frac{1}{(1+\sigma^{2})^{2}}\left[\mathbb{E}_{\bm{h}}\left[\mathrm{tr}\left(\bm{h}\bm{h}^{\mathrm{H}}\right)\right]+\mathbb{E}_{\bm{h}}\left[\sigma^{2}\mathrm{tr}\left(\bm{V}\bm{V}^{\mathrm{H}}\right)\right]\right] \tag{82}\\
&\quad\quad=\frac{1}{(1+\sigma^{2})^{2}}(M+J\sigma^{2}). \tag{83}
\end{align}

The overall MSE of the subspace variant for 𝑪=𝐈M𝑪subscript𝐈𝑀{\bm{C}}=\mathbf{I}_{M}bold_italic_C = bold_I start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT is then

\begin{align}
\mathrm{MSE}_{\mathrm{iid}}^{\mathrm{sub}} &= M - 2\,\frac{1}{1+\sigma^{2}}\,M + \frac{1}{(1+\sigma^{2})^{2}}\left(M + J\sigma^{2}\right) \tag{84} \\
&= \frac{\sigma^{2}\left(M\sigma^{2} + J\right)}{(1+\sigma^{2})^{2}}. \tag{85}
\end{align}
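The step from (84) to (85) follows by placing all terms over the common denominator $(1+\sigma^{2})^{2}$. As a quick sanity check, the following minimal sketch evaluates both expressions with exact rational arithmetic. The function names and the values of $M$, $J$, and $\sigma^{2}$ are illustrative assumptions and are not taken from the simulations in this paper.

```python
from fractions import Fraction


def mse_sub_expanded(M, J, sigma2):
    """Expanded form as in (84): M - 2M/(1+s) + (M + J*s)/(1+s)^2."""
    s = Fraction(sigma2)
    return M - 2 * M / (1 + s) + (M + J * s) / (1 + s) ** 2


def mse_sub_closed(M, J, sigma2):
    """Closed form as in (85): s*(M*s + J)/(1+s)^2."""
    s = Fraction(sigma2)
    return s * (M * s + J) / (1 + s) ** 2


# Illustrative (hypothetical) settings: M antennas, subspace dimension J.
M = 64
for J in (1, 4, 16, 64):
    for sigma2 in (Fraction(1, 100), Fraction(1, 10), Fraction(1), Fraction(10)):
        # Exact equality holds because Fraction avoids rounding errors.
        assert mse_sub_expanded(M, J, sigma2) == mse_sub_closed(M, J, sigma2)
```

For $J = M$, the closed form reduces to $M\sigma^{2}/(1+\sigma^{2})$, which is the familiar LMMSE error for i.i.d. unit-variance channel entries observed in additive noise of variance $\sigma^{2}$.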

References

  • [1] F. Weißer, N. Turan, D. Semmler, and W. Utschick, “Data-aided channel estimation utilizing Gaussian mixture models,” in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 8886–8890.
  • [2] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40–60, 2013.
  • [3] Y. Kabalci, 5G Mobile Communication Systems: Fundamentals, Challenges, and Key Technologies.   Singapore: Springer, 2019, pp. 329–359. [Online]. Available: https://doi.org/10.1007/978-981-13-1768-2_10
  • [4] H. Ye, G. Y. Li, and B.-H. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Communications Letters, vol. 7, no. 1, pp. 114–117, 2018.
  • [5] Z. Shang, T. Zhang, G. Hu, Y. Cai, and W. Yang, “Secure transmission for NOMA-based cognitive radio networks with imperfect CSI,” IEEE Communications Letters, vol. 25, no. 8, pp. 2517–2521, 2021.
  • [6] H. Harkat, P. Monteiro, A. Gameiro, F. Guiomar, and H. Farhana Thariq Ahmed, “A survey on MIMO-OFDM systems: Review of recent trends,” Signals, vol. 3, no. 2, pp. 359–395, 2022.
  • [7] S. C and J. Sandeep, “A review of channel estimation mechanisms in wireless communication networks,” in 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), 2021, pp. 603–608.
  • [8] T. L. Marzetta, “How much training is required for multiuser MIMO?” in 2006 Fortieth Asilomar Conference on Signals, Systems and Computers, 2006, pp. 359–363.
  • [9] E. De Carvalho and D. Slock, “Cramer-Rao bounds for semi-blind, blind and training sequence based channel estimation,” in First IEEE Signal Processing Workshop on Signal Processing Advances in Wireless Communications, 1997, pp. 129–132.
  • [10] ——, “Asymptotic performance of ML methods for semi-blind channel estimation,” in Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136), vol. 2, 1997, pp. 1624–1628.
  • [11] A. Medles and D. Slock, “Augmenting the training sequence part in semiblind estimation for MIMO channels,” in The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, vol. 2, 2003, pp. 1825–1829.
  • [12] J. Ma and L. Ping, “Data-aided channel estimation in large antenna systems,” IEEE Transactions on Signal Processing, vol. 62, no. 12, pp. 3111–3124, 2014.
  • [13] M. Joham, W. Utschick, J. A. Nossek, and M. D. Zoltowski, “Semi-blind channel estimation: a new least-squares approach,” in International Conference on Telecommunications, Cheju Island, Korea, 1999, pp. 416–420.
  • [14] S. Park, B. Shim, and J. W. Choi, “Iterative channel estimation using virtual pilot signals for MIMO-OFDM systems,” IEEE Transactions on Signal Processing, vol. 63, no. 12, pp. 3032–3045, 2015.
  • [15] D. Neumann, M. Joham, and W. Utschick, “Channel estimation in massive MIMO systems,” 2015. [Online]. Available: https://arxiv.org/abs/1503.08691
  • [16] E. Nayebi and B. D. Rao, “Semi-blind channel estimation for multiuser massive MIMO systems,” IEEE Transactions on Signal Processing, vol. 66, no. 2, pp. 540–553, 2018.
  • [17] Y. Liu, L. Brunel, and J. J. Boutros, “Joint channel estimation and decoding using Gaussian approximation in a factor graph over multipath channel,” in 2009 IEEE 20th International Symposium on Personal, Indoor and Mobile Radio Communications, 2009, pp. 3164–3168.
  • [18] S. Wu, L. Kuang, Z. Ni, D. Huang, Q. Guo, and J. Lu, “Message-passing receiver for joint channel estimation and decoding in 3D massive MIMO-OFDM systems,” IEEE Transactions on Wireless Communications, vol. 15, no. 12, pp. 8122–8138, 2016.
  • [19] A. Mehrotra, S. Srivastava, A. K. Jagannatham, and L. Hanzo, “Data-aided CSI estimation using affine-precoded superimposed pilots in orthogonal time frequency space modulated MIMO systems,” IEEE Transactions on Communications, vol. 71, no. 8, pp. 4482–4498, 2023.
  • [20] A. Osinsky, A. Ivanov, D. Lakontsev, R. Bychkov, and D. Yarotsky, “Data-aided LS channel estimation in massive MIMO turbo-receiver,” in 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), 2020, pp. 1–5.
  • [21] M. Liu, M. Crussiere, and J.-F. Helard, “A novel data-aided channel estimation with reduced complexity for TDS-OFDM systems,” IEEE Transactions on Broadcasting, vol. 58, no. 2, pp. 247–260, 2012.
  • [22] I. Khan, M. Cheffena, and M. M. Hasan, “Data aided channel estimation for MIMO-OFDM wireless systems using reliable carriers,” IEEE Access, vol. 11, pp. 47 836–47 847, 2023.
  • [23] T.-K. Kim, Y.-S. Jeon, J. Li, N. Tavangaran, and H. V. Poor, “Semi-data-aided channel estimation for MIMO systems via reinforcement learning,” IEEE Transactions on Wireless Communications, vol. 22, no. 7, pp. 4565–4579, 2023.
  • [24] I. Khan, M. M. Hasan, and M. Cheffena, “A novel low-complexity peak-power-assisted data-aided channel estimation scheme for MIMO-OFDM wireless systems,” 2024. [Online]. Available: https://arxiv.org/abs/2410.05722
  • [25] N. Zilberstein, A. Swami, and S. Segarra, “Joint channel estimation and data detection in massive MIMO systems based on diffusion models,” in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 13 291–13 295.
  • [26] Y. Deng and T. Ohtsuki, “Low-complexity subspace MMSE channel estimation in massive MU-MIMO system,” IEEE Access, vol. 8, pp. 124 371–124 381, 2020.
  • [27] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory.   Englewood Cliffs, NJ: Prentice-Hall, Inc., 1993.
  • [28] J. Yang, X. Liao, X. Yuan, P. Llull, D. J. Brady, G. Sapiro, and L. Carin, “Compressive sensing by learning a Gaussian mixture model from measurements,” IEEE Transactions on Image Processing, vol. 24, no. 1, pp. 106–119, 2015.
  • [29] D. Neumann, T. Wiese, and W. Utschick, “Learning the MMSE channel estimator,” IEEE Transactions on Signal Processing, vol. 66, no. 11, pp. 2905–2917, 2018.
  • [30] M. Koller, B. Fesl, N. Turan, and W. Utschick, “An asymptotically MSE-optimal estimator based on Gaussian mixture models,” IEEE Transactions on Signal Processing, vol. 70, pp. 4109–4123, 2022.
  • [31] B. Fesl, N. Turan, and W. Utschick, “Low-rank structured MMSE channel estimation with mixtures of factor analyzers,” in 2023 57th Asilomar Conference on Signals, Systems, and Computers, 2023, pp. 375–380.
  • [32] M. Baur, B. Fesl, and W. Utschick, “Leveraging variational autoencoders for parameterized MMSE estimation,” IEEE Transactions on Signal Processing, vol. 72, pp. 3731–3744, 2024.
  • [33] N. Turan, B. Fesl, M. Koller, M. Joham, and W. Utschick, “A versatile low-complexity feedback scheme for FDD systems via generative modeling,” IEEE Transactions on Wireless Communications, vol. 23, no. 6, pp. 6251–6265, 2024.
  • [34] F. Weißer, D. Semmler, N. Turan, and W. Utschick, “Data-aided MU-MIMO channel estimation utilizing Gaussian mixture models,” in ICC 2024 - IEEE International Conference on Communications, 2024, pp. 6684–6689.
  • [35] 3GPP, “Spatial channel model for multiple input multiple output (MIMO) simulations,” 3rd Generation Partnership Project (3GPP), Tech. Rep. 25.996 v16.0.0, 2020.
  • [36] N. Turan, B. Fesl, M. Grundei, M. Koller, and W. Utschick, “Evaluation of a Gaussian mixture model-based channel estimator using measurement data,” in 2022 International Symposium on Wireless Communication Systems (ISWCS), 2022, pp. 1–6.
  • [37] E. Björnson, J. Hoydis, and L. Sanguinetti, “Massive MIMO networks: Spectral, energy, and hardware efficiency,” Foundations and Trends® in Signal Processing, vol. 11, no. 3-4, pp. 154–655, 2017.
  • [38] V. Milman and G. Schechtman, Asymptotic Theory Of Finite Dimensional Normed Spaces, ser. Lecture Notes in Mathematics.   Springer, 1986, vol. 1200.
  • [39] B. Böck, M. Baur, N. Turan, D. Semmler, and W. Utschick, “A statistical characterization of wireless channels conditioned on side information,” IEEE Wireless Communications Letters, pp. 1–1, 2024.
  • [40] T. T. Nguyen, H. D. Nguyen, F. Chamroukhi, and G. J. McLachlan, “Approximation by finite mixtures of continuous density functions that vanish at infinity,” Cogent Mathematics & Statistics, vol. 7, no. 1, p. 1750861, 2020.
  • [41] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics).   Berlin, Heidelberg: Springer-Verlag, 2006.
  • [42] D. P. Kingma and M. Welling, “An introduction to variational autoencoders,” Foundations and Trends® in Machine Learning, vol. 12, no. 4, pp. 307–392, 2019.
  • [43] M. Baur, B. Böck, N. Turan, and W. Utschick, “Variational autoencoder for channel estimation: Real-world measurement insights,” in 2024 27th International Workshop on Smart Antennas (WSA), 2024, pp. 117–122.
  • [44] W. Utschick, “Tracking of signal subspace projectors,” IEEE Transactions on Signal Processing, vol. 50, no. 4, pp. 769–778, 2002.
  • [45] B. Yang, “Projection approximation subspace tracking,” IEEE Transactions on Signal Processing, vol. 43, no. 1, pp. 95–107, 1995.