Rectangular Patch Antennas
Abstract
We propose a two-stage deep learning framework for the inverse design of rectangular patch antennas.
Our approach leverages generative modeling to learn a latent representation of antenna frequency
response curves and conditions a subsequent generative model on these responses to produce feasible
antenna geometries. We further demonstrate that leveraging search and optimization techniques
at test-time improves the accuracy of the generated designs and enables consideration of auxiliary
objectives such as manufacturability. Our approach generalizes naturally to different design criteria,
and can be easily adapted to more complex geometric design spaces.
1 Introduction
In our increasingly wireless world, antennas serve as the fundamental link between radio frequency (RF) waves and
electronic devices, and enable essential functions such as communication, navigation, and sensing across diverse systems,
including GPS, WiFi, Bluetooth, and cellular networks. Among these, patch antennas—metallic patches printed onto
dielectric substrates—are particularly attractive due to their low cost, low profile, and ease of fabrication. Rectangular
patch antennas, in particular, are widely utilized in mobile devices and other space-constrained applications [1].
For a patch antenna to be implemented for a specific use case, its behavior needs to be tuned to maximize efficiency.
Specifically, the antenna should efficiently receive and transmit power at certain frequency ranges, while minimizing
unwanted radiation or reception at other ranges. These frequency ranges, each defined by a center frequency and a
bandwidth, characterize the frequency response of the antenna. Tuning the frequency response of the antenna has the
effect of filtering out unwanted signals during reception, and maximizing power converted into radiation at the desired
frequency during transmission.
However, designing patch antennas for optimal electromagnetic (EM) performance remains a time-consuming and
iterative process. Engineers typically rely on combinations of analytic approximations and full-wave EM simulations
—often based on numerical methods such as the Finite Element Method (FEM) or the Finite-Difference Time-Domain
method (FDTD)— to match desired frequency responses [11, 23]. While analytic models or lumped circuit approximations can guide preliminary geometric parameter selection, they are limited to simple geometries and often yield
only coarse solutions [1]. Additionally, adjusting one geometric parameter, such as the patch width, can simultaneously
affect multiple aspects of the antenna’s frequency response, necessitating careful, often tedious manual tuning.
Deep learning design methods present a promising alternative by learning complex, nonlinear relationships directly
from data and enabling a more automated and scalable inverse design process. Inverse design, in this context, involves
starting from target EM behavior and determining the antenna geometry that produces it. Past breakthroughs in inverse
design for other scientific domains, such as protein folding [12] and materials discovery [16], demonstrate the potential
of deep learning to achieve results beyond conventional heuristics. Much like these domains, patch antenna design
involves the navigation of a vast configuration space. Yet, we face notable challenges when applying deep learning in
this context:
A PREPRINT - May 28, 2025
1. Data Scarcity: The absence of large, standardized datasets necessitates expensive simulation campaigns to
train models effectively.
2. Non-unique solutions: The one-to-many nature of the inverse mapping —from a desired frequency response
to multiple feasible geometries— further complicates direct regression approaches [6].
Addressing these obstacles requires specialized frameworks that balance multimodality and data efficiency.
In this work, we propose a novel two-stage generative framework for the inverse design of rectangular patch antennas
that leverages variational autoencoders to model distributions over feasible electromagnetic responses and corresponding
antenna geometries. The framework’s adversarial training process enables controlled generation while its probabilistic
nature addresses the one-to-many mapping challenge inherent to inverse design. Through our experiments, we show
that simple search and optimization techniques employed at test time enhance design accuracy and practicability while
reducing sensitivity to limited training data. Finally, while demonstrated specifically for rectangular patch antennas, we
discuss how our approach naturally generalizes to arbitrary design criteria and more complex geometric design spaces.
2 Related Work
Patch Antenna Design Surrogate models serve as computationally inexpensive proxies for time-consuming EM
simulations, enabling rapid evaluation of candidate designs. Early attempts employed simple linear regression or
Support Vector Machines (SVMs) to approximate forward simulations of patch antennas [3, 4]. Although these models
significantly accelerate the design process by replacing full-wave simulations, they still require iterative fine-tuning by
human engineers, and their accuracy is limited by the complexity of the underlying EM phenomena. To fully automate
the inverse design process, antenna geometries must be generated automatically from target frequency specifications.
Sharma et al. [20] constructed a surrogate-assisted approach to enumerate a fine grid of design parameters and selected
geometries that met performance criteria. While effective, this approach suffers from poor scalability as the design space
grows in dimensionality. Reinforcement learning (RL) has been proposed as a data-driven alternative to systematic
search. Wei et al. [25] formulated slot antenna design as a Markov Decision Process and trained a deep RL agent
to iteratively refine geometric parameters until the target frequency response was achieved. The RL framework also
integrated a surrogate model to limit expensive simulations, thereby overcoming both the data availability bottleneck
and the one-to-many challenge of inverse design.
Controlled Generation Generative models often aim to produce outputs conditioned on specified target attributes.
Variational Autoencoders (VAEs) [13] are a popular class of generative models that learn a latent representation of data
by encoding inputs into a latent space and then reconstructing them via a decoder. To enable controlled generation of
samples with specified attributes, VAEs were extended to Conditional Variational Autoencoders (CVAEs) [22], where
an additional input to the decoder encodes the desired features of the output. However, the decoder can learn to ignore
the conditional input if the latent code alone suffices to reconstruct the data. A solution to this problem is adversarial
disentanglement, where an auxiliary predictor attempts to infer the condition from the latent code, while the encoder
is penalized for revealing this information. This approach, employed in [7, 15], encourages the model to encode all
task-related information about the condition in the explicit conditional pathway, thus enabling controlled generation.
An alternative approach to controlled generation is to decouple the learning of the distribution from the conditioning
altogether, training an unconditional Variational Autoencoder (VAE) on the data and then searching in its latent space to
find a sample matching desired criteria [8, 17]. This strategy avoids having to feed potentially out-of-distribution targets
directly to the decoder; however, it places a greater computational burden on the search procedure and can become
prohibitively expensive for large or high-dimensional search spaces.
In our work, we combine the benefits of both strategies through a two-stage generative pipeline. First, we train an
unconditional VAE on physically valid frequency responses, allowing us to steer the latent code toward responses that
remain in-distribution. Second, we train an adversarially disentangled CVAE to map those frequency responses to
design parameters. This hybrid approach mitigates the pitfalls of purely conditional generation on out-of-distribution
inputs, while also reducing the burden of exhaustive search in unconditional latent-space.
Test-Time Compute Test-time compute refers to the computational resources devoted to completing a task during
inference. Sufficiently powerful test-time compute schemes can significantly boost performance, even when training
data or model capacity is held constant. A seminal example of such an approach can be found in AlphaGo, which
pioneered the concept of extensive test-time computation through Monte Carlo Tree Search (MCTS) combined with
deep neural networks [21]. Recent success has been found in scaling test-time compute in large language model (LLM)
systems to boost performance on different tasks. AlphaGeometry approached gold-medalist level performance on
International Mathematical Olympiad geometry problems by extensively scaling beam search with a symbolic deduction
engine for verification [24]. Similarly, FunSearch made discoveries in open mathematics problems by combining
systematic verification with evolutionary search through millions of programs [18]. More recently, [2] demonstrated
that test-time compute through latent program search can be leveraged to outperform strong priors (pretrained LLMs)
on the Abstraction and Reasoning Corpus (ARC) [5] benchmark. Building upon approaches that have worked well in
photonics, we design our framework with test-time compute mechanisms to increase data efficiency and enable better
generalization.
3 Background
In this work, we focus on a coaxial-fed rectangular patch, where a feed pin passes through a ground plane to a metallic
patch on a dielectric substrate (cf. Figure 1). The design configuration of a single coaxial-fed rectangular patch antenna
is parametrized by (L, W, p), where L is the length of the patch in mm, W is the width of the patch in mm, and p is the
distance of the feed point from the center of the patch along the length axis.
The patch can be treated as a resonant segment of a transmission line, with L primarily controlling the resonant
frequency and W influencing impedance and bandwidth. The feed position p is tuned to achieve an impedance match
(commonly 50 Ω) with the feed at a desired resonant frequency. Our primary goal is to generate antenna geometries
(L, W, p) that satisfy specific frequency-domain performance requirements, encoded in the reflection coefficient S11 (f ).
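As a sanity check on this parametrization, the dominant TM10 resonance can be estimated from (L, W) using the standard cavity-model formulas with the Hammerstad length correction [1, 9]. The sketch below uses textbook approximations, not the paper's code; the substrate values match those given in Section 4, and the function name is our own:

```python
import math

C0 = 299_792_458.0  # speed of light (m/s)

def patch_resonant_freq(L_mm, W_mm, eps_r=3.68, h_mm=1.61):
    """Approximate TM10 resonant frequency (Hz) of a rectangular patch
    using the cavity model with the Hammerstad fringing-length correction."""
    L, W, h = L_mm * 1e-3, W_mm * 1e-3, h_mm * 1e-3
    # Effective permittivity of the patch/substrate combination
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 12 * h / W) ** -0.5
    # Fringing fields electrically lengthen the patch by dL at each edge
    dL = 0.412 * h * ((eps_eff + 0.3) * (W / h + 0.264)
                      / ((eps_eff - 0.258) * (W / h + 0.8)))
    return C0 / (2 * (L + 2 * dL) * math.sqrt(eps_eff))
```

By this estimate, a 16.5 mm × 20 mm patch on the Section 4 substrate resonates near 4.6 GHz; the feed offset p then mainly sets the input impedance rather than the resonance.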
4 Dataset
We build a dataset of 1292 design configurations and corresponding S11 frequency response curves to train our inverse
design framework.
To generate a set of (L, W, p) configurations to simulate, we begin by computing a grid with bounds L ∈ [7.5, 52.5],
W/L ∈ [0.8, 2], and p ∈ [−6, 0), while enforcing p ∈ (−L/2, 0) and sampling at higher density
at small L and at p close to the edge of the patch. We then augment this initial set of designs by using an algorithm
designed to sample additional triplets (L, W, p) inside the convex hull of the existing dataset while enforcing uniformity.
Ultimately, we end up with a set of 1292 design configurations.
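The hull-constrained augmentation can be sketched as follows. This is an illustrative stand-in rather than the paper's algorithm (which additionally enforces uniformity): random convex combinations of existing designs are, by construction, guaranteed to lie inside the convex hull.

```python
import numpy as np

def augment_in_hull(X, n_new, seed=0):
    """Sample n_new points inside the convex hull of the rows of X by
    mixing d+1 randomly chosen parent designs with convex weights."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    parents = rng.integers(0, n, size=(n_new, d + 1))  # parent indices
    w = rng.dirichlet(np.ones(d + 1), size=n_new)      # rows sum to 1
    return np.einsum('ij,ijk->ik', w, X[parents])      # weighted mixtures
```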
To simulate the designs, we use openEMS [14], an open source electromagnetic field solver based on the Finite-
Difference Time Domain (FDTD) method. We fix substrate parameters ϵr = 3.68 and thickness = 1.61mm to align
with those provided by OSH Park’s 4 layer prototype service. To calculate S11 frequency response curves, we excite the
antenna with a Gaussian pulse centered at f0 = 5.5 GHz and a bandwidth defined by a cutoff frequency fc = 4.5 GHz
to cover the frequency range of interest, f ∈ [1GHz, 10GHz].
From the port data extracted through simulation, we obtain the complex amplitudes of the incident and reflected fields
at N = 1000 regularly spaced frequencies in this range and compute the reflection coefficient (S11 ) as the ratio of the
reflected wave (uref ) to the incident wave (uinc ), converted to decibels,
|S11(fi)|dB = 20 log10 |uref(fi) / uinc(fi)| ,   i = 1, . . . , N .   (1)
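Applied to the extracted port spectra, Equation (1) amounts to one line of NumPy (function name and array layout are our own):

```python
import numpy as np

def s11_db(u_inc, u_ref):
    """|S11| in dB per Equation (1), from complex incident/reflected
    amplitudes sampled at the same N frequencies."""
    return 20 * np.log10(np.abs(u_ref / u_inc))
```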
Figure 2: Overview of our two-stage generative inverse design framework. Stage 1 learns a latent representation of S11
frequency response curves and finds in-distribution curves matching target responses. Stage 2 uses a conditional VAE
to generate antenna geometries that produce the desired EM response. Red arrows represent test time optimization.
5 Methodology
5.1 Problem Setup and Overview
We consider antennas defined by x = (L, W, p), where L and W are the patch dimensions and p is the feed position.
Each design x yields a frequency-dependent reflection coefficient S11 (f ), sampled at N = 1000 frequencies. We
represent this response as a 1000-dimensional vector y ∈ RN , with the response curve converted to dB as in
Equation (1).
We define our design criteria as a set of K desired resonant frequencies fres = {f1 , . . . , fK } with their corresponding
bandwidths BW = {BW1 , . . . , BWK } and depths d = {d1 , . . . , dK }. A lower |S11 |dB near a desired frequency band
indicates efficient power transfer and radiation, whereas higher |S11 |dB elsewhere can mitigate interference. This means
we are searching for antennas that have an idealized target response curve modeled as a product of Lorentzian notches,

S11∗(f | fres, BW, d) = ∏_{k=1}^{K} [ 1 − (1 − 10^{dk/20}) · (BWk/2)² / ( (f − fk)² + (BWk/2)² ) ] ,   (2)
where fk denotes the resonant frequency where minimal reflection is desired, BWk specifies the width of the frequency
band around fk with efficient power transfer, and dk sets the target depth (in dB) of the reflection at fk . For example,
choosing parameters (fk = 2.4 GHz, BWk = 0.2 GHz, dk = −15 dB) describes an antenna with efficient radiation
around 2.4 GHz and reflection quickly increasing outside this range.
From this response curve, we convert to idealized target dB values
y∗(fi | fres, BW, d) = 20 log10 |S11∗(fi | fres, BW, d)| ,   i = 1, . . . , N .   (3)
Of course, antennas with these exact frequency response curves do not exist in practice (at the very least, there will
be higher harmonics). Thus, we are searching for antenna designs whose frequency response curve has a depth y(fk )
of at least dk in a frequency band around fk , and whose response remains shallower than a threshold for all other
(non-resonant) frequencies. Accordingly, our search ignores y-values deeper than this threshold in frequency ranges
that are irrelevant to our target response.
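The idealized target of Equations (2) and (3) can be sketched as a hypothetical helper (frequencies in GHz, depths in dB; the names are our own):

```python
import numpy as np

def target_s11_db(freqs, f_res, bw, depth_db):
    """Idealized |S11| target in dB: a product of Lorentzian notches
    (Equation (2)) converted to decibels (Equation (3))."""
    freqs = np.asarray(freqs, dtype=float)
    s = np.ones_like(freqs)
    for fk, bwk, dk in zip(f_res, bw, depth_db):
        lorentz = (bwk / 2) ** 2 / ((freqs - fk) ** 2 + (bwk / 2) ** 2)
        s *= 1 - (1 - 10 ** (dk / 20)) * lorentz  # notch of depth dk at fk
    return 20 * np.log10(s)
```

At f = fk the k-th factor equals 10^{dk/20}, so the curve reaches dk dB; far from every fk it returns to 0 dB.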
Our inverse design goal is as follows: given an idealized target response y ∗ (fres , BW, d), find antenna design parameters
x̃ such that φEMS (x̃) ≃ y ∗ , where φEMS represents the forward EM simulation [14]. Since φEMS is expensive to evaluate
and the inverse mapping y∗ ↦ x̃ is not unique, we propose a two-stage generative pipeline (see Fig. 2):
1. Stage 1: Latent Representation of S11 -Responses. We train a VAE that encodes y ∈ RN into a latent vector
zy ∈ R64 , capturing the key variations of physically realizable frequency responses. At test time, we search
the latent space of this VAE to find an approximation ỹ of y ∗ (fres , BW, d).
2. Stage 2: Conditional Design Generation. We train a conditional VAE (CVAE) whose encoder maps x ∈ R3
to zx ∈ R16 , and a decoder that maps zx conditioned on the desired frequency response ỹ to a design x̃. To
ensure x̃ depends both on zx and ỹ (rather than just zx ), we use an adversarial predictor that discourages zx
from leaking information about ỹ. At test time, we use this CVAE to decode x̃ from ỹ and zx .
We further refine the accuracy of the generated design through (1) best-of-N sampling of zy , zx , and (2) gradient-based
optimization of each zx based on secondary constraints, e.g., physical feasibility or specific manufacturing limits,
without losing the desired EM response.
To enable search through the distribution of feasible S11 frequency response curves, we train a Variational Autoencoder
(VAE) [13] on our dataset. The VAE defines
zy ∼ qϕ (zy |y), y ∼ pθ (y|zy ) ,
where qϕ (zy |y) is the approximate posterior (encoded by convolutional layers) and pθ (y|zy ) is the likelihood (decoded
by transposed convolutional layers). We assume a Gaussian prior p(zy ) = N (0, I).
We employ a β-VAE style objective [10], with β < 1 chosen to achieve the best generative quality in our experiments,
LVAE = Eqϕ (zy |y) [log pθ (y|zy )] − β DKL (qϕ (zy |y)∥p(zy )) .
This encourages a structured latent space capturing essential variations in physically realizable S11 -responses.
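Written as a loss to minimize, with the Gaussian log-likelihood replaced by a mean-squared reconstruction error (equivalent up to constants) and the closed-form KL for a diagonal-Gaussian posterior, the objective reads as follows (a value-level sketch; the weight 0.016 is the KLD weight reported in Section 6):

```python
import numpy as np

def beta_vae_loss(y, y_recon, mu, log_var, beta=0.016):
    """Per-sample beta-VAE loss: MSE reconstruction term plus
    beta-weighted KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    recon = np.mean((y - y_recon) ** 2, axis=-1)
    kl = 0.5 * np.sum(mu ** 2 + np.exp(log_var) - 1 - log_var, axis=-1)
    return recon + beta * kl
```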
Using the idealized target response y∗ computed from (2) via (3) in the CVAE can lead to undesirable results, as it is
potentially out of the distribution of realizable y. Instead, we search the VAE latent space to find latent variables zy
that decode to ỹ ≈ y which has the same depths at frequency bands as our idealized target y ∗ , but can differ from y ∗
in regions we do not care about in our design goals (as mentioned above, antennas producing the idealized frequency
response curve do not exist). To find good latent space vectors, we perform a gradient-based search over the latent
space. We choose the starting point zy^{(0)} using one of two strategies (random or nearest neighbor), which we explore
further in Section 6.2, and iterate

zy^{(t+1)} = zy^{(t)} − α ∇zy [ ∥pθ(y|zy) − y∗∥²masked + λreg ∥zy∥² ] ,
where α is the learning rate. In the loss, we mask y in frequency regions that are irrelevant for our desired response
curves, ensuring we only penalize differences in regions of interest. We also add a regularization term λreg ∥zy ∥2 to
keep the solution near the prior. This iterative optimization yields a zy∗ whose decoded response ỹ closely approximates
y ∗ and remains on the manifold of physically realizable curves.
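The latent search reduces to plain gradient descent on zy. In the sketch below, `decode` and `grad` are hypothetical stand-ins for the VAE decoder and its vector-Jacobian product (in practice supplied by autograd); we demonstrate the loop on a toy linear decoder, with the λreg term set to zero:

```python
import numpy as np

def latent_search(decode, grad, z0, y_target, mask, lr=0.05, lam=0.0, steps=1000):
    """Gradient descent on z minimizing ||decode(z) - y*||^2_masked
    + lam * ||z||^2, mirroring the update rule in Section 5."""
    z = z0.copy()
    for _ in range(steps):
        r = mask * (decode(z) - y_target)            # masked residual
        z = z - lr * (2 * grad(z, r) + 2 * lam * z)  # gradient of both terms
    return z

# Toy linear decoder y = A z, for which grad(z, r) is simply A^T r.
A = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, -0.5],
              [0.5, -0.5, 1.0], [0.2, -0.3, 0.4]])
y_star = A @ np.array([1.0, -0.5, 2.0])
z_hat = latent_search(lambda z: A @ z, lambda z, r: A.T @ r,
                      np.zeros(3), y_star, np.ones(4))
```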
To generate antenna designs x with a frequency response y, we train a conditional VAE (CVAE):
zx ∼ qφ (zx |x), x ∼ pψ (x|zx , y) ,
where qφ and pψ are feed-forward networks mapping between design parameters and latent variables.
However, during training, zx may easily learn to encode all information about x needed for reconstruction. As noted
by [7], this can lead to a pathological scenario where the decoder simply ignores the conditional input y, since the
necessary information is already present in the latent code zx . In such cases, attempts to control generation by varying
y would have no effect.
To prevent this, we follow [7] and introduce an adversarial predictor Dω that attempts to predict y directly from zx . We
let η be a hyperparameter controlling the weight of the adversarial term. The predictor tries to minimize
Lpred = Eqφ (zx |x) [∥y − Dω (zx )∥2 ] .
The encoder tries to maximize this error (i.e., make it hard to predict y from zx ). Incorporating this into the CVAE
training, we obtain a combined objective:
LCVAE,combined = Eqφ (zx |x,y) [log pψ (x|zx , y)] − βx DKL (qφ (zx |x, y)∥p(zx )) − ηLpred ,
where Lpred is the adversarial term from the encoder’s perspective (the encoder seeks to increase Lpred , the predictor
seeks to decrease it). By this minimax interplay, the encoder removes direct correlation between zx and y.
Since x maps to a unique y via the EM solver, forcing the encoder to remove y-information from zx ensures the decoder
must rely on the explicit conditional input y to reconstruct the design. This yields a controllable model: changes in y
at test time directly influence x, and zx can be adjusted to handle auxiliary objectives without altering the frequency
response of x.
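The sign structure of this minimax game can be made concrete with a small value-level sketch (the loss inputs are placeholders; η = 0.1 and the KLD weight follow Section 6). Per the verbal description above, the predictor minimizes Lpred while the encoder/decoder loss subtracts it, rewarding the encoder when y is unpredictable from zx:

```python
import numpy as np

def adversarial_losses(recon_err, kl, y, y_pred_from_zx, beta_x=0.016, eta=0.1):
    """Stage-2 objectives: the predictor minimizes l_pred; the CVAE
    (written here as a loss to minimize) subtracts eta * l_pred, so the
    encoder is rewarded for making y unpredictable from z_x."""
    l_pred = np.mean((y - y_pred_from_zx) ** 2)
    l_cvae = recon_err + beta_x * kl - eta * l_pred
    return l_cvae, l_pred
```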
While our two-stage framework generates feasible designs directly from target specifications, additional refinement
at test time can further improve accuracy and practicability. We consider two approaches: (1) generating multiple
candidate designs and (2) optimizing single candidates to satisfy auxiliary constraints.
Generating Multiple Candidates. By sampling multiple response curves ỹ during latent search and multiple designs
from the conditional VAE for each candidate, we obtain a pool of potential solutions. We employ two scoring methods
to rank generated design candidates at test time:
Oracle Scorer: Runs a full EM simulation and computes an MSE against the target y ∗ . Although accurate, this approach
is computationally expensive.
Surrogate Scorer: Uses a neural network trained with a β-NLL loss [19] to approximate the forward simulation. At test
time, we incorporate the surrogate’s uncertainty estimates into our score, adjusting the influence of different frequency
regions based on the model’s predictive confidence.
Both scoring methods consider only frequency regions relevant to our target response. Regions outside the specified
resonant frequency bands are masked, meaning their errors do not contribute to the score.
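One plausible sketch of the Surrogate Scorer's ranking: masked squared error, with each frequency bin down-weighted by the surrogate's predictive variance. Inverse-variance weighting is our assumption here; the paper only states that confidence adjusts each region's influence.

```python
import numpy as np

def rank_candidates(y_preds, y_var, y_target, mask):
    """Score each candidate response (rows of y_preds) by masked,
    confidence-weighted squared error; returns indices best-first."""
    w = mask / (y_var + 1e-8)                       # uncertain bins count less
    scores = ((y_preds - y_target) ** 2 * w).sum(axis=-1)
    return np.argsort(scores), scores
```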
Optimizing a Single Candidate. For a single candidate curve, the latent code of the conditional design decoder can
be further optimized to meet auxiliary geometric criteria. For the rectangular patch antenna, we define a penalty
Lpenalty(L, W, p) = ReLU(−L)² + ReLU(−W)² + ReLU(−L/2 − p)² + ReLU(p)² ,
encouraging positive dimensions and a feed position between the center and edge of the patch. We then use this penalty
to conduct a gradient-based search over the latent variable zx in the same way as described in Section 5.3 for zy .
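The penalty translates directly to code; it is zero exactly when the dimensions are nonnegative and the feed satisfies −L/2 ≤ p ≤ 0:

```python
import numpy as np

def geometry_penalty(L, W, p):
    """Squared-ReLU feasibility penalty from Section 5: penalizes
    negative dimensions and feed positions outside (-L/2, 0)."""
    relu = lambda v: np.maximum(v, 0.0)
    return (relu(-L) ** 2 + relu(-W) ** 2
            + relu(-L / 2 - p) ** 2 + relu(p) ** 2)
```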
6 Experiments
Stage 1 (VAE). We use a convolutional Variational Autoencoder (VAE) to map 1000-dimensional frequency response
curves into a 64-dimensional latent space. The encoder has five convolutional layers with ReLU activations, while the
decoder has four transposed convolutional layers with GELU activations, dropout, and batch normalization. The VAE is
trained for 250 epochs (batch size=32) using Adam (lr=10−3 ), with a KLD weight of 0.016 annealed over the first 100
epochs.
Stage 2 (CVAE). The Conditional Variational Autoencoder (CVAE) maps antenna parameters to a 16-dimensional
latent space using fully-connected layers: two linear layers with LeakyReLU activations for the encoder, and five linear
layers with GELU activations for the decoder. A convolutional head (reusing Stage 1’s encoder) extracts embeddings
from frequency response curves to condition the generation. An adversarial predictor (reusing Stage 1’s decoder with an
additional fully connected layer to map the inputs) enforces conditional dependence, using a disentanglement weight of
0.1. The CVAE is trained for 300 epochs (batch size=32) using Adam (lr=10−3 ), with a KLD weight of 0.016 annealed
over the first 50 epochs.
Surrogate Scorer. The surrogate scorer reuses the decoder architecture from Stage 1, modified to have two output
channels (mean and variance), and employs heteroscedastic Gaussian likelihood (β-NLL with β = 0.5) to quantify
predictive uncertainty. It is trained for 500 epochs (batch size=64) using the Adam optimizer (lr=5 × 10−3 ).
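The β-NLL criterion of [19] scales each bin's Gaussian negative log-likelihood by the predicted variance raised to β, with that scaling factor detached from the gradient during training. A value-level sketch (β = 0.5 as used here; gradient detachment is omitted since we only compute the value):

```python
import numpy as np

def beta_nll(mu, var, y, beta=0.5):
    """beta-NLL value: Gaussian NLL per frequency bin, scaled by var**beta.
    (In training, var**beta is treated as a constant w.r.t. gradients.)"""
    nll = 0.5 * (np.log(var) + (y - mu) ** 2 / var)
    return np.mean(var ** beta * nll)
```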
(a) Performance vs. Number of Curves (b) Performance vs. Number of Designs Sampled per Curve
Figure 3: Scaling performance as the number of curves (left) and the number of designs per curve (right) is increased.
The shaded regions indicate variability across runs.
Computational Resources and Training Time. All models are implemented in PyTorch using the Metal Performance
Shaders (MPS) backend and trained on an Apple M2 chip. Training times per model are: Stage 1 VAE (15 min),
Stage 2 CVAE (5 min), and Surrogate Scorer (15 min). Experiments use model checkpoints selected based on optimal
validation performance.
We conduct two investigations to determine how increasing test-time compute affects framework performance. In both
investigations, we consider three distinct target response functions, and at each search configuration, record the lowest
Surrogate Scorer score from the pool of generated antenna designs, averaged across the three targets. Results can be
seen in Figure 3.
Multiple Candidate Curves. To determine how sampling multiple candidate frequency response curves ỹ affects
design accuracy, we consider a setup where one antenna design x̃ is sampled per candidate curve, and vary the number
of candidate curves. To obtain the different curves ỹ, we explore two initialization strategies for zy^{(0)}: random
initialization and k-closest (nearest-neighbor) initialization.
We find that as the pool of antenna designs x̃ grows with the number of candidate frequency response curves, both
initialization strategies yield better and more consistent predictions. Additionally, we find that early on, k-closest
initialization may offer more stable or slightly better performance than random initialization, but as the number of
curves increases, the difference in their average performance diminishes. This makes sense, since k-closest initialization
potentially offers a more principled starting point for optimization in the one-shot case.
Multiple Candidate Designs per Curve. To determine how the number of generated designs per candidate curve
affects design accuracy, we consider a setup where only one candidate response curve ỹ is sampled, and vary the number
of designs sampled from this curve. Additionally, we explore the case where zx is optimized according to the auxiliary
geometric criteria and compare it to the unoptimized performance.
Again, we find that the quality of the antennas improves as we sample multiple antenna designs x̃. Additionally, we find
that the quality of the optimized and unoptimized designs agrees well across search configurations, indicating that we
have been successful in decorrelating zx from the physical response of x.
[Figure 4 plot panels omitted. Recoverable panel information: targets include f0 = 2.40 GHz (BW0 = 100 MHz,
d0 = −15 dB) and f0 = 5.00 GHz (BW0 = 300 MHz, d0 = −10 dB); axes are Frequency (GHz) vs. |S11|dB; the legend
distinguishes search budgets of 1 curve × 1 design and 10 curves × 20 designs.]
Figure 4: Comparison of the idealised target S11 curve y ∗ (black dashed), the dominant-mode analytic resonance
fr,TM10 (dotted vertical), and the simulated S11 of designs x̃ generated with two test-time compute budgets. Blue =
1 curve × 1 design, red = 10 curves × 20 designs. Note: Generated geometries may exploit higher-order or coupled
modes, so exact agreement with the analytic reference is not expected.
To illustrate the accuracy of the framework, Figure 4 compares simulated S11 responses against the idealized target
curves for antenna designs generated under two search conditions: (1) a single candidate curve is generated, from which
a single design is sampled, and (2) 10 candidate curves are generated, from which 20 designs are sampled per curve (a
pool of 200), and the design with the lowest Surrogate Scorer score is chosen. In both cases, we use random latent initialization
and do not optimize individual designs according to the manufacturability constraint.
The figure shows examples at f0 = 2.4 GHz, f0 = 4.0 GHz, and f0 = 5.0 GHz with varying bandwidths and
depths. In each case, it seems that devoting more compute to search over the design space at test time yields a more
accurate result. For every generated design we plot the dominant-mode analytic resonance fr,TM10 computed with the
Hammerstad–Jensen correction [9]. While agreement with the target resonance is seen in the f0 = 2.40 GHz example,
the search occasionally favors geometries whose performance hinges on higher-order or coupled modes rather than the
textbook TM10 resonance.
In general, the framework is able to produce a design that closely matches the desired response. However, the limitations
of the framework can be observed in the f0 = 5.0 GHz curve, where the design fails to meet the depth requirement
(d = −10 dB) even after an extensive search.
7 Conclusion
In this work, we presented a two-stage generative framework for the inverse design of rectangular patch antennas
that effectively addresses key challenges in computational antenna design. Our approach combines the strengths of
generative modeling with targeted test-time optimization to yield physically realizable antenna designs that meet desired
frequency response characteristics.
Our experimental results demonstrate that test-time computation can dramatically improve design quality with minimal
additional training data or model complexity. As shown in Section 6, both best-of-N sampling and gradient-based
optimization at test time yield progressively better designs that more closely match target frequency responses. This
finding aligns with recent successes in other domains where scalable test-time computation has unlocked performance
improvements without requiring larger models or datasets.
Our approach generalizes naturally to more complex electromagnetic design tasks. While we focused on rectangular
patch antennas with three design parameters, the same framework could be extended to patch antennas with arbitrary
geometries or to multi-element antenna arrays with more complex frequency-domain behavior. Furthermore, our
test-time optimization framework offers a flexible mechanism for incorporating auxiliary design constraints, such as
fabrication limitations or size restrictions, without compromising the primary electromagnetic performance objectives.
References
[1] Constantine A Balanis. Antenna theory. Wiley-Blackwell, Hoboken, NJ, 4 edition, January 2016.
[2] Clément Bonnet and Matthew V Macfarlane. Searching latent program spaces, 2024. URL https://arxiv.org/abs/2411.08706.
[3] Yiming Chen, Atef Z. Elsherbeni, and Veysel Demir. Machine learning for microstrip patch antenna design:
Observations and recommendations. In 2022 United States National Committee of URSI National Radio Science
Meeting (USNC-URSI NRSM), pages 256–257, 2022. doi: 10.23919/USNC-URSINRSM57467.2022.9881476.
[4] Yiming Chen, Atef Z. Elsherbeni, and Veysel Demir. Machine learning design of printed patch antenna. In
2022 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting
(AP-S/URSI), pages 201–202, 2022. doi: 10.1109/AP-S/USNC-URSI47032.2022.9887043.
[5] François Chollet. On the measure of intelligence, 2019. URL [Link]
[6] Hilal El Misilmani, Tarek Naous, and Salwa Al Khatib. A review on the design and optimization of antennas
using machine learning algorithms and techniques. International Journal of RF and Microwave Computer-Aided
Engineering, 2020, 07 2020. doi: 10.1002/mmce.22356.
[7] Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, and Adam Roberts. Gansynth: Adversarial
neural audio synthesis. In ICLR, 2019.
[8] Rafael Gómez-Bombarelli, Jennifer N. Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-
Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams, and Alán Aspuru-
Guzik. Automatic chemical design using a data-driven continuous representation of molecules. ACS Central
Science, 4(2):268–276, Feb 2018. ISSN 2374-7943. doi: 10.1021/acscentsci.7b00572. URL https://doi.org/10.1021/acscentsci.7b00572.
[9] Erik Hammerstad. Equations for microstrip circuit design. 1975 5th European Microwave Conference, pages
268–272, 1975. URL [Link]
[10] Irina Higgins, Loic Matthey, Arka Pal, et al. β-vae: Learning basic visual concepts with a constrained variational
framework. In ICLR, 2017.
[11] Jianming Jin. The Finite Element Method in Electromagnetics. Wiley - IEEE. John Wiley & Sons, Nashville, TN,
3 edition, March 2014.
[12] John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn
Tunyasuvunakool, Russ Bates, Augustin Žídek, Anna Potapenko, Alex Bridgland, Clemens Meyer, Simon
A. A. Kohl, Andrew J. Ballard, Andrew Cowie, Bernardino Romera-Paredes, Stanislav Nikolov, Rishub Jain,
Jonas Adler, Trevor Back, Stig Petersen, David Reiman, Ellen Clancy, Michal Zielinski, Martin Steinegger,
Michalina Pacholska, Tamas Berghammer, Sebastian Bodenstein, David Silver, Oriol Vinyals, Andrew W. Senior,
Koray Kavukcuoglu, Pushmeet Kohli, and Demis Hassabis. Highly accurate protein structure prediction with
alphafold. Nature, 596(7873):583–589, July 2021. ISSN 1476-4687. doi: 10.1038/s41586-021-03819-2. URL
[Link]
[13] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. In ICLR, 2014.
[14] Thorsten Liebig. openEMS - open electromagnetic field solver, accessed 2024. URL [Link].
[15] Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. Adversarial autoencoders.
arXiv:1511.05644, 2015.
[16] Amil Merchant, Simon Batzner, Samuel S. Schoenholz, Muratahan Aykol, Gowoon Cheon, and Ekin Dogus Cubuk.
Scaling deep learning for materials discovery. Nature, 624(7990):80–85, November 2023. ISSN 1476-4687. doi:
10.1038/s41586-023-06735-9. URL [Link]
[17] Anh Nguyen, Jeff Clune, Yoshua Bengio, Alexey Dosovitskiy, and Jason Yosinski. Plug & play generative
networks: Conditional iterative generation of images in latent space, 2017. URL https://arxiv.org/abs/1612.00005.
[18] Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M Pawan Kumar,
Emilien Dupont, Francisco J R Ruiz, Jordan S Ellenberg, Pengming Wang, Omar Fawzi, Pushmeet Kohli, and
Alhussein Fawzi. Mathematical discoveries from program search with large language models. Nature, 625(7995):
468–475, January 2024.
[19] Maximilian Seitzer, Arash Tavakoli, Dimitrije Antic, and Georg Martius. On the pitfalls of heteroscedastic
uncertainty estimation with probabilistic neural networks, 2022. URL [Link]
[20] Yashika Sharma, Hao Helen Zhang, and Hao Xin. Machine learning techniques for optimizing design of double
t-shaped monopole antenna. IEEE Transactions on Antennas and Propagation, 68(7):5658–5663, 2020. doi:
10.1109/TAP.2020.2966051.
[21] David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian
Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe,
John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore
Graepel, and Demis Hassabis. Mastering the game of go with deep neural networks and tree search. Nature, 529
(7587):484–489, January 2016.
[22] Kihyuk Sohn, Honglak Lee, and Xinchen Yan. Learning structured output representation using deep conditional
generative models. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural
Information Processing Systems, volume 28. Curran Associates, Inc., 2015. URL [Link]
[Link]/paper_files/paper/2015/file/[Link].
[23] Allen Taflove and Susan Hagness. Computational electrodynamics. Artech House antennas and propagation
library. Artech House, Norwood, MA, 3 edition, May 2005.
[24] Trieu H Trinh, Yuhuai Wu, Quoc V Le, He He, and Thang Luong. Solving olympiad geometry without human
demonstrations. Nature, 625(7995):476–482, January 2024.
[25] Zhaohui Wei, Zhao Zhou, Peng Wang, Jian Ren, Yingzeng Yin, Gert Frølund Pedersen, and Ming Shen. Automated
antenna design via domain knowledge-informed reinforcement learning and imitation learning. IEEE Transactions
on Antennas and Propagation, 71(7):5549–5557, 2023. doi: 10.1109/TAP.2023.3266051.