My research focuses on resilient and efficient wireless communication, ranging from advanced spectrum sensing and interference detection and mitigation to the practical design of real-world sensing platforms for highly constrained environments. I aim to combine high-level machine learning, low-level signal processing, and PCB and RTL design with state-of-the-art software-defined radio platforms to build next-generation wireless communication systems.
Please reach out to me by email with questions, suggestions, ideas, or collaboration proposals.
Please refer to the CV section for details about my career path, experience, and extracurricular activities.
Selected Publications
RippleSense: Scalable and Efficient Wideband Spectrum Sensing
Andreas Kuster, Yanbo Zhang, Mo Li
ACM/IEEE International Conference on Embedded Artificial Intelligence and Sensing Systems (SenSys'26),
May 2026
Dynamic Spectrum Sharing (DSS) is essential for optimizing spectrum utilization in modern wireless systems, but it requires high analog-to-digital converter (ADC) sampling rates, leading to increased costs and power consumption. Existing sub-Nyquist sampling techniques partially address these issues but struggle with dense spectrum adaptability, real-time performance, and efficiency.
This paper presents RippleSense, a scalable and efficient wideband spectrum sensing approach capable of capturing GHz of densely occupied spectrum at sub-Nyquist ADC sampling rates. It introduces a novel sub-Nyquist sampling method that injects distinct signatures into the observed signals in different Nyquist zones before sampling, which allows the full spectrum to be reconstructed programmatically even after the Nyquist zones have folded to baseband because of the inadequate ADC sampling rate. To showcase the scalability of this approach, a high-performance, multi-GHz spectrum sensing platform is implemented together with a highly parallelizable reconstruction algorithm that can process the data stream in real time.
Experimental evaluation has shown that the proposed approach supports operating configurations with signal-to-noise ratios as low as –10 dB and time resolutions down to 10 ns, enabling the capture of single radar pulses across bandwidths of up to 10 GHz using our prototype.
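To illustrate why folding makes reconstruction hard, here is a minimal Python sketch of textbook aliasing for an ideal real-valued ADC. It is not the RippleSense reconstruction algorithm; the function names and the 1 GS/s sampling rate are assumptions chosen only for illustration.

```python
def nyquist_zone(f_hz: float, fs_hz: float) -> int:
    """Return the 1-based Nyquist zone of a real tone at f_hz sampled at fs_hz."""
    # Each zone is fs/2 wide; zone 1 covers 0 .. fs/2.
    return int(f_hz // (fs_hz / 2)) + 1


def folded_frequency(f_hz: float, fs_hz: float) -> float:
    """Return the baseband frequency (0 .. fs/2) where the tone appears after sampling."""
    r = f_hz % fs_hz
    # Odd zones map directly; even zones are mirrored around fs/2.
    return r if r <= fs_hz / 2 else fs_hz - r


if __name__ == "__main__":
    fs = 1_000_000_000  # assumed ADC rate (hypothetical): 1 GS/s, zones are 500 MHz wide
    for f in (300_000_000, 700_000_000, 1_300_000_000, 2_300_000_000):
        zone = nyquist_zone(f, fs)
        alias = folded_frequency(f, fs)
        print(f"{f / 1e6:.0f} MHz is in zone {zone} and appears at {alias / 1e6:.0f} MHz")
```

All four tones land on the same baseband frequency, so the samples alone cannot tell which Nyquist zone a signal came from. This is the ambiguity that the per-zone signatures described in the abstract are meant to resolve.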
MC-LoRa: Multi-node Concurrent Localization for LoRaWAN Indoors and Outdoors
Han Hao, Wei Xi, Andreas Kuster, Amalinda Gamage, Xianjin Xia
UbiComp/Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT'25),
March 2025
Multi-node localization is crucial for large-scale and densely deployed Internet of Things (IoT) devices connected via LoRaWAN. Due to limitations in bandwidth and the number of RX chains (antennas), existing LoRaWAN-based localization methods often rely on frequency hopping or additional infrastructure to improve location accuracy. Although promising, these methods struggle to localize multiple nodes during packet collisions. In this paper, we propose MC-LoRa, which features a multi-node localization pipeline that includes reliable preamble detection under the near-far effect, tackling inter-symbol interference among multiple packets, and a virtual antenna array method to obtain extra channel state measurements within a single channel. This approach not only enhances angle resolution in our AoA-based system but also eliminates the need for time-consuming frequency hopping, requiring only software processing in existing gateways. Our extensive evaluation results show that MC-LoRa achieves median errors of 7.1m (single-node), 9.2m (multi-node) in an outdoor area of 140m × 100m, and 2.0m (single-node), 3.9m (multi-node) in an indoor area of 20m × 16m, which represent improvements of 1.1×, 2× and 1.5×, 1.7× compared to the baseline. Additionally, MC-LoRa can provide localization service for hundreds of LoRaWAN nodes with accuracy comparable to that of a state-of-the-art single-node system. Its wide localization range and high accuracy enable MC-LoRa to benefit a variety of applications, including asset tracking, navigation in vast indoor spaces (e.g., airports, warehouses and halls), and smart cities.
@article{10.1145/3712279,
author = {Hao, Han and Xi, Wei and Kuster, Andreas and Gamage, Amalinda and Xia, Xianjin},
title = {MC-LoRa: Multi-node Concurrent Localization for LoRaWAN Indoors and Outdoors},
year = {2025},
issue_date = {March 2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {9},
number = {1},
url = {https://doi.org/10.1145/3712279},
doi = {10.1145/3712279},
abstract = {Multi-node localization is crucial for large-scale and densely deployed Internet of Things (IoT) devices connected via LoRaWAN. Due to limitations in bandwidth and the number of RX chains (antennas), existing LoRaWAN-based localization methods often rely on frequency hopping or additional infrastructure to improve location accuracy. Although promising, these methods struggle to localize multiple nodes during packet collisions. In this paper, we propose MC-LoRa, which features a multi-node localization pipeline that includes reliable preamble detection under the near-far effect, tackling inter-symbol interference among multiple packets, and a virtual antenna array method to obtain extra channel state measurements within a single channel. This approach not only enhances angle resolution in our AoA-based system but also eliminates the need for time-consuming frequency hopping, requiring only software processing in existing gateways. Our extensive evaluation results show that MC-LoRa achieves median errors of 7.1m (single-node), 9.2m (multi-node) in an outdoor area of 140m \texttimes{} 100m, and 2.0m (single-node), 3.9m (multi-node) in an indoor area of 20m \texttimes{} 16m, which represent improvements of 1.1\texttimes{}, 2\texttimes{} and 1.5\texttimes{}, 1.7\texttimes{} compared to the baseline. Additionally, MC-LoRa can provide localization service for hundreds of LoRaWAN nodes with accuracy comparable to that of a state-of-the-art single-node system. Its wide localization range and high accuracy enable MC-LoRa to benefit a variety of applications, including asset tracking, navigation in vast indoor spaces (e.g., airports, warehouses and halls), and smart cities.},
journal = {Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.},
month = mar,
articleno = {4},
numpages = {32},
keywords = {AoA, LoRaWAN, concurrent localization}
}
Beyond the Noise: Innovating Information Verification in the Digital Age
Andreas Kuster
St. Gallen Symposium, Switzerland,
One of the three winners of the 53rd St.Gallen Symposium's Global Essay Competition,
May 2024
Physical memory protection is a hardware mechanism designed to prevent unauthorized access to specific memory regions, enabling the deployment of Trusted Execution Environments (TEEs). The RISC-V instruction set architecture specifies PMP for RISC-V cores but leaves other system bus masters as found in heterogeneous computing systems out of scope. This work presents Protego, an open-source I/O physical memory protection (IOPMP) unit based on the RISC-V PMP specification that extends PMP to other system bus masters. We demonstrate that Protego is effective in protecting sensitive data in memory and preventing unauthorized access at small hardware costs of below 40 kGE for a 64-bit system and negligible performance impact, making it a valuable tool for creating TEEs in heterogeneous computing systems.
StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems
Johannes de Fine Licht, Andreas Kuster, Tiziano De Matteis, Tal Ben-Nun, Dominic Hofer, Torsten Hoefler
In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO'21),
May 2021
Spatial computing devices have been shown to significantly accelerate stencil computations, but have so far relied on unrolling the iterative dimension of a single stencil operation to increase temporal locality. This work considers the general case of mapping directed acyclic graphs of heterogeneous stencil computations to spatial computing systems, assuming large input programs without an iterative component. StencilFlow maximizes temporal locality and ensures deadlock freedom in this setting, providing end-to-end analysis and mapping from a high-level program description to distributed hardware. We evaluate our generated architectures on a Stratix 10 FPGA testbed, yielding 1.31 TOp/s and 4.18 TOp/s on single-device and multi-device, respectively, demonstrating the highest performance recorded for stencil programs on FPGAs to date. We then leverage the framework to study a complex stencil program from a production weather simulation application. Our work enables productively targeting distributed spatial computing systems with large stencil programs, and offers insight into architecture characteristics required for their efficient execution in practice.
@inproceedings{stencilflow,
author = {Johannes de Fine Licht and Andreas Kuster and Tiziano De Matteis and Tal Ben-Nun and Dominic Hofer and Torsten Hoefler},
title = {StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems},
year = {2021},
booktitle = {Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO'21)},
series = {CGO '21},
}