The RISC-V METASAT Multicore
and GPU Space Platform and
its Software Stack
MODULAR MODEL-BASED DESIGN AND
TESTING FOR APPLICATIONS IN SATELLITES
Dr. Leonidas Kosmidis
May 8th 2024
METASAT
• 2-year Horizon Europe project: January 2023-December 2024
• TRL 3-4
2 © 2023 Consortium Confidential
Introduction
• Modern and upcoming space systems require increasing levels of
computing power
• Existing space processors cannot provide this performance level
• Need for higher performance hardware in space systems
METASAT Overview
• Modern aerospace systems require new, advanced functionalities
• Artificial Intelligence (AI)
• High Resolution Sensors
• Optical communications
• Advanced Robotics…
• Advanced functionalities require complex hardware and software
compared to the existing space technologies
• High Performance Hardware technologies: Advanced Multi-cores,
GPUs, AI accelerators
• Programming high performance hardware requires complex software:
parallel and GPU programming
Model-Based Design
• Model-Based Design can reduce the development and verification
time for these complex platforms
• Development can be assisted by high level design methods (models)
from which code can be automatically generated
• Correct-by-construction
• Various levels of verification: model-in-the-loop, software-in-the-loop,
processor-in-the-loop, etc.
• Virtual platforms allow starting software development before the hardware is
ready
• Break the dependency between hardware and software development
Hardware Selection
• No hardware with such architectural complexity exists for the space
domain
• COTS Embedded Multicore and GPU devices provide these features
but depend on non-qualifiable software stacks
• GPU drivers available only for Linux
• Blocking point for use in institutional missions
• Design a prototype hardware platform based on the RISC-V ISA
Virtualisation
• Time and space isolation enable faster and easier
integration
• Components can be developed and tested in isolation
• Fault Detection, Isolation and Recovery (FDIR)
• XtratuM NG hypervisor by fentISS
METASAT
• METASAT relies on open source and standardized technologies
• Maximise interoperability and avoid vendor lock-in
• Facilitate the development of a space ecosystem
• ESA’s TASTE Open Source Model Based Design framework,
enhanced with code generation for high performance platforms
such as GPUs
• Open Source Processor technologies such as Gaisler’s NOEL-V
RISC-V processors
• Enhancement with AI processing acceleration capabilities
The METASAT RISC-V Platform
• Mixed Criticality Platform
• FPGA Prototype on a Xilinx VCU118
• Multicore CPU Based on NOEL-V +
SPARROW AI SIMD Accelerator
• Qualifiable software stack for high
criticality software with moderate AI
acceleration needs
The METASAT RISC-V Platform
• SPARROW AI SIMD Accelerator
• High performance at low cost: at least
30% smaller than conventional vector
processors with similar performance
• Minimal core modifications
• incremental qualification
• Key features: reuse of integer register
file, short SIMD unit (8-bit), swizzling,
reductions
• Intrinsics-like software support similar
to ARM’s NEON
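The key features listed above (short 8-bit SIMD lanes held in an ordinary integer register, swizzling, reductions) can be illustrated with a portable C emulation. This is only a sketch: it does not use the SPARROW intrinsics, and the four-lane 32-bit layout and the names `pack4_i8`, `lane_i8`, `swizzle4` and `dot4_i8` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four signed 8-bit lanes into one 32-bit word (lane 0 in the low byte). */
static uint32_t pack4_i8(int8_t a, int8_t b, int8_t c, int8_t d)
{
    return (uint32_t)(uint8_t)a | ((uint32_t)(uint8_t)b << 8) |
           ((uint32_t)(uint8_t)c << 16) | ((uint32_t)(uint8_t)d << 24);
}

/* Extract lane i (0..3) as a signed 8-bit value. */
static int8_t lane_i8(uint32_t v, unsigned i)
{
    assert(i < 4);
    int b = (int)((v >> (8u * i)) & 0xFFu);
    return (int8_t)(b < 128 ? b : b - 256);
}

/* Swizzle: output lane i takes the value of input lane sel[i]. */
static uint32_t swizzle4(uint32_t v, const unsigned sel[4])
{
    uint32_t out = 0;
    for (unsigned i = 0; i < 4; i++) {
        assert(sel[i] < 4);
        out |= ((v >> (8u * sel[i])) & 0xFFu) << (8u * i);
    }
    return out;
}

/* Lane-wise multiply followed by a reduction (sum) into a 32-bit result. */
static int32_t dot4_i8(uint32_t x, uint32_t y)
{
    int32_t sum = 0;
    for (unsigned i = 0; i < 4; i++)
        sum += (int32_t)lane_i8(x, i) * (int32_t)lane_i8(y, i);
    return sum;
}
```

On hardware with a short SIMD unit, each of these loops would collapse into a single instruction operating on the integer register, which is what makes the approach attractive for int8 AI workloads.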
The METASAT RISC-V Platform
• Mixed Criticality Platform
• FPGA Prototype on a Xilinx VCU118
• Configurable Vortex RISC-V GPU
• Enhancements for real-time execution and
reliability
• Qualifiable software stack for tasks
requiring very high performance
• Enable the use of GPUs from bare metal
or an RTOS
• Share the GPU among partitions
• The hardware platform will be open
sourced, along with much of its
software
The METASAT RISC-V Platform: Current Status
▪ FPGA Resource utilisation
▪ Current configuration:
▪ 4 NOEL-V cores (high performance, dual-issue, FPLite, hypervisor support),
each with 2 SPARROW accelerators
▪ L2 cache: L2Lite
▪ GPU: 64-bit, 2 CUs, 4 threads each
▪ GRETH Ethernet
▪ UART
▪ A design space exploration will be performed to find the best
configuration for the project use cases
FPGA Utilization
[Figure: FPGA floorplan with labeled regions: Vortex GPU (CU 1, CU 2, Instr. Cache, Data Memory, Cache Controller, AXI-Lite Controller), Multicore CPU (Core 1 to Core 4, SPARROW Units), L2 Cache and Memory Controller.]
The METASAT RISC-V Platform: Current Status
▪ Able to run OpenMP programs on both FPGA
and QEMU under RTEMS, with and without
XtratuM
▪ See [1] for performance results and
comparison with other multicore
architectures
▪ SPARROW support in RTEMS and XtratuM
▪ RTEMS Compiler modifications
▪ Support for the SPARROW control register
added to RTEMS
▪ TensorFlow Lite Support
[1] M. Solé, J. Wolf, I. Rodriguez, A. Jover, M. M. Trompouki, L. Kosmidis, D.
Steenari. Evaluation of the Multicore Performance Capabilities of the Next
Generation Flight Computers. Digital Avionics Systems Conference (DASC) 2023
Performance evaluation
▪ Matrix multiplication in RTEMS + OpenMP (4 cores)
▪ SPARROW on a single core NOEL-V provides higher performance than a 4-core
OpenMP implementation
▪ 20x overall speedup by using both multicore and SPARROW
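[Figure: bar chart of throughput in MOPS (axis 0 to 400) for 1024x1024 and 2048x2048 matrices, with series 1 Thread, 4 Threads, SPARROW 1 Thread and SPARROW 4 Threads. Speedup labels on the bars: x1 and x1 (1 Thread), x3.8 and x4 (4 Threads), x5.8 and x5.6 (SPARROW 1 Thread), x20.5 and x20.1 (SPARROW 4 Threads).]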
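For readers unfamiliar with the benchmark shape, a multicore matrix multiplication of the kind measured on this slide might look like the following sketch. It is not the code used in the evaluation: the int8 input type, the function names and the convention of counting 2·n³ operations for the MOPS figure are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C = A * B for n x n row-major matrices; int8 inputs, int32 accumulation. */
static void matmul_i8(size_t n, const int8_t *a, const int8_t *b, int32_t *c)
{
    assert(n > 0 && a != NULL && b != NULL && c != NULL);
    /* Rows of C are independent, so OpenMP can split them across the cores.
     * Without OpenMP support the loop simply runs on one core. */
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long i = 0; i < (long)n; i++) {
        for (size_t j = 0; j < n; j++) {
            int32_t acc = 0;
            for (size_t k = 0; k < n; k++)
                acc += (int32_t)a[(size_t)i * n + k] * (int32_t)b[k * n + j];
            c[(size_t)i * n + j] = acc;
        }
    }
}

/* Throughput in millions of operations per second.
 * Assumption: one multiply and one add per inner iteration, i.e. 2*n^3 ops. */
static double matmul_mops(size_t n, double seconds)
{
    assert(seconds > 0.0);
    double ops = 2.0 * (double)n * (double)n * (double)n;
    return ops / seconds / 1e6;
}
```

The inner loop over `k` is the part a SIMD unit would accelerate, while the outer loop over `i` is what the four cores share, which is why the two speedups can combine.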
The METASAT RISC-V Platform: Current Status
▪ NOEL-V Integration with Vortex GPU
▪ AXI interface added to Vortex
▪ Able to offload a GPU kernel in bare-metal, RTEMS and under the XtratuM hypervisor
▪ Established programming methodology for using the GPU in the METASAT platform
▪ Precompile the GPU kernel
▪ Common practice in safety critical systems (Khronos OpenGL SC, Vulkan SC)
▪ Embed the kernel binary in the program executable
▪ No filesystem
▪ Linker script modifications
▪ Supported GPU APIs:
▪ Vortex GPU API
▪ Ongoing work on Brook Auto[1]/BRASIL[2], OpenCL and OpenGL SC 2.0
▪ CUDA code generation from Matlab/Simulink and CUDA→Brook Auto translator
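The "precompile and embed" methodology above can be sketched in C as follows. This is a hedged illustration, not the Vortex GPU API: the placeholder bytes, the `gpu_mem` buffer standing in for device-visible memory and the `load_kernel` helper are all hypothetical. In a real build the image would come from the precompiled kernel, either as a generated array or through symbols defined by the modified linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Precompiled kernel image carried inside the executable, so no filesystem
 * is needed at run time. The bytes here are placeholders only. */
static const uint8_t kernel_image[] = {
    0x13, 0x00, 0x00, 0x00, 0x73, 0x00, 0x10, 0x00
};

/* Hypothetical stand-in for the memory region the GPU fetches code from. */
#define GPU_MEM_SIZE 64u
static uint8_t gpu_mem[GPU_MEM_SIZE];

/* Copy an embedded image to offset dst in the stand-in GPU memory.
 * Returns 0 on success, -1 if the image does not fit. */
static int load_kernel(size_t dst, const uint8_t *image, size_t size)
{
    assert(image != NULL);
    if (dst > GPU_MEM_SIZE || size > GPU_MEM_SIZE - dst)
        return -1;
    memcpy(&gpu_mem[dst], image, size);
    return 0;
}
```

Because the image size is known at link time, the bounds check can reject an oversized kernel deterministically, which fits the static style expected in safety-critical software.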
[1] Trompouki and Kosmidis, Brook Auto: High-Level Certification-Friendly Programming for GPU-powered Automotive Systems
[DAC’18], https://github.com/lkosmid/brook
[2] Trompouki and Kosmidis, BRASIL: A High-Integrity GPGPU Toolchain for Automotive Systems [ICCD’19]
The METASAT Accelerators
[Figure: block diagram of the multicore (four CPUs) and the GPU (four CUs), with L2 and L3 cache blocks.]
The METASAT RISC-V Platform
▪ METASAT multicore CPU modeled in QEMU
▪ Ongoing support for SPARROW
▪ Vortex GPU is simulated in Verilator
▪ Cycle-accurate behavioural simulation
▪ SystemVerilog to SystemC/C++
▪ Work started with GR740 and GR712RC models
▪ Accurate modeling of the GR740 GRFPU
▪ The GRFPU cannot handle subnormal numbers as input and
raises an unfinished_Fpop exception
▪ Handling denormalized numbers with the GRFPU
▪ https://www.gaisler.com/doc/antn/GRLIB-AN-0007.pdf
▪ Can boot unmodified Linux and RTEMS
binaries
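To make the subnormal-input case concrete, the sketch below classifies single-precision operands with the standard C `fpclassify` macro. It is not taken from the QEMU model or from the Gaisler application note; the helper names and the flush-to-zero policy are assumptions chosen for illustration of how software could detect or sanitise such inputs.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stddef.h>

/* True if x is a subnormal (denormalized) number. */
static int is_subnormal(float x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Replace a subnormal value by a zero of the same sign; pass others through. */
static float flush_subnormal(float x)
{
    if (!is_subnormal(x))
        return x;
    return signbit(x) ? -0.0f : 0.0f;
}

/* Count how many elements of a buffer are subnormal. */
static size_t count_subnormals(const float *v, size_t n)
{
    assert(v != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (is_subnormal(v[i]))
            count++;
    return count;
}
```

A model that aims for accuracy would perform a similar check on each operand and raise the exception instead of computing a result, so that software sees the same behaviour as on the real FPU.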
The METASAT RISC-V Platform
▪ NOEL-V model in QEMU
▪ Can boot unmodified Linux and RTEMS binaries
▪ METASAT QEMU Model
▪ Partial SPARROW support
▪ L2-lite cache controller
▪ Preliminary release and tutorial:
▪ https://gitlab.bsc.es/metasat-public/sparrow-tutorial
▪ Matrix multiplication (GEMM)
▪ Accelerated GEMM (gemmAcc.c)
▪ Modify it to use SPARROW
▪ SPARROW version in solution.c
▪ Binary release – tested in Ubuntu 22.04
▪ Docker container
RISC-V Support in TASTE
RISC-V support added in TASTE
• RTEMS
• RISC-V compiler
• QEMU RISC-V Simulator
Minimal communication example:
Project Use Cases
• Three project use cases will be implemented
• OHB/DLR Use Case
• Hardware interlocking
• Protect against wrong software behaviour
• Implement interlocks at software level instead
of hardware
• Reduce cost
• Implement AI Based FDIR
• To be accelerated on the CPU using the
SPARROW AI accelerator
• Housekeeping data from ENMAP
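The idea of implementing an interlock in software rather than in hardware can be illustrated with a small C sketch. The rule used here (two units that must never be switched on together) and all names are hypothetical and are not taken from the OHB/DLR use case; the point is only that every command passes through a single check that refuses unsafe requests.

```c
#include <assert.h>
#include <stdbool.h>

/* Two hypothetical units guarded by the interlock. */
enum unit { UNIT_A = 0, UNIT_B = 1, UNIT_COUNT };

/* Current on/off state of each unit. */
static bool unit_on[UNIT_COUNT];

/* Apply a switch command subject to the interlock rule:
 * UNIT_A and UNIT_B must never be on at the same time.
 * Returns true if the command was accepted, false if it was rejected. */
static bool command_unit(enum unit u, bool on)
{
    assert(u == UNIT_A || u == UNIT_B);
    enum unit other = (u == UNIT_A) ? UNIT_B : UNIT_A;
    if (on && unit_on[other])
        return false;          /* rejected: would violate the interlock */
    unit_on[u] = on;
    return true;
}
```

Running such a check in its own partition under the hypervisor would keep it isolated from the application software whose wrong behaviour it is meant to catch.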
Project Use Cases
• Two BSC-provided use cases based on ESA’s OBPMark-ML open source
benchmarking suite: https://obpmark.github.io
• Cloud screening
• Ship Detection
• To be executed on the GPU
• Check our FSW 2024 pre-recorded video with cloud screening on ESA’s OPS-SAT
Conclusion
▪ METASAT achieves a major milestone towards the use of GPUs and high performance
platforms in space
▪ Provides an open source reference hardware platform
▪ FPGA and virtual
▪ Possible starting point for a future GPU tape out on a radiation tolerant/hardened
process
▪ Solves key limitations that prevent GPUs from being adopted today in institutional missions
▪ Qualifiable software stack
▪ Promising early results
▪ Evaluation to be performed with relevant space use cases
https://metasat-project.eu/
info@metasat-project.eu
https://twitter.com/MetasatProject https://www.linkedin.com/company/metasat-project
METASAT has received funding from the European Union's Horizon
Europe programme under grant agreement number 101082622.
Evaluation: Matrix Multiplication from [1]
[Figure: bar chart of matrix multiplication performance in M(FL)OPS (axis 0 to 8000) for sizes 1024, 2048 and 4096, with float, float omp, int8 and int8 omp variants on the ZCU102, TX2, Xavier, V1605B and METASAT platforms. Power labels under the platforms: 23W, 7.5W, 15W, 10W, 15W, 15W, 5.3W.]
[1] M. Solé et al. Evaluation of the Multicore Performance Capabilities of the Next Generation Flight Computers.
Digital Avionics Systems Conference (DASC) 2023
Evaluation: 2D Correlation from [1]
[Figure: bar chart of 2D correlation performance in MFLOPS (axis 0 to 1800) for sizes 1024, 2048 and 4096, with float and float omp variants on the ZCU102, TX2, Xavier, V1605B and METASAT platforms. Power labels under the platforms: 23W, 7.5W, 15W, 10W, 15W, 15W, 5.3W.]
[1] M. Solé et al. Evaluation of the Multicore Performance Capabilities of the Next Generation Flight Computers.
Digital Avionics Systems Conference (DASC) 2023
Evaluation: Sliding FFT from [1]
[Figure: bar chart of sliding FFT performance in MFLOPS (axis 0 to 1600) for sizes 1024, 2048 and 4096, with float and float omp variants on the ZCU102, TX2, Xavier, V1605B and METASAT platforms. Power labels under the platforms: 23W, 7.5W, 15W, 10W, 15W, 15W, 5.3W.]
[1] M. Solé et al. Evaluation of the Multicore Performance Capabilities of the Next Generation Flight Computers.
Digital Avionics Systems Conference (DASC) 2023
Evaluation: FIR from [1]
[Figure: bar chart of FIR performance in MFLOPS (axis 0 to 300) for sizes 1024, 2048 and 4096, with float and float omp variants on the ZCU102, TX2, Xavier, V1605B and METASAT platforms. Power labels under the platforms: 23W, 7.5W, 15W, 10W, 15W, 15W, 5.3W.]
[1] M. Solé et al. Evaluation of the Multicore Performance Capabilities of the Next Generation Flight Computers.
Digital Avionics Systems Conference (DASC) 2023
Evaluation: 2D Convolution from [1]
[Figure: bar chart of 2D convolution performance in MFLOPS (axis 0 to 800) for sizes 1024, 2048 and 4096, with float and float omp variants on the ZCU102, TX2, Xavier, V1605B and METASAT platforms. Power labels under the platforms: 23W, 7.5W, 15W, 10W, 15W, 15W, 5.3W.]
[1] M. Solé et al. Evaluation of the Multicore Performance Capabilities of the Next Generation Flight Computers.
Digital Avionics Systems Conference (DASC) 2023
Evaluation: CIFAR-10 Inference from [1]
[Figure: bar chart of CIFAR-10 inference performance in FPS (axis 0 to 8000), legend entries 1024, 2048 and 4096, with float and float omp variants on the ZCU102, TX2, Xavier, V1605B and METASAT platforms. Power labels under the platforms: 23W, 7.5W, 15W, 10W, 15W, 15W, 5.3W.]
[1] M. Solé et al. Evaluation of the Multicore Performance Capabilities of the Next Generation Flight Computers.
Digital Avionics Systems Conference (DASC) 2023
Evaluation: SIMD Matrix Multiplication [1]
[Figure: two bar charts of integer matrix multiplication performance in MOPS. First chart (axis 0 to 10000, sizes 1024, 2048 and 4096): int8, int8 omp, Neon Int8 and Neon Int8 omp variants. Second chart (axis 0 to 400, sizes 1024 and 2048): int32, int32 omp, int8, int8 omp, SPARROW and SPARROW omp variants.]
[1] M. Solé et al. Evaluation of the Multicore Performance Capabilities of the Next Generation Flight Computers.
Digital Avionics Systems Conference (DASC) 2023