Arm in HPC
Brent Gorda
September 25th, 2019
© 2019 Arm Limited
Ongoing research in Bristol: New Drugs ‘In Silico’
• Parkinson’s & Osteoporosis
Images courtesy of Bristol University
Multiphysics Simulations: Fluid Dynamics, Heat Diffusion, Electromagnetics
Images courtesy of Bristol University
What is “Super” or “High Performance” Computing?
Lake Tahoe holds ~40 trillion gallons of water (4.0x10^13)
~2002: supercomputers hit 40 teraflops, 4.0x10^13 flop/s (Earth Simulator – Japan/NEC)
What is “Super” or “High Performance” Computing?
The Great Lakes hold ~6.5 quadrillion gallons of water (6.5x10^15)
2008: supercomputers hit 1 petaflop, 1.0x10^15 flop/s (IBM Roadrunner – US)
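The analogy maps gallons to floating-point operations per second: each milestone machine could perform one operation per gallon of its lake in roughly a second. A minimal sketch of the arithmetic (illustrative Python, values from the two slides above):

```python
# One gallon of water per floating-point operation:
# a machine sustaining N flop/s "counts" a lake of N gallons in one second.

lake_tahoe_gallons = 4.0e13     # ~40 trillion gallons
great_lakes_gallons = 6.5e15    # ~6.5 quadrillion gallons

earth_simulator_flops = 40e12   # 40 TFlop/s (2002)
roadrunner_flops = 1.0e15       # 1 PFlop/s (2008)

print(lake_tahoe_gallons / earth_simulator_flops)   # 1.0 second
print(great_lakes_gallons / roadrunner_flops)       # 6.5 seconds
```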
Top500 systems over the past 25 years
[Chart: Top500 performance development, 1994-2018, log scale from 100 MFlop/s to 10 EFlop/s. Three series: SUM (total of all 500 systems), N=1 (fastest system), and N=500 (slowest listed system), rising from 1.17 TFlop/s, 59.7 GFlop/s, and 422 MFlop/s at the start to 1.56 EFlop/s, 149 PFlop/s, and 1.01 PFlop/s by the end. Annotated systems: Earth Simulator (NEC ‘02), Jaguar (AMD ‘09), Astra (HPE/Arm).]
Images courtesy of www.top500.org
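On a log scale the three curves are nearly straight lines, i.e. steady exponential growth. A rough check of the implied annual growth for the N=1 series, using the endpoint values above (illustrative Python; reading the endpoints as the 1993 and 2019 lists is an assumption):

```python
# Compound annual growth implied by the chart's N=1 endpoints.
start_flops = 59.7e9      # 59.7 GFlop/s (assumed 1993 list)
end_flops = 149e15        # 149 PFlop/s (assumed 2019 list)
years = 2019 - 1993

cagr = (end_flops / start_flops) ** (1 / years) - 1
print(f"#1 system grew ~{cagr:.0%} per year")   # ~76% per year
```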
These are not embedded devices:
Images courtesy of Bristol University
Mont-Blanc: Early Research into the Efficacy of Arm for HPC
The “legacy” Mont-Blanc vision: leverage the fast-growing market of mobile technology for scientific computation, HPC, and data centers.
[Timeline, 2012-2018: Mont-Blanc, followed by Mont-Blanc 2 and Mont-Blanc 3]
Catalyst UK: Accelerating Arm Adoption in the UK

Program Goals
– Deployment: HPC clusters deployed at multiple UK sites, supported for a 3-year period and providing access to academia & industry
– Adoption: early adoption of Arm for HPC in the UK; Apollo 70 Early Ship followed by customer collaboration
– Applications: customer-driven porting and optimization
– Collaboration: leveraging the successful “Project Comanche” model of customer-centric collaboration, but based on the Early Ship HPE Apollo 70 product
– Exascale: establish a foundation for exascale collaboration

Measures of Success (intended outcomes)
– Critical HPC apps ported and demonstrated
– ISV engagements and demonstrations
– Demonstrated performance improvements
– Publications and follow-on collaborations
– Bugs filed, fixed & upstreamed to open source

Industry Partners
– HPE: Apollo 70, HPE Performance Software – Cluster Manager, HPE Performance Software – Message Passing Interface
– Arm: Allinea Studio (Compiler, Libraries, Forge DDT & MAP), OpenHPC
– Mellanox: OFED, HPC-X, OpenMPI, OpenSHMEM, MXM, SHArP
– SuSE: SLES, OpenStack, HPC Module
– Cavium: ThunderX2 SoC, technical support
– Qualcomm: Centriq SoC, technical support (tentative)

UK Collaborations
– EPCC: WRF, OpenFOAM, Rolls-Royce Hydra optimization, 2 PhD candidates
– Leicester: data-intensive apps, genomics, MOAB/Torque, DiRAC collaboration
– Bristol: VASP, CASTEP, GROMACS, CP2K, Unified Model, Hydra, NAMD, Oasis, NEMO, OpenIFS, CASINO, LAMMPS
– UK Government: Dept. for Business, Energy & Industrial Strategy (BEIS)

Configs & Timeline
Typical for each site:
– 64 Apollo 70 compute nodes: Cavium ThunderX2 32c, 2.2 GHz; 256 GB memory (16 GB DIMMs); IB EDR ConnectX-5 Clos fabric; 4,096+ cores (see the sanity check below)
– Services/storage: 6 CL4300 (tentative); Qualcomm Centriq
Timeline: Sep-Dec: structure partnership, alignment; Jan: HPE/Arm SOW; Feb: customer SoWs, quotations, POs; Mar: SW stack validation (3rd-party runtime library); Apr: systems build, public announcements; May: delivery and acceptance
HPE will deliver >12,000 cores across 3 sites, amongst the largest Arm HPC deployments in the world.
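A quick sanity check on those core counts (illustrative Python; two sockets per node is an assumption about the Apollo 70 configuration):

```python
# Per-site and program-wide core counts for Catalyst UK (slide values;
# sockets_per_node = 2 is an assumption about the Apollo 70 configuration).
nodes_per_site = 64
sockets_per_node = 2
cores_per_socket = 32               # Cavium ThunderX2, 32c @ 2.2 GHz

cores_per_site = nodes_per_site * sockets_per_node * cores_per_socket
print(cores_per_site)               # 4096 -> the "4,096+ cores" per site
print(3 * cores_per_site)           # 12288 -> ">12,000 cores across 3 sites"
```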
Isambard: The World’s First Arm-based Production Supercomputer
Vanguard Astra by HPE: #156 on the Top500
• 2,592 HPE Apollo 70 compute nodes
• 5,184 CPUs, 145,152 cores, 2.3 PFLOPs (peak)
• Marvell ThunderX2 ARM SoC, 28 core, 2.0 GHz
• Memory per node: 128 GB (16 x 8 GB DR DIMMs)
• Aggregate capacity: 332 TB, 885 TB/s (peak)
• Mellanox IB EDR, ConnectX-5
• 112 36-port edge switches, 3 648-port spine switches
• Red Hat RHEL for Arm
• HPE Apollo 4520 all-flash Lustre storage
• Storage Capacity: 403 TB (usable)
• Storage Bandwidth: 244 GB/s
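Most of these aggregates follow directly from the per-node figures. A quick reconstruction (illustrative Python; the 8 flops/cycle/core figure is an assumption based on ThunderX2’s two 128-bit FMA-capable NEON pipes):

```python
# Rebuilding Astra's aggregate specs from the per-node values on the slide.
nodes = 2592
cpus = nodes * 2                    # dual-socket nodes -> 5,184 CPUs
cores = cpus * 28                   # 28 cores per CPU -> 145,152 cores
mem_tb = nodes * 128 / 1000         # 128 GB/node -> ~332 TB aggregate

# Peak flops: cores x 2.0 GHz x 8 flops/cycle (assumed ThunderX2 throughput).
peak_pflops = cores * 2.0e9 * 8 / 1e15

print(cpus, cores, round(mem_tb))   # 5184 145152 332
print(round(peak_pflops, 1))        # ~2.3 PFLOPs (peak), as on the slide
```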
Exascale – the race underway at the high end
Projected Exascale System Dates
U.S.
▪ Sustained ES*: 2022-2023
▪ Peak ES: 2021
▪ ES Vendors: U.S.
▪ Processors: U.S. (some ARM?)
▪ Cost: $500M-$600M per system
(for early systems), plus heavy
R&D investments
China
▪ Sustained ES*: 2021-2022
▪ Peak ES: 2020
▪ Vendors: Chinese (multiple sites)
▪ Processors: Chinese (plus U.S.?)
▪ 13th 5-Year Plan
▪ Cost: $350-$500M per system,
plus heavy R&D
EU
▪ Peak ES: 2023-2024
▪ Pre-ES: 2020-2022 (~$125M)
▪ Vendors: US and then European
▪ Processors: x86, ARM & RISC-V
▪ Initiatives: EuroHPC, EPI, ETP4HPC, JU
▪ Cost: Over $300M per system, plus heavy
R&D investments
Japan
▪ Sustained ES*: ~2021/2022
▪ Peak ES: likely as an AI/ML/DL system
▪ Vendors: Japanese
▪ Processors: Japanese ARM
▪ Cost: ~$1B, including both one system and the R&D costs
▪ They will also build many smaller systems
* 1 exaflops on a 64-bit real application. © Hyperion Research
Exascale – Fujitsu A64FX
Exascale – European Processor Initiative
GPP and Common Architecture
[Block diagram: Zeus general-purpose cores alongside EPAC, MPPA, and eFPGA tiles; HBM and DDR memories; PCIe Gen5 links; HSL links; D2D links to adjacent chiplets]
EPAC – EPI Accelerator (TITAN)
MPPA – Multi-Purpose Processing Array
eFPGA – embedded FPGA
Cryptographic ASIC (EU sovereignty)
Arm HPC Software Ecosystem
– HPC Applications: open-source, owned, and commercial ISV codes
– Job schedulers and resource management: SLURM, IBM LSF, Altair PBS Pro, etc.
– Programming languages: Fortran, C, C++ via GNU, LLVM, Arm & OEMs
– Debug and performance analysis tools: Arm Forge, Rogue Wave, TAU, etc.
– Filesystems: BeeGFS, Lustre, ZFS, HDFS, GPFS
– Parallelism standards: OpenMP (omp/gomp), MPI, SHMEM
– User-space utilities, scripting, containers, and other packages: Singularity, OpenStack, OpenHPC, Python, NumPy, SciPy, etc.
– App/ISA-specific optimizations, optimized libraries and intrinsics: Arm PL, BLAS, FFTW, etc.
– Communication stacks and runtimes: Mellanox IB/OFED/HPC-X, OpenMPI, MPICH, MVAPICH2, OpenSHMEM, OpenUCX, HPE MPI
– Cluster management tools: Bright, HPE CMU, xCAT, Warewulf
– Linux OS distro of choice: RHEL, SUSE, CentOS, …
– Arm Server Ready platform: standard OS-compatible FW and RAS features
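To give a flavor of this stack in use, here is a minimal MPI-plus-NumPy program (a sketch using mpi4py and an MPI runtime from the layers above; the launch line is illustrative) that runs unchanged on aarch64 or x86_64 nodes:

```python
# hello_mpi.py -- minimal MPI hello plus a reduction across ranks.
# Launch (illustrative): mpirun -np 4 python3 hello_mpi.py
from mpi4py import MPI
import numpy as np
import platform

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Each rank computes a partial sum; MPI reduces them onto rank 0.
local = float(np.arange(rank * 1000, (rank + 1) * 1000, dtype=np.float64).sum())
total = comm.reduce(local, op=MPI.SUM, root=0)

print(f"rank {rank}/{size} on {platform.machine()}")  # e.g. 'aarch64'
if rank == 0:
    print("global sum:", total)
```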
Porting HPC apps to the Arm platforms
• The platform just works – porting in 2 days is the common experience
• Build recipes online at https://gitlab.com/arm-hpc/packages/wikis/home
• Apps ported across Chem/Phys, Weather, CFD, Visualization, and Genomics, including: LAMMPS, CESM2, MrBayes, Bowtie, AMBER, ParaView, SIESTA, UM, NAMD, VASP, MILC, WRF, GEANT4, Quantum ESPRESSO, DL-Poly, NEMO, GAMESS, OpenFOAM, VisIt, QMCPACK, Abinit, BLAST, NWCHEM, BWA, GROMACS
Arm Technology Connects the World
• Arm in IoT: 21 billion chips shipped in the past year across mobile, embedded, IoT, automotive, and GPUs … and now, servers
• Partnership is key: we design & license IP, we do not manufacture chips; partners build products for their target markets
• Choice is good: one size does not fit all; HPC is a great fit for co-design and collaboration
The New Architecture
[Diagram: massive amounts of data generated at the edge; critical data flows via 5G and edge devices (Cortex) into HPC, cloud, and data centers]
HPC is an Architecture
• Historically strong focus on high-end systems and balance:
  • B:F (bytes-per-flop) ratios from the late 1990s through 2010 (see the worked example below)
  • Parallel processing at massive scale
  • Low-latency / high-bandwidth interconnects
  • Citing: S/W maintenance, roll-out, cooling/power
• Workloads:
  • Historical workloads: scientific simulation
  • Recent new workloads attracted to the “high-end” capabilities of HPC architectures: big data, deep learning/AI
• HPC leads in technology acceptance (think Formula 1)
HPC is an excellent partner for the ecosystem
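Here B:F is simply bytes of memory bandwidth per floating-point operation. As a worked example (illustrative Python), Astra’s figures from the earlier slide give:

```python
# Bytes-per-flop balance, computed from Astra's slide numbers.
mem_bw_bytes_per_s = 885e12          # aggregate memory bandwidth: 885 TB/s (peak)
peak_flops = 2.3e15                  # 2.3 PFlop/s (peak)

print(round(mem_bw_bytes_per_s / peak_flops, 2))   # ~0.38 bytes per flop
```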
Arm is data-driven, from the edge to the core
The Cloud to Edge Infrastructure Foundation for a World of 1T Intelligent Devices
Thank You
Arm.com/hpc