
AMD and ANSYS® Fluent®

August 2018
Powering the Future of HPC

AMD EPYC™: The right choice for Computational Fluid Dynamics

Designed from the ground up for a new generation of solutions, AMD EPYC implements a philosophy of
choice without restriction. Choose the number of cores and sockets that meet your needs without
sacrificing key features like memory and I/O.

Each EPYC processor can have from 8 to 32 cores with access to incredible amounts of I/O and memory
regardless of the number of cores in use, including 128 PCIe lanes, and access to 2 TB of high speed
memory per socket.

EPYC’s innovative architecture translates to terrific performance at a low cost. More importantly, the
performance you’re paying for is appropriate to the performance you need.

I/O intensive workloads can utilize the plentiful I/O bandwidth with the right number of cores (avoiding
overpaying for unneeded power), while compute-intensive workloads can make use of fully loaded core
counts, dual sockets and plenty of memory.

Exceptional Memory Bandwidth
ANSYS® Fluent® is a memory-intensive workload that benefits from AMD EPYC’s 8 channels of memory
bandwidth and up to 2TB of memory per processor.

Standards Based
AMD is committed to industry standards, offering you a choice in x86 architecture with design
innovations that target the evolving needs of modern datacenters.

High Density, Low Cost
Compute requirements are increasing, datacenter space is not. AMD’s EPYC processor offers high core
density with full access to all features. Innovative architecture means outstanding performance at a
low cost.

Partner Ecosystem
AMD’s broad partner ecosystem and collaborative engineering provide tested and validated solutions that
help lower your risk and your total cost of ownership.

ANSYS Fluent: Simulating complex real-world problems
ANSYS Fluent is a general-purpose computational fluid dynamics (CFD) and multi-physics tool that
empowers you to go further and faster as you optimize your product’s performance.

AMD EPYC processors help enable more performance, flexibility, and security
PERFORMANCE. AMD EPYC processors bring a new balance to the datacenter. Utilizing an x86 architecture, the AMD EPYC processor brings
together high core counts, large memory capacity, ample memory bandwidth and massive I/O with the right ratios to help performance reach
new heights.
FLEXIBILITY. Match core count with application needs without compromising processor features. EPYC’s balanced set of resources means
more freedom to right-size the server configuration to the workload.
SECURITY. AMD EPYC features the industry’s first dedicated security processor embedded in an x86-architecture server processor. The
security processor manages secure boot, memory encryption, and secure virtualization on the processor itself. Encryption keys never leave the
processor, so they are not exposed to intruders.
SCALABILITY. Scale-up or scale-out, AMD and its ecosystem partners offer high-performance network connectivity options for applications at
massive scale.

2018 © Advanced Micro Devices, Inc.


AMD EPYC for Computational Fluid Dynamics
Memory bandwidth is a critical factor in maximizing performance of Computational Fluid Dynamics (CFD)
workloads. AMD EPYC server processors’ exceptional memory bandwidth ensures that you get the most out of
your system, minimizing execution time and increasing overall utilization of your deployment.

The EPYC Advantage: AMD EPYC server processors offer 8 memory channels of DDR4-2666 and up to 2TB of
memory per processor, yielding exceptional memory bandwidth and capacity.

Many High-Performance Compute (HPC) workloads require you to balance performance vs per-core license
costs to manage your overall cost. AMD EPYC processors offer a consistent set of features across the
product line, allowing users to optimize the number of cores required for their workloads without
sacrificing features, memory channels, memory capacity, or I/O lanes. Whether you need 8, 16, 24, or 32
physical cores per socket, you will have access to 8 channels of memory per processor across all EPYC
server processors.

The EPYC Advantage: Performance - The AMD EPYC processor brings new balance to the datacenter. The
highest core count yet in an AMD x86-architecture server processor, large memory capacity, memory
bandwidth and I/O density are all brought together with the right ratios to help performance reach new
heights.

As workloads demand more processor cores, communication between processor cores becomes critical to
efficiently solving the complex problems faced by customers. As cluster sizes increase, communication
requirements between nodes rise quickly and can limit scaling at large node counts. AMD and ANSYS have
collaborated to offer solutions for CFD workloads enabling exceptional performance and low
implementation costs.

ANSYS Fluent
ANSYS Fluent is a general-purpose computational fluid dynamics (CFD) and multi-physics tool that
empowers you to go further and faster as you optimize your product’s performance.

Fluent software contains the broad physical modeling capabilities needed to model flow, turbulence, heat
transfer, and reactions for industrial applications, ranging from air flow over an aircraft wing to
combustion in a furnace, from bubble columns to oil platforms, from blood flow to semiconductor
manufacturing, and from clean room design to wastewater treatment plants.

Fluent covers a broad reach, including special models with capabilities to model in-cylinder combustion,
aero-acoustics, turbomachinery and multiphase systems.

AMD and ANSYS have continued their partnership to deliver exceptional performance for customers.

[Image courtesy of ANSYS, Inc.]

The EPYC Advantage: Collaboration between AMD and ANSYS offers high performance and scalability for
Computational Fluid Dynamics (CFD) workloads. Customers across automotive, aerospace, consumer goods,
construction, defense, energy, healthcare, industrial equipment and rotating machinery, and
bioengineering industries can benefit from the combination of AMD and ANSYS.
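
As a rough check on the memory-bandwidth claim in this section, the sketch below computes the theoretical peak bandwidth of 8 DDR4 channels at the 2666 MT/s speed grade listed in the tested configuration, and divides it across the core counts mentioned above. It assumes a 64-bit (8-byte) data bus per DDR4 channel; the function names are illustrative only. These are theoretical peaks, not measured values, and sustained bandwidth in practice will be lower.

```python
# Theoretical peak DRAM bandwidth: channels x transfers per second x bytes per transfer.
# Assumption (not stated in the paper): each DDR4 channel has a 64-bit (8-byte) data bus.

BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9


def bandwidth_per_core_gbs(channels: int, mega_transfers_per_s: int, cores: int) -> float:
    """Peak bandwidth of one socket divided evenly across its cores."""
    return peak_bandwidth_gbs(channels, mega_transfers_per_s) / cores


if __name__ == "__main__":
    socket = peak_bandwidth_gbs(8, 2666)
    print(f"per socket: {socket:.1f} GB/s, per dual-socket node: {2 * socket:.1f} GB/s")
    # Core counts per socket named in the text: 8, 16, 24, 32.
    for cores in (8, 16, 24, 32):
        per_core = bandwidth_per_core_gbs(8, 2666, cores)
        print(f"{cores:2d} cores/socket: {per_core:.2f} GB/s per core")
```

Because every EPYC model keeps all 8 channels, the per-socket figure stays constant while the per-core share shrinks as cores are added, which is the trade-off the text describes.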



Performance Benchmarks and Testing
The ANSYS Fluent benchmark suite provides ANSYS Fluent hardware performance data measured using sets of
benchmark problems selected to represent typical usage. The ANSYS Fluent benchmark cases range in size
from a few hundred-thousand cells to more than 100 million cells. The suite contains both pressure-based
(segregated and coupled) and density-based implicit solver cases using a variety of cell types and a
range of physics.

This document focuses on 5 of the largest models in the Release 18.0 test cases, ranging from 15 million
cells up to 280 million cells per model. Intel Xeon E5-2695v4 results shown here come from performance
results posted at the link below. More information on the benchmark suite can also be found at this link:
https://www.ansys.com/solutions/solutions-by-role/it-professionals/platform-support/benchmarks-overview/ansys-fluent-benchmarks

AMD EPYC testing was performed on a 32-node cluster of dual-socket 7351 and 7451 processors. Each EPYC
7351 processor has 16 cores with a base frequency of 2.4 GHz and a boost frequency of 2.9 GHz. Each 7451
processor has 24 cores with a base frequency of 2.3 GHz and boost of 3.2 GHz. Each system has a total of
16 channels of dual-rank DDR4-2666 memory, 8 channels per processor.

Tested Hardware/Software configuration

Compute Nodes
  CPUs                  2 x EPYC 7351 -OR- 2 x EPYC 7451
  Cores                 16 per socket / 32 cores per node
  Memory                256GB Dual-Rank DDR4-2666
  NIC                   Mellanox ConnectX-5 EDR 100Gb InfiniBand x16 PCIe
  Storage: OS           1 x 256 GB NVMe
  Storage: Data         1 x 1 TB NVMe

Software
  OS                    RHEL 7.5 (3.10.0-862.el7.x86_64)
  Mellanox OFED Driver  MLNX_OFED_LINUX-4.3-3.0.2.1 (OFED-4.3-3.0.2)
  MPI Version           Platform MPI (platform_mpi-09.01.04.03)
  Application           ANSYS Fluent 19.1

Network
  Switch                Mellanox EDR 100Gb/s Managed Switch (MSB7800-ES2F)

Configuration Options
  BIOS Setting          SMT=OFF, Boost=ON, Determinism Slider = Power
  OS Settings           Transparent Huge Pages=ON (Default), Swappiness=0, Governor=Performance
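
The OS settings listed in the tested configuration (Transparent Huge Pages, swappiness, CPU governor) can be checked on a Linux node by reading the standard procfs/sysfs files. The sketch below is one possible way to do that, not the procedure used for the paper. It assumes that "Transparent Huge Pages=ON" corresponds to the `always` mode, it only inspects `cpu0` for the governor, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional, Tuple

# Standard Linux locations for the three OS settings named in the configuration table.
SWAPPINESS = Path("/proc/sys/vm/swappiness")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
GOVERNOR = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")  # cpu0 only

# Expected values; mapping "THP=ON" to "always" is an assumption.
EXPECTED = {"thp": "always", "swappiness": "0", "governor": "performance"}


def parse_bracketed_choice(text: str) -> str:
    """Return the active option from a sysfs line such as 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return text.strip()


def read_setting(path: Path) -> Optional[str]:
    """Read a one-line setting, or return None if the file is missing or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def check_node() -> Dict[str, Tuple[Optional[str], bool]]:
    """Map each setting to (current value, whether it matches the expected value)."""
    raw_thp = read_setting(THP_ENABLED)
    current = {
        "thp": parse_bracketed_choice(raw_thp) if raw_thp is not None else None,
        "swappiness": read_setting(SWAPPINESS),
        "governor": read_setting(GOVERNOR),
    }
    return {name: (value, value == EXPECTED[name]) for name, value in current.items()}


if __name__ == "__main__":
    for name, (value, ok) in check_node().items():
        print(f"{name:10s} {value!s:12s} {'OK' if ok else 'differs'}")
```

On systems where a file is absent (for example, no cpufreq driver), the value is reported as `None` rather than raising an error.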



ANSYS Fluent Performance: Boeing Landing Gear Analysis (landing_gear_15m)
The 6th largest model in the suite performs an analysis of Boeing landing gear and has around 15 million
Mixed cells. It uses the realizable LES, Acoustics model and the Pressure based coupled solver, Least Squares
cell based, Unsteady solver.

The results below show scaling up to 32 nodes of dual-socket EPYC 7351 (16-cores per socket), dual-socket
EPYC 7451 (24-cores per socket), and dual-socket Intel® Xeon® E5-2695v4 (18-cores per socket). With 4
fewer cores per node, the EPYC 7351 processor posts up to a 46% performance advantage over the Intel Xeon
E5-2695v4, and an average advantage of 32% across all node counts. This translates into a per-core
performance advantage of 22% to 65% for the EPYC 7351, with an average of 48%.
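
The step from a per-node lead to a per-core lead can be sketched as follows: scale the per-node performance ratio by the ratio of core counts (36 cores per dual-socket Xeon E5-2695v4 node versus 32 per dual-socket EPYC 7351 node). The function name is illustrative. Feeding in the rounded 46% and 32% figures lands close to, though not exactly on, the quoted 65% and 48%, presumably because the published percentages were each rounded from the underlying ratings.

```python
def per_core_advantage(per_node_advantage: float,
                       cores_per_node: int,
                       baseline_cores_per_node: int) -> float:
    """Convert a per-node lead into a per-core lead.

    Advantages are fractions (0.46 means 46%). The per-node performance
    ratio is multiplied by baseline_cores / cores to normalize per core.
    """
    node_ratio = 1.0 + per_node_advantage
    core_ratio = node_ratio * baseline_cores_per_node / cores_per_node
    return core_ratio - 1.0


if __name__ == "__main__":
    # Dual-socket EPYC 7351: 32 cores per node; dual-socket Xeon E5-2695v4: 36.
    for label, lead in (("peak", 0.46), ("average", 0.32)):
        print(f"{label}: {per_core_advantage(lead, 32, 36):.1%} per core")
```

The same function applies to the EPYC 7451 by passing 48 cores per node, in which case the per-core lead is smaller than the per-node lead.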

The EPYC 7451 maintains between 54% and 86% performance leadership over the Intel Xeon E5-2695v4 up
through 16 nodes. At 32 nodes on this smaller model, the EPYC 7451 posts a modest 27% performance
advantage.

[Figure: ANSYS Fluent - Boeing Landing Gear Analysis (7351-landing_gear_15m). EPYC 7351 (16 cores) vs. EPYC 7451 (24 cores) vs. Xeon 2695v4 (18 cores)¹. Y-axis: Core Solver Rating (higher is better); X-axis: # of Nodes (1 to 32). Series: EPYC 7451 (24 cores), EPYC 7351 (16 cores), Xeon 2695v4 (18 cores).]



ANSYS Fluent Performance: Vehicle Exhaust model (exhaust_system_33m)
This case is the 4th largest model in the suite and has around 33 million Mixed cells. It uses the SST K-omega
Turbulence model and the Pressure based coupled solver, Least Squares cell based, steady solver.
The results below show scaling up to 32 nodes of dual-socket EPYC 7351 (16-cores). In overall performance,
the EPYC 7351 processor maintains between 22% and 42% performance advantage over the Intel Xeon
E5-2695v4. On a per-core basis, the EPYC 7351 maintains a solid performance advantage of 38% to 60%.
The EPYC 7451 maintains between 63% and 79% performance leadership over the Intel Xeon E5-2695v4 up
through 16 nodes. Again, on this smaller model the lead at 32 nodes is more modest, but still impressive:
a 36% performance advantage for the EPYC 7451.

[Figure: ANSYS Fluent - Vehicle Exhaust Model (7351-exhaust_system_33m). EPYC 7351 (16 cores) vs. EPYC 7451 (24 cores) vs. Xeon 2695v4 (18 cores)¹. Y-axis: Core Solver Rating (higher is better); X-axis: # of Nodes (1 to 32). Series: EPYC 7451 (24 cores), EPYC 7351 (16 cores), Xeon 2695v4 (18 cores).]



ANSYS Fluent Performance: Flow through a Combustor (combustor_71m)
This case has around 71 million Hex-core cells and uses the LES, Species Non-Premixed Combustion, PDF,
DPM model and the Pressure based coupled solver, Least Squares cell based, Unsteady solver. Note that as
the models get larger, they are harder to fit into the memory of a smaller number of nodes. The posted
results of the Intel Xeon E5-2695v4 testing only include results for 4 nodes and higher for combustor_71m.

The EPYC 7451 maintains a 69% to 78% performance advantage across 4, 8, 16, and 32-node configurations
over the Intel Xeon E5-2695v4, while the 7351 maintains a 38-45% lead per node and a 55% to 63% lead on
a per-core basis.

[Figure: ANSYS Fluent - Flow through a Combustor (7351-combustor_71m). EPYC 7351 (16 cores) vs. EPYC 7451 (24 cores) vs. Xeon 2695v4 (18 cores)¹. Y-axis: Core Solver Rating (higher is better); X-axis: # of Nodes (1 to 32). Series: EPYC 7451 (24 cores), EPYC 7351 (16 cores), Xeon 2695v4 (18 cores).]



ANSYS Fluent Performance: External Flow over a Formula-1 Race Car (f1_racecar_140m)
This case has around 140 million Hex-core cells and uses the realizable k-ε turbulence model and the
Pressure based coupled solver, Least Squares cell based, pseudo transient solver. Due to the size of this
model, only results for 8 nodes and above were posted for the Intel Xeon E5-2695v4 for comparison.

Again, the EPYC 7451 maintains a commanding lead between 68% and 76% over the Intel Xeon E5-2695v4,
while the EPYC 7351 holds a 37% to 40% margin. The EPYC 7351 yields per-core performance margins of
54% to 58%.

[Figure: ANSYS Fluent - External Flow over a Formula-1 Race Car (7351-f1_racecar_140m). EPYC 7351 (16 cores) vs. EPYC 7451 (24 cores) vs. Xeon 2695v4 (18 cores)¹. Y-axis: Core Solver Rating (higher is better); X-axis: # of Nodes (2 to 32). Series: EPYC 7451 (24 cores), EPYC 7351 (16 cores), Xeon 2695v4 (18 cores).]



ANSYS Fluent Performance: External Flow over an Open Wheel Race Car (open_racecar_280m)
This case has around 280 million Hex-core cells and uses the realizable k-ε turbulence model and the
Pressure based coupled solver, cell based, pseudo transient solver. The starting node count for this
comparison is 16 nodes, based on posted data for the Intel Xeon E5-2695v4.

EPYC 7451 maintains performance leadership of 70-72% over Intel Xeon E5-2695v4, while EPYC 7351 holds
at 36% per-node and 53% per-core performance leadership.

[Figure: ANSYS Fluent - External Flow over an Open Wheel Race Car (7351-open_racecar_280m). EPYC 7351 (16 cores) vs. EPYC 7451 (24 cores) vs. Xeon 2695v4 (18 cores)¹. Y-axis: Core Solver Rating (higher is better); X-axis: # of Nodes (4 to 32). Series: EPYC 7451 (24 cores), EPYC 7351 (16 cores), Xeon 2695v4 (18 cores).]



License Costs and Per-core Performance: EPYC 7351
Maximizing software license investments is critically important in the HPC market. At all node counts across
these five large test cases, the EPYC 7351 posts significant leads in overall performance and even larger
leads in per-core performance, by up to 65%. Per-core performance matters most when software is licensed
per core.
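
To illustrate why a per-core lead matters for per-core licensing, the sketch below estimates how many faster cores would be needed to match the throughput of a given number of baseline cores. It is a simplification: it assumes throughput is proportional to core count times per-core performance, ignores scaling losses, and uses a hypothetical function name. The 48% and 65% inputs are the average and peak per-core leads quoted for the EPYC 7351 on the landing gear case.

```python
import math


def cores_to_match(baseline_cores: int, per_core_advantage: float) -> int:
    """Smallest whole number of faster cores whose combined throughput is at
    least that of `baseline_cores` baseline cores.

    Assumes throughput = cores x per-core performance (idealized, linear).
    `per_core_advantage` is a fraction (0.48 means 48%).
    """
    return math.ceil(baseline_cores / (1.0 + per_core_advantage))


if __name__ == "__main__":
    # One dual-socket Xeon E5-2695v4 node has 36 cores.
    for lead in (0.48, 0.65):
        needed = cores_to_match(36, lead)
        print(f"per-core lead {lead:.0%}: {needed} cores match 36 baseline cores")
```

Under these assumptions, fewer licensed cores deliver the same work, which is the license-cost argument made in this section.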

The EPYC 7451 posts dominant overall performance leads of up to 86%. And, even with 12 more cores per
node than the Intel Xeon E5-2695v4, the EPYC 7451 also posted very impressive per-core performance
leadership of up to 39%.

CFD workloads are complex and require finding the right balance of floating-point performance and memory
bandwidth. The exceptional memory bandwidth on EPYC server processors tilts the balance of system
performance. More bandwidth per system means more bandwidth is available to allocate across more cores,
allowing more cores to be efficiently added per system while maintaining per-core performance. As you can
see below, larger models scale better at higher core counts than smaller models.

[Figure: Relative Performance Per Core at Scale - EPYC 7351 vs. Xeon 2695v4 (performance scaled to Xeon 2695v4=1.0 for each benchmark)¹. Y-axis: Relative Performance Per Core (higher is better); X-axis: # of Nodes (1, 2, 4, 8, 16, 32). Series: Xeon 2695v4, 7351-landing_gear_15m, 7351-exhaust_system_33m, 7351-combustor_71m, 7351-f1_racecar_140m, 7351-open_racecar_280m.]



Conclusion
Scale-out testing on the 32-node EPYC cluster shows impressive results on these large models. Results
showed a general, and expected, trend of better scaling as the model sizes increased.

Pure performance was highest with the 24-core EPYC 7451. Per-core performance was highest with the
16-core EPYC 7351, which is important for maximizing your software investment.

Whether you need the dominating system level performance and density of the EPYC 7451, or the equally
dominating per-core performance of the EPYC 7351, both products offer exceptional memory bandwidth. And,
both provide your organization a significant advantage.

The ANSYS Fluent Computational Fluid Dynamics (CFD) application is architected to deliver accuracy,
performance, and scalability to meet your CFD needs, empowering you to go further and faster as you
optimize your product's performance. Fluent includes well-validated physical modeling capabilities to
deliver fast, accurate results across the widest range of CFD and multi-physics applications.

Together, AMD and ANSYS empower the development of fast, accurate Computational Fluid Dynamics
simulations running on cost-effective clustered systems.

For more information about AMD’s EPYC line of processors visit: http://www.amd.com/epyc
For more information about ANSYS visit: http://www.ANSYS.com

¹ ANSYS benchmarks can be found here:
https://www.ansys.com/solutions/solutions-by-role/it-professionals/platform-support/benchmarks-overview/ansys-fluent-benchmarks
Intel results not independently tested by AMD.

Authors
This paper is authored by Kevin Mayo in collaboration with Marc Baker and Anre Kashyap.

DISCLAIMER

The information contained herein is for informational purposes only and is subject to change without notice. While every precaution has been
taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no
obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to
the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of
noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other
products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document.
Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in AMD's
Standard Terms and Conditions of Sale. GD-18

©2018 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, and combinations thereof are trademarks of Advanced
Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective
companies.

ANSYS, the ANSYS logo, the ANSYS Preferred Solution Partner logo, and FLUENT are trademarks or registered trademarks of ANSYS, Inc. Other
product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
