
2023-05-26

GPUs in CFD



In a chat recently, I heard that computational fluid dynamics (CFD) can’t take advantage of GPUs. That seemed doubtful to me, so I looked it up. It turns out there has been recent work showing that GPUs can greatly accelerate CFD workloads.

This press release on OpenACC’s website describes how a private company, AeroDynamic Solutions, Inc. (ADSCFD), used OpenACC to add GPU support to their proprietary CFD solver, Code LEO, with very good speedups.

By using OpenACC to GPU-accelerate their commercial flow solver, ADSCFD achieved significant value. They realized dramatically improved performance across multiple use cases with speed-ups ranging from 20 to 300 times, reductions in cost to solution of up to 70%, and access to analyses that were once deemed infeasible to instead being achieved within a typical design cycle.

Similar blog posts from Nvidia and ANSYS+Nvidia last year also show significant speedups (between 12x and 33x), along with significant power savings.

Nvidia’s blog post shows results from a beta version of ANSYS Fluent and from Simcenter STAR-CCM+.

Figure 2 shows the performance of the first release of Simcenter STAR-CCM+ 2022.1 against commonly available CPU-only servers. For the tested benchmark, an NVIDIA GPU-equipped server delivers results almost 20x faster than over 100 cores of CPU.

Comparing the Ansys Fluent 2022 beta1 GPU server against CPU-only servers, the Intel Xeon, AMD Rome, and AMD Milan systems managed only ~1.1x speedups over the baseline, while the NVIDIA A100 PCIe 80GB delivered speedups from 5.2x (one GPU) to an impressive 33x (eight GPUs).

ANSYS’s blog post covers the same result as Nvidia’s, showing a 33x speedup using 8 A100 GPUs. They also compare the cost of two equal-throughput clusters, one using GPUs and the other CPUs only:

1 NVIDIA A100 GPU ≈ 272 Intel® Xeon® Gold 6242 Cores

Comparing the older V100 GPUs with Intel® Xeon® Gold 6242 CPUs, a 6x V100 GPU cluster would cost $71,250 while the equivalent CPU-only cluster would cost $500,000, i.e. the GPU cluster comes in at about one seventh of the price.

2020-11-24

New Cerebras wafer-scale single server outperforms Joule supercomputer

From HPC Wire:

Cerebras Systems, a pioneer in high performance artificial intelligence (AI) compute, today announced record-breaking performance on a scientific compute workload. In collaboration with the Department of Energy’s National Energy Technology Laboratory (NETL), Cerebras demonstrated its CS-1 delivering speeds beyond what either CPUs or GPUs are currently able to achieve. Specifically, the CS-1 was 200 times faster than the Joule Supercomputer on the key workload of Computational Fluid Dynamics (CFD).

While Cerebras’s CS-1 system is billed as an AI-focused machine, it outdid the Joule Supercomputer (number 82 in the TOP500) on a non-AI workload. While the Joule cost tens of millions of dollars, occupies dozens of racks, and consumes 450 kW of power, the CS-1 fits in only one-third of a rack.

Cerebras has a good write-up on their blog; the gory details are in the preprint: arXiv:2010.03660 [cs.DC].