NVIDIA Posts Big AI Numbers In MLPerf Inference v3.1 Benchmarks With Hopper H100, GH200 Superchips & L4 GPUs

NVIDIA has released its official MLPerf Inference v3.1 performance benchmarks running on the world's fastest AI GPUs such as Hopper H100, GH200 & L4.

NVIDIA Dominates The AI Landscape With Hopper & Ada Lovelace GPUs, Strong Performance Showcased In MLPerf v3.1

Today, NVIDIA is releasing its first performance benchmarks within the MLPerf Inference v3.1 benchmark suite, which covers a wide range of industry-standard AI use cases. The workloads span Recommender, Natural Language Processing, Large Language Model, Speech Recognition, Image Classification, Medical Imaging, and Object Detection.

The two new benchmarks in this round are DLRM-DCNv2 and GPT-J 6B. The first is a larger multi-hot dataset representation of real recommenders; it uses a new cross-layer algorithm to deliver better recommendations and has twice the parameter count of the previous version. GPT-J, on the other hand, is a small-scale LLM whose base model is open source and was released in 2021. This workload is designed for summarization tasks.

NVIDIA also showcases a conceptual real-life workload pipeline in which an application chains together a range of AI models to complete a given query or task. All of the models will be available on the NGC platform.

In terms of performance benchmarks, the NVIDIA H100 was tested across the entire MLPerf v3.1 Inference set (Offline) against competitors from Intel (HabanaLabs), Qualcomm (Cloud AI 100) and Google (TPUv5e). NVIDIA delivered leadership performance across all workloads.

To make things a little more interesting, the company states that these benchmarks were achieved about a month ago, since MLPerf requires at least one month between submission and publication of the final results. Since then, NVIDIA has come up with a new technology known as TensorRT-LLM, which further boosts performance by up to 8x. We can expect NVIDIA to submit MLPerf benchmarks with TensorRT-LLM soon, too.

But coming back to the benchmarks, NVIDIA's GH200 Grace Hopper Superchip also made its first submission on MLPerf, yielding a 17% improvement over the H100 GPU. This performance gain comes mainly from the higher VRAM capacity (96 GB HBM3 vs. 80 GB HBM3) and 4 TB/s of memory bandwidth.
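To put the memory figures in perspective, here is a minimal Python sketch of the arithmetic. It uses only the two capacities quoted above; the helper name `relative_gain` is ours and purely illustrative.

```python
def relative_gain(new: float, old: float) -> float:
    """Return the fractional increase of `new` over `old` (0.2 means +20%)."""
    return new / old - 1.0


# HBM3 capacities quoted in the article, in GB.
GH200_HBM3_GB = 96
H100_HBM3_GB = 80

capacity_gain = relative_gain(GH200_HBM3_GB, H100_HBM3_GB)
print(f"GH200 has {capacity_gain:.0%} more HBM3 capacity than H100")
```

In other words, the quoted 17% performance uplift is in the same ballpark as the roughly 20% capacity advantage, although the two numbers measure different things and need not track each other.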

The Hopper GH200 GPU utilizes the same core configuration as the H100, but one key factor behind the boosted performance is the automatic power steering between the Grace CPU and the Hopper GPU. Since the Superchip platform includes power delivery for both the CPU and GPU on the same board, power can essentially be shifted from the CPU to the GPU and vice versa depending on the workload. The extra power headroom lets the GPU run at higher clocks. NVIDIA also mentioned that the Superchip here was running in its 1000W configuration.
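The idea of a shared power budget can be illustrated with a toy Python model. This is only a sketch and not NVIDIA's actual power-management logic: the function `split_power`, the CPU demand values, and the 900W GPU cap are hypothetical, and only the 1000W total comes from the configuration mentioned above.

```python
def split_power(total_budget_w: float, cpu_demand_w: float, gpu_cap_w: float) -> tuple[float, float]:
    """Toy model: serve the CPU first, then give the GPU whatever is left.

    All inputs are in watts. The GPU allocation is limited by `gpu_cap_w`,
    a hypothetical ceiling that is not taken from the article.
    """
    cpu_w = min(max(cpu_demand_w, 0.0), total_budget_w)
    gpu_w = min(total_budget_w - cpu_w, gpu_cap_w)
    return cpu_w, gpu_w


TOTAL_BUDGET_W = 1000.0  # the 1000W Superchip configuration from the article
GPU_CAP_W = 900.0        # hypothetical cap, for illustration only

# A GPU-heavy inference run where the CPU is mostly idle (hypothetical demand).
print(split_power(TOTAL_BUDGET_W, cpu_demand_w=100.0, gpu_cap_w=GPU_CAP_W))
# A CPU-heavy phase leaves less headroom for the GPU (hypothetical demand).
print(split_power(TOTAL_BUDGET_W, cpu_demand_w=300.0, gpu_cap_w=GPU_CAP_W))
```

The point of the sketch is simply that when the CPU draws less, the same board-level budget leaves more power available to the GPU.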

In its debut on the MLPerf industry benchmarks, the NVIDIA GH200 Grace Hopper Superchip ran all data center inference tests, extending the leading performance of NVIDIA H100 Tensor Core GPUs. The overall results showed the exceptional performance and versatility of the NVIDIA AI platform from the cloud to the network’s edge.

The GH200 links a Hopper GPU with a Grace CPU in one superchip. The combination provides more memory, bandwidth and an ability to automatically shift power between the CPU and GPU to optimize performance. Separately, H100 systems that pack eight H100 GPUs delivered the highest throughput on every MLPerf inference test in this round.

Grace Hopper Superchips and H100 GPUs led across all MLPerf’s data center tests, including inference for computer vision, speech recognition and medical imaging, in addition to the more demanding use cases of recommendation systems and the large language models (LLMs) used in generative AI. Overall, the results continue NVIDIA’s record of demonstrating performance leadership in AI training and inference in every round since the launch of the MLPerf benchmarks in 2018.

via NVIDIA

The NVIDIA L4 GPU, which is based on the Ada Lovelace GPU architecture, also made a strong entry in MLPerf v3.1. It was not only able to run all workloads but did so very efficiently, running up to 6x faster than modern x86 CPUs (dual-socket Intel Xeon Platinum 8380) at a 72W TDP in an FHFL form factor. The L4 GPU also offered a 120x increase in video/AI tasks such as decoding, inferencing, and encoding. Lastly, the NVIDIA Jetson Orin got an up to 84% performance boost thanks to software updates, which shows NVIDIA's commitment to continually improving its software stack.

Written by Hassan Mujtaba

Wccftech
