NVIDIA TensorRT Accelerates Stable Diffusion GenAI For All RTX GPUs, RTX 4090 Up To 7x Faster Than Apple M2 Ultra

NVIDIA's TensorRT updates for RTX GPUs also enable some big performance uplifts to GenAI workloads such as Stable Diffusion.

Stable Diffusion & GenAI Gets Boost Through TensorRT Support on NVIDIA's Gaming & Pro RTX GPUs

We have already detailed how TensorRT-LLM is bringing faster AI capabilities to Windows on RTX hardware, and GenAI is another area where consumers who own an RTX GPU will see a direct benefit.

It's no secret that NVIDIA's GPUs are among the most popular solutions for Stable Diffusion and generative AI workloads. NVIDIA has been ahead of almost everyone in this field, but recent and upcoming CPUs from AMD and Intel have started to include a dedicated NPU that can offload AI tasks from the CPU and GPU and complete the work in a low-power, efficient mode that suits the vast majority of users.

NVIDIA says it welcomes the push to accelerate AI by building AI hardware into CPUs. In its view, NPUs will mostly handle lightweight AI tasks at low power, while the GPU will serve more demanding use cases. Both NPUs and GPUs are local resources that work offline, providing low latency and data locality and privacy, while cloud data centers target heavy AI workloads with very large models and on-demand use. NVIDIA's RTX GPUs are said to offer anywhere from 20x to 100x more performance than these NPUs.

TensorRT acceleration is now available for Stable Diffusion in the popular Web UI distribution by Automatic1111. It speeds up the generative AI diffusion model by up to 2x over the previous fastest implementation.


In a Stable Diffusion performance demonstration, NVIDIA shows the GeForce RTX 4090 running the Automatic1111 Web UI and outputting 27 images per minute with the PyTorch xFormers implementation. Running it with TensorRT nearly doubles the performance to 52 images per minute.

NVIDIA also compares the performance against Apple's M2 Ultra (72 Core Variant), which has a base price of US$5,000. This system outputs only 7 images per minute using the Core ML model. Meanwhile, you can build a very high-end system with two GeForce RTX 4090 GPUs for the same budget.
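The headline ratios follow directly from the throughput figures quoted above. As a quick sanity check, the short Python sketch below recomputes them. The constant and function names are illustrative only, and the inputs are NVIDIA's own demonstration numbers as reported here, not independent measurements.

```python
# Throughput figures quoted in the article, in images per minute.
RTX_4090_XFORMERS = 27  # GeForce RTX 4090, PyTorch xFormers
RTX_4090_TENSORRT = 52  # GeForce RTX 4090, TensorRT
M2_ULTRA_COREML = 7     # Apple M2 Ultra, Core ML


def speedup(faster: float, slower: float) -> float:
    """Return how many times higher the `faster` throughput is than `slower`."""
    if slower <= 0:
        raise ValueError("slower throughput must be positive")
    return faster / slower


tensorrt_vs_xformers = speedup(RTX_4090_TENSORRT, RTX_4090_XFORMERS)
tensorrt_vs_m2_ultra = speedup(RTX_4090_TENSORRT, M2_ULTRA_COREML)

print(f"TensorRT vs xFormers on RTX 4090: {tensorrt_vs_xformers:.2f}x")
print(f"RTX 4090 (TensorRT) vs M2 Ultra (Core ML): {tensorrt_vs_m2_ultra:.2f}x")
```

On these numbers, TensorRT comes out at roughly 1.9x the xFormers result on the same card, and the RTX 4090 at roughly 7.4x the M2 Ultra, which is consistent with the "up to 2x" and "up to 7x" claims.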

The company has announced that TensorRT support is now available for the Automatic1111 Web UI and can be downloaded from GitHub.com/NVIDIA.

Written by Hassan Mujtaba

Wccftech
