Nvidia’s “Chat With RTX” is a ChatGPT-style app that runs on your own GPU

On Tuesday, Nvidia released Chat With RTX, a free personalized AI chatbot similar to ChatGPT that can run locally on a PC with an Nvidia RTX graphics card. It uses Mistral or Llama open-weights LLMs and can search through local files and answer questions about them.

Chat With RTX works on Windows PCs equipped with Nvidia GeForce RTX 30 or 40 Series GPUs with at least 8GB of VRAM. It uses a combination of retrieval-augmented generation (RAG), Nvidia TensorRT-LLM software, and RTX acceleration to enable generative AI capabilities directly on users' devices. This setup allows for conversations with the AI model using local files as a dataset.

"Users can quickly, easily connect local files on a PC as a dataset to an open-source large language model like Mistral or Llama 2, enabling queries for quick, contextually relevant answers," writes Nvidia in a promotional blog post.

A screenshot of Chat With RTX, which runs in a web browser window. (credit: Benj Edwards)

Using Chat With RTX, users can talk about various subjects or ask the AI model to summarize or analyze data, similar to how one might interact with ChatGPT. In particular, the Mistral 7B model has built-in conditioning to avoid certain sensitive topics (like sex and violence, of course), but users could presumably plug in an uncensored AI model and discuss forbidden topics without the paternalism inherent in the censored models.

Also, the application supports a variety of file formats, including .TXT, .PDF, .DOCX, and .XML. Users can direct the tool to browse specific folders, which Chat With RTX then scans to answer queries quickly. It even allows for the incorporation of information from YouTube videos and playlists, offering a way to include external content in its database of knowledge (in the form of embeddings) without requiring an Internet connection to process queries.
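To make the retrieval idea concrete, here is a minimal sketch of how a RAG-style lookup over local text might work. It is not Nvidia's implementation: the toy bag-of-words "embedding," the function names, and the sample file contents are all assumptions made for illustration, and a real system would use a learned embedding model and pass the assembled prompt to an LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. Real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents):
    """Embed every document once, ahead of query time."""
    return [(name, text, embed(text)) for name, text in documents.items()]


def retrieve(index, question, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(name, text) for name, text, _ in ranked[:top_k]]


def build_prompt(question, passages):
    """Prepend the retrieved passages to the question for the language model."""
    context = "\n".join(f"[{name}] {text}" for name, text in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical stand-ins for text extracted from files in a local folder.
documents = {
    "notes.txt": "The quarterly budget meeting is scheduled for Friday at noon.",
    "recipe.txt": "Mix flour, sugar, and butter, then bake for twenty minutes.",
}

index = build_index(documents)
question = "When is the budget meeting?"
passages = retrieve(index, question)
prompt = build_prompt(question, passages)
print(prompt)
```

Because the index is built ahead of time and the similarity search runs locally, answering a query in a setup like this needs no network access; only the final generation step requires the language model.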

Rough around the edges

We downloaded and ran Chat With RTX to test it out. The download file is huge, at around 35 gigabytes, owing to the Mistral and Llama LLM weights files being included in the distribution. ("Weights" are the actual neural network files containing the values that represent data learned during the AI training process.) When installing, Chat With RTX downloads even more files, and it executes in a console window using Python with an interface that pops up in a web browser window.

Several times during our tests on an RTX 3060 with 12GB of VRAM, Chat With RTX crashed. Like open source LLM interfaces, Chat With RTX is a mess of layered dependencies, relying on Python, CUDA, TensorRT, and others. Nvidia hasn't cracked the code for making the installation sleek and non-brittle. It's a rough-around-the-edges solution that feels very much like an Nvidia skin over other local LLM interfaces (such as GPT4ALL). Even so, it's notable that this capability is officially coming directly from Nvidia.

On the bright side (a massive bright side), local processing capability emphasizes user privacy, as sensitive data does not need to be transmitted to cloud-based services (as with ChatGPT). Using Mistral 7B feels about as capable as early 2022-era GPT-3, which is still remarkable for a local LLM running on a consumer GPU. It's not a true ChatGPT replacement yet, and it can't touch GPT-4 Turbo or Google Gemini Pro/Ultra in processing capability.

Nvidia GPU owners can download Chat With RTX for free on the Nvidia website.
