GPU Acceleration - Project N.O.M.A.D.

Why GPU acceleration matters

Running AI models on CPU alone is functional but slow. With a compatible NVIDIA GPU, inference speed improves dramatically:

Setup	Typical speed
CPU only	10–15 tokens/second
NVIDIA GPU (accelerated)	100+ tokens/second

That is roughly a 10x improvement or better: the difference between waiting 30 seconds for a response and getting one in under 3 seconds.
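The arithmetic behind that comparison can be sketched as follows. The 300-token response length is an assumption chosen for illustration, not a N.O.M.A.D. figure; the throughput values are the low end of the CPU range and the GPU floor from the table above.

```python
def response_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to generate `tokens` tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Assumed response length, for illustration only.
RESPONSE_TOKENS = 300

cpu_seconds = response_seconds(RESPONSE_TOKENS, 10)   # CPU only, low end
gpu_seconds = response_seconds(RESPONSE_TOKENS, 100)  # NVIDIA GPU, floor

print(f"CPU: {cpu_seconds:.0f} s, GPU: {gpu_seconds:.0f} s, "
      f"speedup: {cpu_seconds / gpu_seconds:.0f}x")
# prints "CPU: 30 s, GPU: 3 s, speedup: 10x"
```

A GPU that sustains more than 100 tokens/second pushes the ratio higher.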

Requirements

A server with a compatible NVIDIA GPU installed
The NVIDIA Container Toolkit installed on the host OS

N.O.M.A.D. detects GPU availability automatically when the Container Toolkit is present. You do not need to configure anything manually beyond the toolkit installation.

AMD GPUs are not currently supported for GPU-accelerated AI inference in N.O.M.A.D.

Setup

Install an NVIDIA GPU

Install a compatible NVIDIA GPU in your server if one is not already present. Any modern NVIDIA GPU with sufficient VRAM will work. More VRAM allows you to run larger models. For build recommendations at different price points, see the Hardware Guide.

Install the NVIDIA Container Toolkit

Follow the official installation guide for the NVIDIA Container Toolkit on your host OS: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

This toolkit allows Docker containers to access the GPU. Without it, containers (including the AI Assistant) cannot use the GPU even if one is present.
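Before reinstalling the AI Assistant, you may want to confirm that the host sees the GPU and that Docker knows about the NVIDIA runtime. The sketch below is one way to do that from the host; it is not N.O.M.A.D.'s own detection logic. It assumes `nvidia-smi` and `docker` are on the PATH and that the toolkit registered a Docker runtime named `nvidia`, which is what `nvidia-ctk runtime configure` does in the official guide. A missing entry is a hint, not proof, that the toolkit is absent.

```python
import json
import shutil
import subprocess
from typing import Optional


def parse_gpus(nvidia_smi_csv: str) -> list:
    """Parse `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`."""
    gpus = []
    for line in nvidia_smi_csv.strip().splitlines():
        name, memory = (field.strip() for field in line.split(",", 1))
        gpus.append({"name": name, "memory_total": memory})
    return gpus


def has_nvidia_runtime(docker_runtimes_json: str) -> bool:
    """Check the output of `docker info --format '{{json .Runtimes}}'`."""
    return "nvidia" in json.loads(docker_runtimes_json)


def run(cmd: list) -> Optional[str]:
    """Return stdout, or None if the command is missing or exits non-zero."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    smi = run(["nvidia-smi", "--query-gpu=name,memory.total",
               "--format=csv,noheader"])
    runtimes = run(["docker", "info", "--format", "{{json .Runtimes}}"])
    print("GPUs visible on host:",
          parse_gpus(smi) if smi else "none found")
    print("Docker nvidia runtime:",
          has_nvidia_runtime(runtimes) if runtimes else "unknown")
```

If the first line lists your GPU and the second prints `True`, the host side is likely ready for the reinstall step.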

Force reinstall the AI Assistant

N.O.M.A.D. detects the GPU during the AI Assistant installation process. If the AI Assistant was already installed before you set up the toolkit, you need to reinstall it:

Go to Settings → Apps
Find AI Assistant
Click Force Reinstall

During reinstall, N.O.M.A.D. detects the NVIDIA Container Toolkit and configures the AI container with GPU support. You will see “NVIDIA container runtime detected” in the installation progress log.

Force Reinstall preserves your downloaded AI models. You will not need to re-download them.

After upgrading or swapping your GPU

If you add a new GPU or swap to a different GPU after the initial AI Assistant install, the AI container still has its old configuration and will continue running on CPU only. The fix is the same: Force Reinstall the AI Assistant from Settings → Apps. This recreates the container with the correct GPU configuration for your current hardware.

“GPU passthrough not working” warning

If you see a “GPU passthrough not working” banner on the System Information or AI Settings pages, it means N.O.M.A.D. detected a GPU on the host but the AI container cannot access it.

This typically happens when:

The NVIDIA Container Toolkit was installed after the AI Assistant
The GPU was added or swapped after the AI Assistant was installed
The container runtime configuration changed

To fix it: Click the “Fix: Reinstall AI Assistant” button shown in the warning banner, or manually go to Settings → Apps → AI Assistant → Force Reinstall. This recreates the container with proper GPU access and preserves your downloaded models.
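If you want to see the mismatch for yourself, one option is to inspect the container's Docker configuration. The sketch below parses the output of `docker inspect --format '{{json .HostConfig}}' <container>` and reports whether the container was created with a GPU request. How N.O.M.A.D. wires up the GPU internally is not documented here, so the sketch checks both common mechanisms (the `nvidia` runtime and a `gpu` device request); the container name is yours to look up, and the `diagnose` helper is hypothetical, written to mirror the warning described above.

```python
import json


def container_requests_gpu(host_config_json: str) -> bool:
    """True if a container's HostConfig asks Docker for GPU access."""
    host_config = json.loads(host_config_json)
    # Mechanism 1: container started with the NVIDIA runtime.
    if host_config.get("Runtime") == "nvidia":
        return True
    # Mechanism 2: container started with a GPU device request (--gpus).
    for request in host_config.get("DeviceRequests") or []:
        for capability_set in request.get("Capabilities") or []:
            if "gpu" in capability_set:
                return True
    return False


def diagnose(host_has_gpu: bool, container_has_gpu: bool) -> str:
    """Hypothetical helper that mirrors the banner's condition."""
    if host_has_gpu and not container_has_gpu:
        return "GPU on host but not in container: Force Reinstall the AI Assistant"
    if host_has_gpu:
        return "GPU passthrough configured"
    return "No GPU detected on host: running on CPU"


# A container created before the toolkit was installed typically looks like this.
stale = '{"Runtime": "runc", "DeviceRequests": null}'
print(diagnose(host_has_gpu=True,
               container_has_gpu=container_requests_gpu(stale)))
```

This only inspects how the container was configured. A container can request a GPU and still fail to reach it if the runtime configuration changed afterwards, in which case Force Reinstall remains the fix.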

Measuring the improvement

Run the System Benchmark before and after enabling GPU acceleration to measure the performance difference:

Go to Settings → System Benchmark
Run a Full Benchmark before making any changes
Enable GPU acceleration (steps above)
Run the Full Benchmark again
Compare your NOMAD Score and tokens-per-second results

You can also submit your score to the community leaderboard to see how your build compares.
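If you run the benchmark more than once on each side, comparing medians keeps a single slow or fast run from skewing the result. A minimal sketch follows; the tokens-per-second values are placeholders to be replaced with the figures from your own runs, and `median_speedup` is an illustrative helper, not part of N.O.M.A.D.

```python
from statistics import median


def median_speedup(before_tps: list, after_tps: list) -> float:
    """Ratio of median tokens/second after the change to before it."""
    if not before_tps or not after_tps:
        raise ValueError("need at least one measurement on each side")
    baseline = median(before_tps)
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return median(after_tps) / baseline


# Placeholder measurements: substitute your own benchmark results.
cpu_runs = [11.8, 12.1, 12.4]
gpu_runs = [104.0, 108.9, 110.2]

print(f"{median_speedup(cpu_runs, gpu_runs):.1f}x faster with GPU acceleration")
```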
