How to Install llama.cpp on Ubuntu 24.04

A step-by-step installation guide.

Jun 27, 2026

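llama.cpp provides high-performance inference for LLaMA-family models on CPU and GPU and includes a built-in HTTP server. Python bindings are available through the separate llama-cpp-python project.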

Prerequisites

At least 2 CPU cores, 4096 MB (4 GB) of RAM, and 20 GB of free disk space. Docker must already be installed on the host.
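The minimums above can be checked from a terminal before pulling anything. The script below is a minimal sketch, assuming a standard Ubuntu system with GNU coreutils and /proc/meminfo available; the check helper and the variable names are illustrative and not part of llama.cpp. It measures free space on the root filesystem, on the assumption that Docker stores its images there (the default is /var/lib/docker).

```shell
#!/usr/bin/env bash
# Minimums taken from the prerequisites in this guide.
MIN_CORES=2
MIN_RAM_MB=4096
MIN_DISK_GB=20

# check NAME ACTUAL REQUIRED
# Prints one status line; returns non-zero when ACTUAL is below REQUIRED.
check() {
  if [ "$2" -ge "$3" ]; then
    echo "ok   $1: $2 (need $3)"
  else
    echo "LOW  $1: $2 (need $3)"
    return 1
  fi
}

# Number of CPU cores available to this shell.
cores=$(nproc)
# Total RAM in MB (MemTotal is reported in kB).
ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
# Free space on / in whole GB (assumes Docker data lives on the root filesystem).
disk_gb=$(df --output=avail -BG / | tail -n 1 | tr -dc '0-9')

status=0
check "CPU cores" "$cores" "$MIN_CORES" || status=1
check "RAM (MB)" "$ram_mb" "$MIN_RAM_MB" || status=1
check "Free disk (GB)" "$disk_gb" "$MIN_DISK_GB" || status=1

if [ "$status" -eq 0 ]; then
  echo "All prerequisites met."
else
  echo "One or more prerequisites are below the minimum."
fi
```

The kernel reserves part of physical memory, so a machine with exactly 4096 MB installed typically reports slightly less and may show LOW for RAM; treat that line as a warning and not a hard failure.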

Install llama.cpp via Docker by pulling the prebuilt server image from the GitHub Container Registry:

docker pull ghcr.io/ggerganov/llama.cpp:server
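Pulling the image only downloads it; the server still has to be started with a model. The function below is a hedged sketch of one way to do that, assuming a GGUF model file already exists in a local directory. The function name run_llama_server, the /models mount point, and the default port 8080 are illustrative choices, not values from this guide.

```shell
#!/usr/bin/env bash
# Sketch: start the pulled server image with a local GGUF model.
# Assumptions (not from this guide): the model directory, the model
# file name, and the default port 8080 are placeholders.

# run_llama_server MODEL_DIR MODEL_FILE [PORT]
run_llama_server() {
  local model_dir="$1" model_file="$2" port="${3:-8080}"
  # Mount the host model directory read-only at /models inside the
  # container and publish the server port on the host.
  docker run --rm \
    -p "${port}:${port}" \
    -v "${model_dir}:/models:ro" \
    ghcr.io/ggerganov/llama.cpp:server \
    -m "/models/${model_file}" \
    --host 0.0.0.0 \
    --port "${port}"
}
```

For example, run_llama_server "$HOME/models" model.gguf would serve a placeholder file named model.gguf on port 8080. From a second terminal, curl http://localhost:8080/health should report whether the server has finished loading the model.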
