Run This Ai
EN DE

llama.cpp

LLM inference in C/C++

Overview

High-performance LLaMA inference on CPU and GPU, with a built-in server and Python bindings.

Requirements

Min vCPU
2
Min RAM
4096 MB
Min Disk
20 GB
Rec vCPU
4
Rec RAM
8192 MB
Rec Disk
40 GB

Recommended VPS

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50
View plan

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50
View plan

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50
View plan

Affiliate disclosure

Docker Compose

# Generated by Run This Ai — docker-compose.yml
services:
  llama-cpp:
    image: ghcr.io/ggerganov/llama.cpp:server
    restart: unless-stopped
    ports:
      - 8080:8080
    volumes:
      - ./data/llama-cpp:/data

Best VPS for llama.cpp →

Related tools

Guides & articles