llama.cpp
LLM inference in C/C++
Überblick
High-performance LLaMA inference on CPU and GPU, with a built-in server and Python bindings.
Anforderungen
Min vCPU
2
Min RAM
4096 MB
Min Disk
20 GB
Rec vCPU
4
Rec RAM
8192 MB
Rec Disk
40 GB
Empfohlener VPS
Contabo · VPS S
4 vCPU · 8192 MB · 100 GB
Contabo · VPS S
4 vCPU · 8192 MB · 100 GB
Contabo · VPS S
4 vCPU · 8192 MB · 100 GB
Affiliate-Hinweis
Docker Compose
# Generated by Run This Ai — docker-compose.yml
services:
llama-cpp:
image: ghcr.io/ggerganov/llama.cpp:server
restart: unless-stopped
ports:
- 8080:8080
volumes:
- ./data/llama-cpp:/data