EN DE

Llamafile

Distribute and run LLMs as single-file executables — no installation needed

★ 25,099 GitHub Apache-2.0 llminferencesingle-filellama-cppmozillaself-hostedlocal-llm LLM & Chat

Overview

Llamafile is a groundbreaking Mozilla project that collapses all the complexity of LLMs into a single-file executable. Built on llama.cpp and Cosmopolitan Libc, it lets you run powerful language models on virtually any operating system (macOS, Linux, Windows, FreeBSD) and CPU architecture without any installation or dependencies. Just download, make executable, and run. With 25k+ GitHub stars, it also includes whisperfile for single-file speech-to-text. Llamafile supports a wide range of open models including Llama, Mistral, Qwen, and more, making local LLM inference truly accessible to everyone.

Requirements

Min vCPU

Min RAM

1024 MB

Min Disk

10 GB

Rec vCPU

Rec RAM

4096 MB

Rec Disk

20 GB

Recommended VPS

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50

View plan

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50

View plan

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50

View plan

Affiliate disclosure

Docker Compose

# Generated by Run This Ai — docker-compose.yml
services:
  llamafile:
    image: ghcr.io/mozilla-ai/llamafile:latest
    restart: unless-stopped
    ports:
      - 8080:8080
    volumes:
      - ./data/llamafile:/data

Best VPS for Llamafile →