Run This Ai
EN DE

Llamafile

Distribute and run LLMs as single-file executables — no installation needed

★ 25,099 GitHub Apache-2.0 llminferencesingle-filellama-cppmozillaself-hostedlocal-llm LLM & Chat

Overview

Llamafile is a groundbreaking Mozilla project that collapses all the complexity of LLMs into a single-file executable. Built on llama.cpp and Cosmopolitan Libc, it lets you run powerful language models on virtually any operating system (macOS, Linux, Windows, FreeBSD) and CPU architecture without any installation or dependencies. Just download, make executable, and run. With 25k+ GitHub stars, it also includes whisperfile for single-file speech-to-text. Llamafile supports a wide range of open models including Llama, Mistral, Qwen, and more, making local LLM inference truly accessible to everyone.

Requirements

Min vCPU
1
Min RAM
1024 MB
Min Disk
10 GB
Rec vCPU
4
Rec RAM
4096 MB
Rec Disk
20 GB

Recommended VPS

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50
View plan

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50
View plan

Contabo · VPS S

4 vCPU · 8192 MB · 100 GB

$4.50
View plan

Affiliate disclosure

Docker Compose

# Generated by Run This Ai — docker-compose.yml
services:
  llamafile:
    image: ghcr.io/mozilla-ai/llamafile:latest
    restart: unless-stopped
    ports:
      - 8080:8080
    volumes:
      - ./data/llamafile:/data

Best VPS for Llamafile →

Related tools

Guides & articles