a2go helps you run open-source AI models on your own hardware — locally, on a cloud GPU, or on a Mac.
- There are dozens of open-source models (LLMs, image generation, audio/TTS) in different quantizations. Figuring out which ones actually fit your GPU, how much VRAM they really need, and what performance you'll get is a pain: the information is scattered across Hugging Face model cards, Reddit, GitHub issues, and trial and error.
- a2go bundles all of that into a web configurator. Pick your GPU and it shows you exactly what fits, with real VRAM breakdowns (weights + KV cache + overhead, not just the file size), tokens-per-second benchmarks measured on actual hardware, and a live memory gauge that updates as you combine models.
- Every model in the registry has been tested on real GPUs. The numbers aren't theoretical; they account for things model cards never mention, like compute graph buffers and runtime overhead.
- When you're done picking, it generates a copy-paste deploy command: Docker for Linux/Windows/Runpod, MLX commands for Mac. No config files to write, no flags to guess.
- It supports multi-model setups (run an LLM, image generation, and audio on the same GPU and see if it all fits), multi-GPU splitting, platform-aware variants (GGUF for NVIDIA, MLX for Apple Silicon), and context-length sliders that show the real memory cost.
- The whole point: we already tested all of this so you don't have to. Stop downloading models that don't fit, stop guessing at flags, just pick and deploy.
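The "weights + KV cache + overhead" breakdown above can be approximated by hand. Below is a minimal sketch of the arithmetic — the function name and all the example figures (quantization bytes-per-weight, layer counts, overhead) are hypothetical illustrations, not a2go's actual registry data:

```python
def estimate_vram_gb(params_b: float, bytes_per_weight: float,
                     layers: int, kv_heads: int, head_dim: int,
                     context: int, overhead_gb: float = 1.0) -> float:
    """Rough VRAM estimate: weights + KV cache + fixed runtime overhead.

    The KV cache stores one key and one value vector per layer per token:
    2 * layers * kv_heads * head_dim * context * 2 bytes (fp16).
    """
    weights_gb = params_b * bytes_per_weight  # params_b = billions of params
    kv_gb = 2 * layers * kv_heads * head_dim * context * 2 / 1e9
    return weights_gb + kv_gb + overhead_gb

# Example: a 7B model at ~0.56 bytes/weight (roughly Q4), 32 layers,
# 8 KV heads of dim 128, 8k context -- all hypothetical figures.
print(round(estimate_vram_gb(7, 0.56, 32, 8, 128, 8192), 2))
```

This is why the file size alone undersells the footprint: the KV cache grows linearly with context length, which is what the context-length slider is surfacing.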
Image: runpod/a2go:latest (~7 GB compressed)
- Pick models at a2go.run — select your GPU and the site shows what fits
- Read the security guide — OpenClaw agents can execute shell commands, read/write files, and fetch URLs on your machine, so understand what you're running: Security Guide
- Deploy — the site generates a ready-to-use command (Docker or MLX)
- Access the UI — `http://localhost:18789/?token=<A2GO_AUTH_TOKEN>`
On Runpod the URL is `https://<pod-id>-18789.proxy.runpod.net/?token=<A2GO_AUTH_TOKEN>`.
First time: approve device pairing when prompted (SSH into the machine, run `openclaw devices list`, then `openclaw devices approve <requestId>`).
The site generates this — or run it directly:
```shell
docker run --gpus all \
  -e A2GO_CONFIG='{"llm":"unsloth/GLM-4.7-Flash-GGUF","audio":"LiquidAI/LFM2.5-Audio-1.5B-GGUF"}' \
  -e A2GO_AUTH_TOKEN=changeme \
  -e A2GO_API_KEY=changeme \
  -p 8000:8000 -p 8080:8080 -p 18789:18789 \
  -v a2go-models:/workspace \
  runpod/a2go:latest
```

Models download on first start and persist on the volume.
| Variable | Description | Default |
|---|---|---|
| `A2GO_CONFIG` | JSON config — models to load | `{}` (auto-detect) |
| `A2GO_AUTH_TOKEN` | Web UI + API auth token | `changeme` |
| `A2GO_API_KEY` | LLM API key (OpenAI-compatible endpoint) | `changeme` |
| `TELEGRAM_BOT_TOKEN` | Enable Telegram bot integration | — |
| `GITHUB_TOKEN` | GitHub auth for Claude Code | — |
Model names are case-insensitive. Use HuggingFace repo names or short IDs.
When A2GO_CONFIG is {} (the default), the container reads your GPU's VRAM via nvidia-smi and picks a default LLM that fits automatically. Useful when you just want something running without choosing.
| Port | Service |
|---|---|
| 8000/http | LLM API (OpenAI-compatible) |
| 8001/http | Media server (image gen, TTS — internal) |
| 8080/http | Media proxy + web UI |
| 18789/http | OpenClaw control UI + chat |
| 22/tcp | SSH |
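Since the LLM API on port 8000 is OpenAI-compatible, any OpenAI-style client should work against it. A minimal standard-library sketch is below — the `/v1/chat/completions` path and `model` field are assumed to follow the standard OpenAI schema rather than confirmed a2go specifics, and the network call is wrapped so the snippet degrades gracefully when no server is running:

```python
import json
import urllib.request

def chat_request(prompt: str, api_key: str = "changeme",
                 base: str = "http://localhost:8000") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the a2go LLM API."""
    payload = {
        "model": "default",  # assumption: the server routes to the loaded LLM
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

req = chat_request("Say hello in one word.")
print(req.full_url)
try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
except OSError as exc:
    print(f"server not reachable: {exc}")
```

The `Authorization` header carries `A2GO_API_KEY`; the web UI on port 18789 uses `A2GO_AUTH_TOKEN` instead.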
```shell
a2go models                                 # List available models
a2go fit                                    # Show what fits on this GPU
a2go presets                                # List preset profiles
a2go registry status                        # Registry source + cache info
a2go tool image-generate --prompt "A cat"   # Generate image
a2go tool text-to-speech "Hello world"      # Text to speech
a2go tool speech-to-text audio.wav          # Speech to text
```

`/workspace/` is persistent storage that survives pod restarts. All models and config live here.
| Path | Purpose |
|---|---|
| `/workspace/.openclaw/openclaw.json` | Main config — auto-generated on first boot, editable |
| `/workspace/openclaw/IDENTITY.md` | Agent identity — create your own to customize personality |
| `/workspace/openclaw/AGENTS.md` | Agent instructions & skills — create your own to add capabilities |
The entrypoint only generates openclaw.json if it doesn't exist, so your edits are safe across restarts.
On a Mac, you don't use Docker. Instead, you run model servers natively using Apple's MLX framework, which is optimized for Apple Silicon.
Select macOS on a2go.run and the site generates the exact commands for your selected models. Here's what the flow looks like:
```shell
# 1. Create a virtual environment
python3 -m venv ~/.a2go/venv
source ~/.a2go/venv/bin/activate

# 2. Install engines (the site tells you which ones you need)
pip install mlx-lm     # for LLM models
pip install mlx-audio  # for audio models

# 3. Start servers (each in a separate terminal)
python -m mlx_lm.server --model <repo> --host 0.0.0.0 --port 8000
python -m mlx_audio.server --host 0.0.0.0 --port 8001
```

This only starts the model servers. To connect OpenClaw, you need a config file that tells the agent framework where to find them. The site generates this for you — save it as `~/.openclaw/openclaw.json`.
For installing the agent framework itself (OpenClaw gateway + UI), see the OpenClaw install docs.
Not all models have MLX variants — the site will tell you which ones do.
- a2go.run — model configurator
- Security Guide — trust model, access control, hardening
- OpenClaw — the agent framework
- Runpod — GPU cloud
- Contributing models
