For ComfyUI
ComfyUI on Edge,
image generation on your GPUs
Self-host the most flexible diffusion runtime on Edge GPU instances. SDXL, FLUX, ControlNets, custom workflows — checkpoints in S3, outputs delivered through the global CDN.
# Provision a GPU VM
$ edge compute create \
--image ubuntu-24-04-cuda --plan gpu-l4 \
--script ./bootstrap-comfy.sh
# Mount Edge Storage as models/
$ s3fs models /opt/comfy/models \
-o url=https://storage.edge.run
# Run headless
$ python main.py --listen 0.0.0.0 --port 8188
# Trigger a workflow from your app
POST https://comfy.example.com/prompt
{ "prompt": { ...workflow JSON... } }
Why teams self-host ComfyUI
Maximum flexibility per GPU dollar, with no per-image bills.
Any diffusion model
SDXL, FLUX, SD3, AnimateDiff, video models — if it has a checkpoint, ComfyUI runs it. The most flexible diffusion runtime.
GPU VMs from L4 to H100
Pick the right card for your model. SDXL flies on an L4; FLUX wants an A10/A100; video models want H100. Sized to fit your workflow.
Checkpoints + LoRAs in S3
Mount Edge Storage as the `models/` directory. Checkpoints, LoRAs, ControlNets, VAEs all centrally stored — share across multiple GPU VMs.
Headless API for production
ComfyUI exposes a REST API for triggering workflows. Pair with a small Node/Python service for queue management and you have a production image-gen pipeline.
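As a minimal sketch of such a service, the wrapper below queues a workflow against ComfyUI's stock `/prompt` endpoint. The base URL, the `client_id` value, and the absence of retries/auth are assumptions for illustration; the `{"prompt": …}` envelope and the `prompt_id` in the response are how stock ComfyUI behaves.

```python
import json
import urllib.request

class ComfyClient:
    """Minimal sketch of a production wrapper around ComfyUI's REST API.
    Base URL is an assumption; error handling and retries are elided."""

    def __init__(self, base="http://127.0.0.1:8188", opener=urllib.request.urlopen):
        self.base = base
        self._open = opener  # injectable transport, handy for testing

    def envelope(self, workflow: dict, client_id: str = "app") -> bytes:
        # /prompt expects the API-format workflow nested under "prompt"
        return json.dumps({"prompt": workflow, "client_id": client_id}).encode()

    def queue(self, workflow: dict) -> str:
        req = urllib.request.Request(
            f"{self.base}/prompt",
            data=self.envelope(workflow),
            headers={"Content-Type": "application/json"},
        )
        with self._open(req) as resp:
            # ComfyUI responds with the queued job's prompt_id
            return json.loads(resp.read())["prompt_id"]
```

The workflow dict is the API-format JSON you export from the ComfyUI editor ("Save (API Format)"), not the UI-format save.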
Custom nodes welcome
The ComfyUI ecosystem is huge — IPAdapter, ControlNet, AnimateDiff, custom samplers. All install via `git clone` into `custom_nodes/`.
Per-image bills, gone
Replicate, fal.ai, Together charge per image. An Edge GPU VM running ComfyUI is one fixed monthly fee — generate millions of images for the same bill.
Reference architecture
How ComfyUI maps to Edge
GPU VM(s) for inference, S3 for shared model weights, CDN for delivery of finished images. Add a small queue service for production volume.
ComfyUI on a GPU VM, scaled horizontally for throughput
S3-compatible bucket for checkpoints, LoRAs, outputs
Serves generated images globally with image optimisation
On-the-fly resize/format conversion of generated outputs
Anycast DNS for `comfy.example.com` (admin) and `cdn.example.com` (outputs)
Indicative cost
~50k SDXL generations / month
Edge wins decisively once the per-image fees at your volume exceed the GPU's flat monthly cost.
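The break-even point is simple arithmetic. The prices below are hypothetical placeholders, not Edge or competitor pricing; substitute real quotes for your GPU plan and the per-image API you are comparing against.

```python
# Break-even sketch with HYPOTHETICAL prices -- substitute real quotes.
gpu_monthly = 600.00  # assumed flat monthly cost of one L4-class VM, USD
per_image = 0.012     # assumed per-image price on a hosted API, USD

break_even_images = gpu_monthly / per_image
print(f"break-even at {break_even_images:,.0f} images/month")
```

With these placeholder numbers the fixed VM pays for itself at 50,000 images a month; everything above that is effectively free compared to per-image billing.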
Common questions
How does this compare to Replicate / fal.ai?
Cheaper per image at scale, more flexible (any custom workflow you can build in the UI), and your prompts/outputs stay private. Trade-off: you handle scaling and queue management — but for steady-volume workloads it pays back fast.
How do I expose it as an API to my app?
Run ComfyUI in `--listen` mode behind the Edge CDN. Your app POSTs workflow JSON to `/prompt`; ComfyUI returns a job ID; poll for completion or use the WebSocket. Lots of community wrappers exist.
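The poll-for-completion step can be sketched as below. `GET /history/<prompt_id>` returning an entry with `outputs` once the job finishes is stock ComfyUI behaviour; the base URL, poll interval, and timeout are assumptions, and the `/ws` WebSocket is the push-based alternative.

```python
import json
import time
import urllib.request

def extract_outputs(history: dict, prompt_id: str):
    """Return the job's outputs if it appears in /history, else None."""
    entry = history.get(prompt_id)
    return entry["outputs"] if entry else None

def wait_for_outputs(prompt_id: str,
                     base: str = "http://127.0.0.1:8188",
                     poll_s: float = 1.0,
                     timeout_s: float = 300.0) -> dict:
    """Poll GET /history/<prompt_id> until ComfyUI records the finished job."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{base}/history/{prompt_id}") as resp:
            history = json.loads(resp.read())
        outputs = extract_outputs(history, prompt_id)
        if outputs is not None:
            return outputs
        time.sleep(poll_s)
    raise TimeoutError(f"prompt {prompt_id} did not finish in {timeout_s}s")
```

The returned `outputs` dict maps node IDs to their results, including the filenames of saved images.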
How do I share models across multiple GPUs?
Mount Edge Storage as the `models/` directory on every ComfyUI VM. Adding a new GPU is then just spinning up a VM — no model copying.
What about NSFW / safety filters?
ComfyUI doesn't enforce filters by default — that's your call as the operator. If you're building a public product, add a moderation step (CLIP-based classifier or external API) into the workflow.
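The gating shape is simple regardless of which classifier you pick. In this sketch, `is_allowed` is a stand-in for whatever check you choose (CLIP-based classifier or external moderation API), and `publish` stands in for your upload-to-CDN step; both are assumptions, not ComfyUI APIs.

```python
from typing import Callable, Optional

def safe_publish(image_bytes: bytes,
                 is_allowed: Callable[[bytes], bool],
                 publish: Callable[[bytes], str]) -> Optional[str]:
    """Run every generated image through a moderation check before it
    reaches the public bucket/CDN; reject by returning None."""
    if not is_allowed(image_bytes):
        return None  # rejected images never reach the public bucket
    return publish(image_bytes)
```

The point of the indirection is that the moderation step sits between generation and delivery, so nothing unchecked is ever publicly addressable.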
By Stack
Other stacks on Edge
Generate images on your terms
30-day trial. Our compute team can size the right GPU for your workflow.