For ComfyUI
ComfyUI on Edge,
image generation on your GPUs
Self-host the most flexible diffusion runtime on Edge GPU instances. SDXL, FLUX, ControlNets, custom workflows — checkpoints in S3, outputs delivered through the global CDN.
# Provision a GPU VM
$ edge compute create \
--image ubuntu-24-04-cuda --plan gpu-l4 \
--script ./bootstrap-comfy.sh
# Mount Edge Storage as models/
$ s3fs models /opt/comfy/models \
-o url=https://storage.edge.run
# Run headless
$ python main.py --listen 0.0.0.0 --port 8188
# Trigger a workflow from your app
POST https://comfy.example.com/prompt
{ "prompt": { ...workflow JSON... } }
Why teams self-host ComfyUI
Maximum flexibility per GPU dollar, with no per-image bills.
Any diffusion model
SDXL, FLUX, SD3, AnimateDiff, video models — if it has a checkpoint, ComfyUI runs it. The most flexible diffusion runtime.
GPU VMs from L4 to H100
Pick the right card for your model. SDXL flies on an L4; FLUX wants an A10/A100; video models want H100. Sized to fit your workflow.
Checkpoints + LoRAs in S3
Mount Edge Storage as the `models/` directory. Checkpoints, LoRAs, ControlNets, VAEs all centrally stored — share across multiple GPU VMs.
Headless API for production
ComfyUI exposes a REST API for triggering workflows. Pair with a small Node/Python service for queue management and you have a production image-gen pipeline.
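As a minimal sketch of such a service, the wrapper below queues a workflow against ComfyUI's stock `/prompt` endpoint. The base URL, the `client_id` value, and the absence of retries/auth are assumptions for illustration; the `{"prompt": …}` envelope and the `prompt_id` in the response are how stock ComfyUI behaves.

```python
import json
import urllib.request

class ComfyClient:
    """Minimal sketch of a production wrapper around ComfyUI's REST API.
    Base URL is an assumption; error handling and retries are elided."""

    def __init__(self, base="http://127.0.0.1:8188", opener=urllib.request.urlopen):
        self.base = base
        self._open = opener  # injectable transport, handy for testing

    def envelope(self, workflow: dict, client_id: str = "app") -> bytes:
        # /prompt expects the API-format workflow nested under "prompt"
        return json.dumps({"prompt": workflow, "client_id": client_id}).encode()

    def queue(self, workflow: dict) -> str:
        req = urllib.request.Request(
            f"{self.base}/prompt",
            data=self.envelope(workflow),
            headers={"Content-Type": "application/json"},
        )
        with self._open(req) as resp:
            # ComfyUI responds with the queued job's prompt_id
            return json.loads(resp.read())["prompt_id"]
```

The workflow dict is the API-format JSON you export from the ComfyUI editor ("Save (API Format)"), not the UI-format save.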
Custom nodes welcome
The ComfyUI ecosystem is huge — IPAdapter, ControlNet, AnimateDiff, custom samplers. All install via `git clone` into `custom_nodes/`.
Per-image bills, gone
Replicate, fal.ai, Together charge per image. An Edge GPU VM running ComfyUI is one fixed monthly fee — generate millions of images for the same bill.
Reference architecture
How ComfyUI maps to Edge
GPU VM(s) for inference, S3 for shared model weights, CDN for delivery of finished images. Add a small queue service for production volume.
ComfyUI on a GPU VM, scaled horizontally for throughput
S3-compatible bucket for checkpoints, LoRAs, outputs
Serves generated images globally with image optimisation
On-the-fly resize/format conversion of generated outputs
Anycast DNS for `comfy.example.com` (admin) and `cdn.example.com` (outputs)
Indicative cost
~50k SDXL generations / month
Edge wins decisively once the per-image fees at your volume exceed the GPU's flat monthly cost.
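The break-even point is simple arithmetic. The prices below are hypothetical placeholders, not Edge or competitor pricing; substitute real quotes for your GPU plan and the per-image API you are comparing against.

```python
# Break-even sketch with HYPOTHETICAL prices -- substitute real quotes.
gpu_monthly = 600.00  # assumed flat monthly cost of one L4-class VM, USD
per_image = 0.012     # assumed per-image price on a hosted API, USD

break_even_images = gpu_monthly / per_image
print(f"break-even at {break_even_images:,.0f} images/month")
```

With these placeholder numbers the fixed VM pays for itself at 50,000 images a month; everything above that is effectively free compared to per-image billing.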
Common questions
How does this compare to Replicate / fal.ai?
Cheaper per image at scale, more flexible (any custom workflow you can build in the UI), and your prompts/outputs stay private. Trade-off: you handle scaling and queue management — but for steady-volume workloads it pays back fast.
How do I expose it as an API to my app?
Run ComfyUI in `--listen` mode behind the Edge CDN. Your app POSTs workflow JSON to `/prompt`; ComfyUI returns a job ID; poll for completion or use the WebSocket. Lots of community wrappers exist.
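The poll-for-completion step can be sketched as below. `GET /history/<prompt_id>` returning an entry with `outputs` once the job finishes is stock ComfyUI behaviour; the base URL, poll interval, and timeout are assumptions, and the `/ws` WebSocket is the push-based alternative.

```python
import json
import time
import urllib.request

def extract_outputs(history: dict, prompt_id: str):
    """Return the job's outputs if it appears in /history, else None."""
    entry = history.get(prompt_id)
    return entry["outputs"] if entry else None

def wait_for_outputs(prompt_id: str,
                     base: str = "http://127.0.0.1:8188",
                     poll_s: float = 1.0,
                     timeout_s: float = 300.0) -> dict:
    """Poll GET /history/<prompt_id> until ComfyUI records the finished job."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{base}/history/{prompt_id}") as resp:
            history = json.loads(resp.read())
        outputs = extract_outputs(history, prompt_id)
        if outputs is not None:
            return outputs
        time.sleep(poll_s)
    raise TimeoutError(f"prompt {prompt_id} did not finish in {timeout_s}s")
```

The returned `outputs` dict maps node IDs to their results, including the filenames of saved images.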
How do I share models across multiple GPUs?
Mount Edge Storage as the `models/` directory on every ComfyUI VM. Adding a new GPU is then just spinning up a VM — no model copying.
What about NSFW / safety filters?
ComfyUI doesn't enforce filters by default — that's your call as the operator. If you're building a public product, add a moderation step (CLIP-based classifier or external API) into the workflow.
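The gating shape is simple regardless of which classifier you pick. In this sketch, `is_allowed` is a stand-in for whatever check you choose (CLIP-based classifier or external moderation API), and `publish` stands in for your upload-to-CDN step; both are assumptions, not ComfyUI APIs.

```python
from typing import Callable, Optional

def safe_publish(image_bytes: bytes,
                 is_allowed: Callable[[bytes], bool],
                 publish: Callable[[bytes], str]) -> Optional[str]:
    """Run every generated image through a moderation check before it
    reaches the public bucket/CDN; reject by returning None."""
    if not is_allowed(image_bytes):
        return None  # rejected images never reach the public bucket
    return publish(image_bytes)
```

The point of the indirection is that the moderation step sits between generation and delivery, so nothing unchecked is ever publicly addressable.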
By Stack
Other stacks on Edge
Generate images on your terms
30-day trial. Our compute team can size the right GPU for your workflow.