Case study · 01 · Live in production

Gateway.

An AI-native platform where engineers design distributed systems on a visual canvas and ship them to real cloud infrastructure with one click.

21+ microservices · One-click cloud deploy · 5 AI agents · Status: live

The problem

Designing distributed systems is disconnected from shipping them.

Engineers whiteboard architectures in Lucid or Excalidraw, then translate them by hand into Terraform, Helm charts, and service scaffolds. Every translation is a chance to drop a dependency, forget a security group, or quietly pick a different region than the diagram implied. The whiteboard and the production system drift apart the moment the first ticket is closed.

There's also no feedback loop during the design itself. The engineer drawing the diagram doesn't know it has a single point of failure, a scaling ceiling, or an unnecessarily expensive component — they find out during code review, in staging, or worst case, in the post-incident retro. The cost of a bad call at whiteboard time compounds through every stage after it.

Gateway closes the gap. Every component on the canvas is a real cloud primitive — an EKS cluster, an RDS instance, an ALB — not a generic shape. AI agents (Guide, Tutor, Reviewer, Generator, Atlas) review the design as it's drawn: they flag bottlenecks, suggest alternatives, and answer questions about the trade-offs. When the design is ready, one click exports deployable Terraform that provisions the exact architecture the canvas described. Whiteboard and production stay in sync, because they're the same artifact.

Interactive demo

Failure cascade simulator.

A mini version of the analysis Gateway runs on real architectures. Click any service to simulate it going down — the simulator walks through the system and shows which other services would go dark with it, plus the percentage of the product affected. Same visual engine that powers Gateway's main canvas.
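The core of the simulator is a graph traversal: start at the failed service and walk outward through everything that depends on it. A minimal sketch in Python — the service names and dependency edges are illustrative, not Gateway's actual graph:

```python
from collections import deque

# Hypothetical dependency graph: edges point from a service to the
# services that depend on it (names are made up for illustration).
DEPENDENTS = {
    "postgres": ["orders", "billing"],
    "orders":   ["checkout"],
    "billing":  ["checkout"],
    "checkout": ["web"],
    "web":      [],
    "search":   ["web"],
}

def simulate_failure(graph: dict[str, list[str]], failed: str) -> set[str]:
    """BFS from the failed node: every transitive dependent goes dark too."""
    dark = {failed}
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, []):
            if dependent not in dark:
                dark.add(dependent)
                queue.append(dependent)
    return dark

def blast_radius(graph: dict[str, list[str]], failed: str) -> float:
    """Fraction of all services affected — the percentage the simulator shows."""
    return len(simulate_failure(graph, failed)) / len(graph)
```

Killing `postgres` in this toy graph takes down five of the six services; killing `search` only takes down itself and `web`.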

Interactive cascade simulator — open on desktop to click nodes and watch failures propagate.
Architecture

How it's built.

Microservices

21+ FastAPI services, gateway-shared lib, RabbitMQ event bus.
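The services talk through topic-routed events rather than direct calls. A toy in-process sketch of that pattern — this stands in for what a RabbitMQ topic exchange does, using fnmatch-style wildcards instead of AMQP's `*`/`#` binding syntax, with made-up routing keys:

```python
import fnmatch
from collections import defaultdict

class ToyEventBus:
    """In-process stand-in for a topic exchange: publishers send events
    to a routing key, subscribers bind handlers with wildcard patterns."""

    def __init__(self):
        self._bindings = defaultdict(list)  # pattern -> list of handlers

    def subscribe(self, pattern: str, handler) -> None:
        self._bindings[pattern].append(handler)

    def publish(self, routing_key: str, payload: dict) -> None:
        # Deliver to every handler whose binding pattern matches the key.
        for pattern, handlers in self._bindings.items():
            if fnmatch.fnmatch(routing_key, pattern):
                for handler in handlers:
                    handler(routing_key, payload)
```

The point of the pattern: the deploy service never knows who is listening, so new consumers (audit log, notifications) attach without touching the publisher.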

AWS EKS + Helm

Helm-managed deploys per service, per environment.

AI layer

LangChain for retrieval + agent routing, pluggable LLM providers.

Observability

OpenTelemetry traces, Prometheus metrics, Grafana dashboards.

The AI layer

Five agents, one brain.

Each agent has a single job, a tight system prompt, and access to the canvas state via tool calls. They share a retrieval layer over the user's architecture + a curated corpus of system-design references.
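The "single job, tight prompt, tools for canvas access" shape can be sketched in a few lines. This is a simplified model, not Gateway's real schema — the canvas fields, tool names, and prompt text are assumptions:

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative canvas state as an agent would see it via tool calls
# (component fields are invented for this sketch).
CANVAS = {
    "components": [
        {"id": "alb", "type": "ALB", "depends_on": []},
        {"id": "api", "type": "EKS", "depends_on": ["alb"]},
        {"id": "db",  "type": "RDS", "depends_on": ["api"]},
    ]
}

@dataclass
class Agent:
    name: str
    system_prompt: str  # one tight prompt per agent, one job each
    tools: dict[str, Callable] = field(default_factory=dict)

    def call_tool(self, tool: str, **kwargs):
        """Dispatch a tool call; the LLM only sees tool names, never raw state."""
        return self.tools[tool](**kwargs)

def get_component(component_id: str) -> dict:
    """Tool: fetch one component from the shared canvas state."""
    return next(c for c in CANVAS["components"] if c["id"] == component_id)

reviewer = Agent(
    name="Reviewer",
    system_prompt="Audit the canvas for single points of failure.",
    tools={"get_component": get_component},
)
```

Keeping state access behind tool calls is what lets the five agents share one canvas without each needing its own serialization of it.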

Guide

Walks users through designing a system, asks clarifying questions about scale and constraints.

Tutor

Explains why a component is there, what it does, and what the trade-offs are.

Reviewer

Audits a finished canvas for bottlenecks, single points of failure, and cost red flags.

Generator

From a natural-language problem statement, generates an initial architecture on the canvas.

Atlas

A chat interface that can pull from all of the above — SSE streaming, grounded in the user's canvas state.

What it taught me

Wearing every hat.

  • Service boundaries and event-driven contracts across 21+ services.
  • Designing LLM agents with guardrails and grounded retrieval.
  • Terraform module design — the export has to produce real, deployable IaC.
  • Helm chart hygiene and EKS deploy pipelines via GitHub Actions.
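One lesson from the Terraform export: Terraform also accepts a JSON configuration syntax (`*.tf.json`), which is much easier to emit programmatically than HCL. A hedged sketch of the canvas-to-IaC direction — the component schema is invented, and a real `aws_db_instance` needs more arguments (credentials, networking) than shown here:

```python
import json

# Illustrative canvas components (not Gateway's real schema).
COMPONENTS = [
    {"id": "app_db", "type": "rds",
     "engine": "postgres", "instance_class": "db.t3.micro"},
]

def to_terraform_json(components: list[dict]) -> str:
    """Emit Terraform JSON syntax; `terraform plan` reads *.tf.json like HCL."""
    resources: dict = {"aws_db_instance": {}}
    for comp in components:
        if comp["type"] == "rds":
            resources["aws_db_instance"][comp["id"]] = {
                "engine": comp["engine"],
                "instance_class": comp["instance_class"],
                "allocated_storage": 20,  # sketch default, not a real sizing
            }
    return json.dumps({"resource": resources}, indent=2)
```

Emitting JSON instead of templated HCL strings sidesteps a whole class of quoting and interpolation bugs in the generator.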
AWS EKS · Terraform · Helm · React + Vite · FastAPI · LangChain · RabbitMQ · Postgres · GitHub Actions