
April 2026

AI Hosting Is Broken — So We Built a Gateway Instead

4 min read · infrastructure

Everyone is deploying AI agents like they're web apps. Spinning up EC2 instances, configuring Kubernetes pods, paying for containers that sit idle 23 hours a day. And nobody's asking the obvious question: why?

The average person uses their AI agent — whether it's a custom GPT, a LangGraph workflow, or a CrewAI bot — for about 20 minutes a week. Twenty minutes. The other 10,060 minutes, your $49/month server is doing absolutely nothing. That's a 99.8% idle rate, and you're paying for every second of it.
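The back-of-envelope math checks out:

```python
# Idle-rate math for an agent that's active 20 minutes a week.
MINUTES_PER_WEEK = 7 * 24 * 60          # 10,080 minutes in a week
ACTIVE_MINUTES = 20                     # typical weekly agent usage

idle_minutes = MINUTES_PER_WEEK - ACTIVE_MINUTES
idle_rate = idle_minutes / MINUTES_PER_WEEK

print(f"{idle_minutes} idle minutes -> {idle_rate:.1%} idle")
# 10060 idle minutes -> 99.8% idle
```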

That's the problem we started with when we built Maritime.

The Mismatch

Traditional cloud hosting was designed for web servers — processes that need to be alive constantly, handling a steady stream of requests. That model made perfect sense for HTTP APIs and web apps.

AI agents are nothing like that. They're bursty. Someone asks their agent a question, the agent thinks for 30 seconds, calls a few tools, returns a response, and then... silence. Maybe for hours. Maybe for days. The usage pattern is fundamentally different, but we're cramming it into infrastructure built for a completely different workload.

The result is predictable:

cold starts kill agents — Lambda functions time out during extended reasoning. your agent is mid-thought and the container gets recycled.

state disappears — serverless is stateless by design. agents need memory, conversation context, tool state. every restart is amnesia.

costs make no sense — per-request pricing punishes agent loops. a single task that takes 15 tool calls costs 15x what a simple API endpoint would. always-on VMs charge you for dead air.

setup is absurd — Dockerfile, cloud config, domain, SSL, CI/CD pipeline. just to run a Python script that answers questions about your calendar.

The Insight

The realization was simple: you don't need to host an agent. You need a gateway to one.

Think about it. If your agent is idle 99% of the time, keeping it running is like leaving your car engine on in the driveway because you might need to drive to the store later. What you actually need is a system that can start the car instantly when you need it and turn it off the moment you're done.

The best infrastructure for something that runs 20 minutes a week isn't a server. It's a doorbell with a very fast wake-up.

That's the core architecture of Maritime: a stable HTTPS endpoint (the gateway) that sits in front of your agent container. When a request comes in, the gateway wakes your agent in milliseconds, routes the request, keeps the container alive for the duration of the interaction, and puts it back to sleep when it's done. Your agent has a permanent address, but it only consumes resources when it's actually thinking.
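The wake/route/sleep flow can be sketched in a few lines of Python. This is a toy illustration of the pattern, not Maritime's actual implementation; the class names and the idle timeout are invented for the example.

```python
import time

class AgentContainer:
    """Toy stand-in for an agent container (hypothetical API)."""
    def __init__(self):
        self.awake = False
        self.state = {}          # survives sleep cycles

    def wake(self):
        self.awake = True        # resume, not reboot: state is intact

    def sleep(self):
        self.awake = False       # state is preserved, compute stops

    def handle(self, request):
        return f"handled {request!r} with context {self.state}"

class Gateway:
    """Stable endpoint that wakes the container only for live requests."""
    def __init__(self, container, idle_timeout=30.0):
        self.container = container
        self.idle_timeout = idle_timeout
        self.last_seen = 0.0

    def route(self, request):
        if not self.container.awake:
            self.container.wake()            # wake on demand
        self.last_seen = time.monotonic()
        return self.container.handle(request)

    def tick(self):
        # Called periodically: put the agent back to sleep after idle_timeout.
        if self.container.awake and time.monotonic() - self.last_seen > self.idle_timeout:
            self.container.sleep()

gw = Gateway(AgentContainer())
print(gw.route("what's on my calendar?"))
```

The agent's address (the `Gateway`) is permanent; the expensive part (the container) only exists while a request is in flight.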

Sleep/Wake Over Always-On

The sleep/wake model changes the economics entirely. Instead of paying for a server that's alive 24/7, you pay $1/month for the gateway and only burn compute when your agent is actually doing work. Not idle. Not waiting. Working.
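The economics are easy to model. The $49 VM and $1 gateway fee come from this post; the per-minute compute rate below is a made-up placeholder, not Maritime pricing.

```python
# Rough monthly cost comparison: always-on VM vs. gateway + metered compute.

def always_on_cost(vm_per_month: float = 49.0) -> float:
    return vm_per_month

def gateway_cost(active_min_per_week: float = 20,
                 compute_per_min: float = 0.002,   # hypothetical rate
                 gateway_fee: float = 1.0) -> float:
    weeks_per_month = 365.25 / 7 / 12               # ~4.35
    return gateway_fee + active_min_per_week * weeks_per_month * compute_per_min

print(f"always-on: ${always_on_cost():.2f}, gateway: ${gateway_cost():.2f}")
# always-on: $49.00, gateway: $1.17
```

At 20 active minutes a week, almost the entire bill is the flat gateway fee; the metered compute is noise.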

But the harder problem was making the wake-up fast enough that it doesn't feel like a cold start. If your agent takes 5 seconds to boot, that's 5 seconds of dead air before it can respond to a webhook or an API call. That's unacceptable.

So we built the container layer to preserve state across sleep cycles. Your agent goes dormant with its memory, its loaded models, its context — all intact. When it wakes, it's not booting from scratch. It's resuming. The difference between a cold start and a resume is the difference between starting your car and waking it from sleep mode.

What This Actually Looks Like

You push a GitHub repo. Maritime detects your agent framework — CrewAI, LangGraph, OpenAI Agents SDK, bare FastAPI, whatever — builds the container, gives you an endpoint. That's it. No Dockerfile required (though we support them). No YAML. No cloud console.

stable endpoint — your agent gets a permanent URL. point webhooks at it, call it from other agents, integrate it anywhere.

encrypted secrets — API keys injected at runtime through the dashboard. no .env files in your repo.

stateful containers — context persists across requests. your agent remembers the conversation it was having.

live logs — watch your agent think in real time. see every tool call, every decision.
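Integrating with the stable endpoint is then just an HTTP call from anywhere. A sketch using only the standard library; the URL and request shape are placeholders, not a documented Maritime API:

```python
import json
import urllib.request

AGENT_URL = "https://my-agent.example.maritime.sh/ask"   # placeholder URL

def call_agent(question: str) -> dict:
    """POST a question to the agent's permanent endpoint."""
    req = urllib.request.Request(
        AGENT_URL,
        data=json.dumps({"question": question}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:   # the gateway wakes the agent here
        return json.load(resp)

# call_agent("what's on my calendar today?")   # requires a live deployment
```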

The point isn't to be another hosting platform. The point is to be the right kind of hosting platform — one that matches how agents actually behave instead of forcing them into infrastructure designed for a different era.

Why This Matters Now

We're at a weird inflection point. The frameworks for building agents are maturing fast — CrewAI, LangGraph, the Agents SDK — but the infrastructure for running them in production is still stuck in 2020. People are building sophisticated multi-agent systems and then deploying them on the same infrastructure they'd use for a to-do app.

The gap between "works on my laptop" and "runs in production" shouldn't require a DevOps engineer. Especially not for something that's active 20 minutes a week.

We're in private beta at maritime.sh. If the idea of paying $1/month instead of $49 for an agent that sleeps 99% of the time sounds reasonable to you, come check it out.
