
How to Run a Squad of AI Agents for $25/mo

A Hetzner VPS for $5. A MiniMax subscription for $20. Three agents with different roles sharing memory through Convex. Here is the exact setup.

Feb 19, 2026 · LaunchThatBot Team


The most common objection to running AI agents is cost. People assume you need expensive API keys, beefy servers, and a budget that looks like an enterprise line item.

You do not.

Here is a concrete setup that runs three AI agents with different roles and personalities, shared memory across all of them, and a real management dashboard -- for $25 per month total.

No tricks. No asterisks. This is what it actually costs.

The bill

Line item (monthly cost):

  • Hetzner CX22 VPS (2 vCPU, 4 GB RAM, 40 GB SSD) -- $5
  • MiniMax subscription (300 questions every 5 hours) -- $20
  • LaunchThatBot (free tier) -- $0
  • Convex (free tier) -- $0
  • Total -- $25

That is two services with free tiers doing the heavy lifting and two paid services cheap enough to fit a side-project budget.

Step 1: sign up for LaunchThatBot

Create an account on LaunchThatBot. The free tier gives you access to the deployment dashboard, Convex Mode, and agent management. You do not need a paid plan to run what we are building here.

Once you are in, you have a dashboard that will show your servers, deployments, and agent activity. This replaces SSH, log tailing, and the mental overhead of remembering what is running where.

Step 2: deploy a Hetzner server

From the LaunchThatBot dashboard, create a new deployment:

  • Provider: Hetzner
  • Server type: CX22 (2 vCPU, 4 GB RAM)
  • Region: Pick the one closest to you. Nuremberg (nbg1) and Falkenstein (fsn1) are solid defaults for Europe. Ashburn (ash) if you are in the US.
  • Security profile: Baseline hardened

LaunchThatBot handles the rest. It provisions the VPS, installs Docker, deploys OpenClaw in a container with a tested security configuration, sets up the reverse proxy, and configures TLS. You do not touch a terminal.

The CX22 is Hetzner's entry-level VPS at roughly $5/month. Two vCPUs and 4 GB of RAM are more than enough for three OpenClaw agents. These agents are not running inference locally -- they are making API calls to an external LLM provider. The server's job is to run the agent runtime, manage tool execution, and handle message routing. That is lightweight work.

Step 3: enable Convex Mode

When you set up the deployment, enable Convex Mode and connect your Convex project. If you do not have one, create a free account at convex.dev -- the free tier is generous enough for this setup.

Convex Mode gives your agents something most solo-deployed agents do not have: a shared backend.

Here is what matters for a three-agent squad:

Shared memory

All three agents read from and write to the same Convex tables. When Agent A learns something, Agents B and C can access that knowledge immediately. This is not a flat text file or a vector database bolted on after the fact. It is structured, queryable, real-time data.

For example, if your research agent discovers a key fact about a topic, it writes that to a shared memory table. Your writing agent can query that table when drafting content. Your review agent can cross-reference the same data when checking accuracy. All three agents operate with a shared understanding of what the squad knows.
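Convex's real API involves schema definitions and server functions; as a rough, self-contained stand-in for the data flow described above (the table name, field names, and helper functions here are illustrative assumptions, not Convex's actual API), the pattern looks like this:

```typescript
// Illustrative stand-in for a shared "memory" table.
// Field names (agent, kind, content) are assumptions for this sketch.
type MemoryRow = { agent: string; kind: string; content: string; ts: number };

const memory: MemoryRow[] = []; // stands in for a Convex table

// Any agent can write a structured observation...
function write(agent: string, kind: string, content: string): void {
  memory.push({ agent, kind, content, ts: Date.now() });
}

// ...and any other agent can query it immediately.
function query(kind: string): MemoryRow[] {
  return memory.filter((row) => row.kind === kind);
}

// The researcher stores a finding:
write("researcher", "finding", "CX22 has 2 vCPU and 4 GB RAM");

// The creator and reviewer both see it with the same query:
const findings = query("finding");
console.log(findings.length, findings[0].content);
// prints: 1 CX22 has 2 vCPU and 4 GB RAM
```

In the real setup, `write` and `query` would be Convex mutations and queries against a shared table, so the same reads and writes work across three separate agent processes, not just one script.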

Event logging

Every agent action, tool execution, and error gets logged to structured Convex tables. You can see exactly what each agent did, when, and whether it succeeded -- from the LaunchThatBot dashboard or directly in the Convex console. No more grepping log files on a remote server.

Persistence across restarts

If your server restarts (planned maintenance, unexpected reboot, deployment update), your agents' operational state survives. The in-flight context is in Convex, not in local memory. Agents pick up where they left off instead of starting from scratch.

Step 4: sign up for MiniMax

This is where the economics get interesting.

MiniMax offers an LLM subscription at $20/month that gives you 300 questions every 5 hours. That is not 300 questions per month -- it is 300 questions per rolling 5-hour window. Over a full day, that is roughly 1,440 questions. Over a month, tens of thousands.

The critical detail: you use the same MiniMax subscription for all three agents. You are not paying per agent. You are paying for API access, and all three agents share that access through the same API key configured on your OpenClaw deployment.

This is fundamentally different from per-token pricing models like OpenAI or Anthropic, where running three agents means roughly 3x the cost. With MiniMax's subscription model, the cost is flat. Three agents cost the same as one.
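To make the difference concrete, here is a toy comparison of the two pricing models (the per-question cost is an illustrative assumption, not a real OpenAI or Anthropic rate):

```typescript
// Toy comparison: flat subscription vs per-token pricing.
const FLAT_SUBSCRIPTION = 20; // one MiniMax subscription, shared by every agent

function flatCost(_agents: number): number {
  return FLAT_SUBSCRIPTION; // does not grow with agent count
}

// Per-token models scale with usage. centsPerQuestion is an assumed figure.
function perTokenCost(agents: number, questionsPerAgent: number, centsPerQuestion: number): number {
  return (agents * questionsPerAgent * centsPerQuestion) / 100; // dollars
}

console.log(flatCost(3)); // → 20
console.log(perTokenCost(3, 2000, 1)); // → 60
```

The flat curve stays at $20 whether you run one agent or three; the per-token curve triples with the squad.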

Is 300 questions per 5 hours enough?

For most use cases, yes. Here is the math:

  • 300 questions per 5 hours = 60 questions per hour
  • Split across 3 agents = 20 questions per hour per agent on average

Twenty questions per hour is more than enough for agents that are doing research, writing content, monitoring systems, or handling asynchronous tasks. These agents are not chatbots handling rapid-fire conversations. They are workers processing tasks at a sustainable pace.
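The budget math above reduces to a one-line check:

```typescript
// Per-agent hourly question budget under a shared rolling-window limit.
function perAgentPerHour(questionsPerWindow: number, windowHours: number, agents: number): number {
  return questionsPerWindow / windowHours / agents;
}

console.log(perAgentPerHour(300, 5, 3)); // → 20
```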

If your agents hit the rate limit, they queue and wait for the next window. For asynchronous workloads, this is a non-issue. The agents work in bursts, the rate limit refreshes, and the squad keeps moving.
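OpenClaw's actual queueing logic is not shown here; a minimal sketch of the queue-and-wait pattern under a rolling window might look like:

```typescript
// Minimal rolling-window rate limiter sketch (not OpenClaw's implementation).
class WindowLimiter {
  private stamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  // Returns true if a request may go out now; otherwise the caller queues
  // the task and retries once the window rolls forward.
  tryAcquire(now: number): boolean {
    this.stamps = this.stamps.filter((t) => now - t < this.windowMs);
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(now);
    return true;
  }
}

// 300 questions per 5-hour window, shared by the whole squad:
const limiter = new WindowLimiter(300, 5 * 60 * 60 * 1000);

let granted = 0;
for (let i = 0; i < 301; i++) if (limiter.tryAcquire(i)) granted++;
console.log(granted); // → 300 (the 301st request queues for the next window)
```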

Step 5: set up three agents with different roles

Now the fun part. You have a server, a shared backend, and an LLM subscription. Time to define your squad.

Here is an example squad that works well within the $25/month budget:

Agent 1: the researcher

  • Role: Gathers information, monitors sources, collects data
  • Personality: Thorough, methodical, prefers breadth over depth on first pass
  • Tools: Web browsing, RSS monitoring, API queries
  • Shared memory writes: Raw findings, source links, key facts, timestamps

The researcher is your information ingest engine. It watches the sources you care about and writes structured observations to shared memory. It does not make decisions about what to do with the information -- it just collects and stores.

Agent 2: the creator

  • Role: Produces content, drafts responses, generates artifacts
  • Personality: Clear communicator, concise, adapts tone to audience
  • Tools: Text generation, formatting, template application
  • Shared memory reads: Research findings, style guidelines, previous outputs
  • Shared memory writes: Drafts, final outputs, revision notes

The creator reads what the researcher found and turns it into something useful. Blog posts, email drafts, social media content, documentation updates, reports -- whatever your use case needs. It reads from shared memory to stay informed and writes its outputs back so the squad has a record.

Agent 3: the reviewer

  • Role: Quality control, fact-checking, consistency verification
  • Personality: Critical but constructive, detail-oriented, flags issues clearly
  • Tools: Comparison, cross-referencing, checklist evaluation
  • Shared memory reads: Research data, creator outputs, historical quality records
  • Shared memory writes: Review results, approved/rejected flags, improvement suggestions

The reviewer reads the creator's outputs and cross-references them against the researcher's data. It checks for accuracy, consistency, and quality. When something passes review, it flags it as approved. When something needs work, it writes specific feedback that the creator can act on.

How the squad collaborates

The workflow is sequential but loosely coupled:

  1. The researcher finds something noteworthy and writes it to shared memory
  2. The creator sees new research data and produces a draft
  3. The reviewer evaluates the draft against the research and writes a verdict
  4. If the draft passes, it is ready for use. If not, the creator revises based on the reviewer's feedback

All of this happens through Convex's shared memory tables. No custom message passing. No webhooks. No external queue. The agents read and write to the same database, and the orchestration emerges from the data flow.
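As a sketch of how orchestration can emerge from shared tables alone (the `drafts` table, field names, and status values are illustrative, not LaunchThatBot's schema):

```typescript
// Orchestration through shared state: no queues, no webhooks.
type Draft = { id: number; text: string; status: "pending" | "approved" | "rejected" };

const drafts: Draft[] = []; // stands in for a shared Convex table
let nextId = 1;

// Creator: turns a research note into a pending draft.
function createDraft(note: string): void {
  drafts.push({ id: nextId++, text: `Draft based on: ${note}`, status: "pending" });
}

// Reviewer: approves pending drafts that trace back to the source note.
function review(note: string): void {
  for (const d of drafts) {
    if (d.status === "pending") {
      d.status = d.text.includes(note) ? "approved" : "rejected";
    }
  }
}

createDraft("MiniMax is flat-rate");
review("MiniMax is flat-rate");
console.log(drafts[0].status); // → approved
```

Each agent only reads and writes rows; the sequential researcher → creator → reviewer flow falls out of which rows exist and what status they carry.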

What $25/month does not get you

Being honest about the limits:

Instant responses. With a shared rate limit of 300 questions per 5 hours, your agents work at a measured pace. This is not a real-time chatbot setup. It is an asynchronous work squad.

Unlimited scale. Three agents on a CX22 with a $20 MiniMax subscription is right-sized for side projects, content operations, and small-scale automation. If you need 20 agents processing thousands of requests per hour, you need a bigger server and a bigger LLM budget.

The most capable models. MiniMax is good. It is not GPT-4 or Claude Opus. For the tasks described here -- research, content creation, review -- it is more than capable. For tasks that require frontier-model reasoning, you would need a different provider and a different budget.

GPU inference. Your agents call an external API. They do not run models locally. If you need local inference, you need GPU hardware, which is a different cost structure entirely.

What $25/month does get you

Three agents running 24/7. Not three terminals you have to keep open. Not three processes you have to manually restart. Three agents on a managed server with automatic health checks and restart logic.

Shared knowledge. Every agent benefits from what the other agents learn. The researcher's findings flow to the creator. The reviewer's feedback improves the next draft. This compounding effect is what makes a squad more valuable than three isolated agents.

A real dashboard. See what your agents are doing, check their health, review their event logs, manage their configurations. All from the LaunchThatBot dashboard. No SSH, no terminal, no guessing.

Your data, your control. The Convex backend is yours. The server is yours. If you decide LaunchThatBot is not for you, disconnect and everything keeps running. Nothing is locked in.

Room to grow. When you are ready for more agents, a bigger server, or a different LLM provider, the architecture scales with you. Add a second server. Switch to a pay-per-token model for specific agents that need it. Mix and match. The squad pattern does not change.

The comparison nobody makes

Here is what a roughly equivalent setup costs on other approaches:

Three separate OpenAI API agents: At typical usage, $30-80/month in API costs alone, plus $15-20 for a DigitalOcean droplet, plus your own infrastructure code. Total: $50-100+/month with significant development overhead.

Three agents on a managed AI platform: Most managed platforms charge per agent or per seat. Three agents with shared state typically runs $50-150/month before infrastructure costs.

The LaunchThatBot + Hetzner + MiniMax stack: $25/month. Shared memory included. Dashboard included. Security hardening included.

The economics work because each piece of the stack is optimized for cost at low scale. Hetzner is the cheapest reputable VPS. MiniMax is flat-rate instead of per-token. Convex and LaunchThatBot have free tiers that cover this use case. None of these are compromises -- they are the right tools for a solo builder running a small squad.

Getting started

Here is the order of operations:

  1. Sign up for LaunchThatBot -- create your account, you are on the free tier
  2. Create a Convex project -- free tier at convex.dev
  3. Deploy a Hetzner CX22 -- from the LaunchThatBot dashboard, takes about 5 minutes
  4. Enable Convex Mode -- connect your Convex project during deployment setup
  5. Sign up for MiniMax -- $20/month subscription, add the API key to your OpenClaw config
  6. Define your three agents -- set roles, personalities, and tools through the dashboard
  7. Start the squad -- your agents begin working, sharing memory, and collaborating

Total setup time: under 30 minutes if you already have the accounts. Under an hour if you are starting from scratch.

Total monthly cost: $25.

Total agents running: 3, with shared memory, managed infrastructure, and a dashboard to keep track of everything.

That is a production agent squad for less than most people spend on coffee in a week.

Here is how to start your first deployment.

