Yee Ha! Building Your Stable of AI Agents

Today I found and used an AI-agent-based service that vastly exceeded my expectations! But my key takeaway wasn’t about the value of this particular service; it was that its monthly subscription price is based on the number of virtual tokens it consumes (the app’s own reporting of usage, not necessarily actual LLM tokens).
If this service checks out over the next week or two, I’ll still probably buy an annual subscription of monthly tokens (whatever they are). It’s not a huge commitment in relative terms, but as we’ve seen with the compounding organizational costs of profligate SaaS adoption, the onrush of genuinely useful AI agent services we will soon be adopting will start to add up. And as we’ve always said about proper capacity planning – someone will always care when the bill comes due!
So my weekend $0.02 predictions (unresearched/raw) for the emerging AI Agent market:
Cost-efficient AI users will come to adopt an “AI stable” of multiple AI model services and AI agent services that they manually stitch and segue together using relatively slow, human-speed cut-and-paste tools like Google Docs. This keeps us humans relevant as “AI conductors” for some time to come.
The sci-fi idea of networks of agents interoperating without human intervention is pretty cool, but imagine a whole host of pricey agents chatting away in the background at machine speed, running up your credit card just being super-polite to each other like little automated chipmunks – ‘No, after you. No, I insist you go first. Not at all, you should go first…’
It will also be best practice (for years to come) to employ a decently large stable of AI models and agents at varying price points and capability levels, to avoid sending all your work through a single über-capable AI agent (agency?) whose high per-use expense will drive up your costs.
Instead of wrapping all our work into one vibe-coded mega-prompt, we early AI users will need to learn to leverage free, small (perhaps local or in-house) LLMs to handle the wide, bottom-of-the-pyramid sub-tasks, then steer appropriate requests to corporately scoped agents that incorporate RAG/IP data, and only call on that superhero, solve-anything AI agent/agency for the most complex, challenging parts of the task du jour.
Why pay super-duper AI service token costs to suggest a better prompt, crawl a web page, or make an MCP call to the API of a SaaS service you already pay for? Free models today can correct your grammar and format your output “in your style”.
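The tiered approach above can be sketched in a few lines of code. This is a minimal, hypothetical illustration only: the model names, per-token prices, and the idea of a simple 0–2 difficulty score are all my own placeholders, not any real service’s API or pricing.

```python
# A minimal sketch of cost-tiered routing: send each task to the cheapest
# model tier that can plausibly handle it. All names, prices, and the
# difficulty heuristic are hypothetical placeholders.

from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    cost_per_1k_tokens: float  # dollars; illustrative numbers only

# Cheapest first: a free local model, a corporate RAG-scoped agent,
# and the expensive "superhero" agent as a last resort.
TIERS = [
    Tier("local-small-llm", 0.0),
    Tier("corporate-rag-agent", 0.01),
    Tier("superhero-agent", 0.15),
]

def route(task_difficulty: int) -> Tier:
    """Map a rough difficulty score (0 = trivial, 2+ = hard)
    to the cheapest adequate tier."""
    return TIERS[min(task_difficulty, len(TIERS) - 1)]
```

So grammar fixes and web crawls go to `route(0)`, IP-sensitive lookups to `route(1)`, and only the genuinely hard problems reach the expensive agent via `route(2)`.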
Because we care about costs eventually, we will see organizations seeking the most cost-efficient approaches. And that will mean humans staying in the AI agency loop for some time. Someone, I’m sure, is already working on a cost-optimizing AI conductor agent, but those will still occasionally hallucinate or go off the rails, and while we might be OK with iffy GPT output in the minor details of our larger work, it only takes one or two big-dollar soaks (or security breaches) before management careers are impacted.
So, start building a stable of useful and productive AI model services and agents. Keep track of their relative costs, and learn how to orchestrate your work across those AIs to produce the biggest bang for the smallest spend!
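Even the “keep track of relative costs” habit can start as something this simple. A toy sketch, with made-up service names and dollar amounts, of tallying spend per service so the cheapest adequate tool in your stable is easy to spot:

```python
# A toy ledger for tracking relative spend across a "stable" of AI
# services. Service names and costs below are made-up placeholders.

from collections import defaultdict

spend = defaultdict(float)   # dollars spent per service
calls = defaultdict(int)     # number of tasks sent to each service

def record(service: str, cost: float) -> None:
    """Log one task's cost against a service."""
    spend[service] += cost
    calls[service] += 1

def cost_per_task(service: str) -> float:
    """Average dollars per task for a service (0.0 if unused)."""
    return spend[service] / calls[service] if calls[service] else 0.0

# Example: two pricey superhero-agent calls vs. a free local model.
record("local-small-llm", 0.0)
record("superhero-agent", 1.25)
record("superhero-agent", 0.75)
```

Reviewing `cost_per_task` across your stable once a month is a crude but honest way to see which conductor decisions are paying off.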
(No AIs were harmed in the writing of this article, although the post image is from craiyon.ai.)