The true economics of AI agents: it's data retrieval, not the model
A technical finding circulating in the industry reframes the real cost of AI agents: most companies are hemorrhaging money on tokens by indiscriminately loading 50,000 tokens of context per request. The standard workflow? Retrieve 200 documents by cosine similarity, dump them all into the LLM, and hope the right answer is somewhere in there.
The problem: it’s terribly inefficient. An agent processing 100 requests per day at that rate burns through roughly 150 million tokens a month, most of them unnecessary. This isn’t a flaw in the AI model itself; it’s a failure of the architecture upstream of it.
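The monthly figure follows from simple arithmetic. A minimal sketch (the volumes are the article's illustrative numbers, not measurements):

```python
# Illustrative token burn for the scenario described above.
REQUESTS_PER_DAY = 100
TOKENS_PER_REQUEST = 50_000   # context indiscriminately loaded per request
DAYS_PER_MONTH = 30

tokens_per_month = REQUESTS_PER_DAY * TOKENS_PER_REQUEST * DAYS_PER_MONTH
print(f"{tokens_per_month:,} tokens/month")  # 150,000,000 tokens/month

# With the 90% reduction claimed below, the same workload would need:
optimized = tokens_per_month // 10
print(f"{optimized:,} tokens/month after optimization")  # 15,000,000
```

Multiply by your provider's per-million-token input price to translate this into a monthly bill.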
The solution? Optimize the retrieval layer: rerank candidates with smarter techniques, pre-filter to the genuinely relevant documents before passing them to the model, and strip the remaining context of noise. Empirical reports suggest a 90% reduction in token consumption is achievable without sacrificing answer quality.
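The pre-filtering step can be sketched in a few lines. This is a minimal illustration, not a production pipeline: it assumes an upstream retriever has already returned scored candidates, and `rerank_score` stands in for whatever reranker (cross-encoder, heuristic, metadata filter) you plug in. The documents, scores, and token heuristic are all made up for the example.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Swap in a real tokenizer for accurate budgeting.
    return max(1, len(text) // 4)

def build_context(candidates, rerank_score, top_k=5, token_budget=5_000):
    """Keep only the best-ranked documents that fit the token budget."""
    ranked = sorted(candidates, key=rerank_score, reverse=True)[:top_k]
    context, used = [], 0
    for doc in ranked:
        cost = estimate_tokens(doc["text"])
        if used + cost > token_budget:
            break  # stop before blowing the budget
        context.append(doc["text"])
        used += cost
    return "\n\n".join(context), used

# Hypothetical retriever output: scored candidate documents.
docs = [
    {"text": "Refund policy: returns accepted within 30 days.", "score": 0.91},
    {"text": "Company picnic scheduled for June.", "score": 0.40},
    {"text": "Shipping times: 2-5 business days in the EU.", "score": 0.85},
]

# Keep only the 2 best candidates instead of dumping everything.
context, used = build_context(docs, rerank_score=lambda d: d["score"], top_k=2)
```

The off-topic "picnic" document never reaches the model, which is exactly where the token savings come from: the prompt carries only what the reranker judged relevant, capped by an explicit budget.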
The direct implication: you’re currently paying 9 to 10 times more than technically necessary to run your agents. That isn’t an inherent limitation of AI; it’s an implementation flaw that an SMB can fix itself or demand that its integrator fix.
What this means for your business
Where to act immediately: if you’ve deployed AI agents (customer chatbots, sales assistants, search automation), check how you retrieve and filter data before sending it to the model. An SMB processing 500 requests per day can save €10,000 to €20,000 per month by optimizing this step alone. Before switching to a “cheaper” model, audit your retrieval pipeline. With a competent integrator, this optimization takes two to three weeks: immediate ROI, no technology change required.
In brief
Granola raises $125M, hits $1.5B valuation
The meeting-notes tool is becoming an enterprise AI agent platform. What changes for SMBs: vertical AI tools are gaining funding credibility. If your business runs on meetings or project management, waiting for Granola’s new agents becomes a viable option rather than building everything yourself on top of ChatGPT.
Apple opens Siri to third parties (Claude, Gemini, etc.) on iOS 27
Users will choose which chatbot powers Siri. For an SMB with a mobile app, this means you can integrate your AI agent directly into Apple’s system interface without going through ChatGPT’s silo: built-in distribution, provided your solution is solid.
Meta consolidates AI with 4 acquisitions in 4 months
Manus (autonomous web agents) and Moltbook have been acquired, and Alexandr Wang (Scale AI) has joined Meta. The signal: big tech sees autonomous AI agents as the next frontier. For SMBs, this confirms that investing in AI agents isn’t a passing trend; it’s an industrial direction.
Google Gemini offers memory import from other chatbots
You can migrate your ChatGPT history and customized data to Gemini effortlessly. Implication: “data lock-in” is decreasing. If you feared OpenAI dependency, major alternatives now offer viable migration bridges.
AI Commerce: executing itineraries, not just recommendations
The AI agent shifts from “advisor” (you still have to click) to “executor” (it gets done). Tell an agent “book me a trip to Italy within my budget” and it buys the ticket. For e-commerce and services, that’s the real change: your customers will expect this within 12 months.
Get The AI Brief in your inbox
3x per week, the essentials of AI decoded for business leaders.