The AI Brief #7

The true economics of AI agents: it's data retrieval, not the model

Rodrigue Le Gall | 3 min read

A technical finding circulating in the industry is shifting how companies think about the real cost of AI agents: most are hemorrhaging money on tokens by indiscriminately loading 50,000 tokens of context into every request. The standard workflow? Retrieve 200 documents by cosine similarity, dump everything into the LLM, and hope the right answer is somewhere in there.

The problem: it’s terribly inefficient. An agent processing 100 requests per day at this pace burns through millions of unnecessary tokens each month. This isn’t a flaw in the AI model itself—it’s failed architecture upstream.

The solution? Optimize the retrieval layer: use smarter ranking techniques, pre-filter relevant documents before passing them to the model, and strip the context of all noise. Empirical data shows a 90% reduction in token consumption is possible without sacrificing quality.
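A minimal sketch of the idea (illustrative only: the lexical-overlap scorer is a cheap stand-in for a real reranker, and the `top_k` and `min_score` cutoffs are hypothetical tuning knobs, not figures from the article). Instead of dumping all 200 retrieved documents into the prompt, rerank them and keep only the few that clear a relevance bar:

```python
# Illustrative sketch: pre-filter retrieved documents before prompting the LLM.
# A production pipeline would swap score() for a proper reranking model;
# the thresholds here are assumptions for the sake of the example.

def score(query: str, doc: str) -> float:
    """Cheap relevance proxy: fraction of query terms present in the doc."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

def build_context(query: str, retrieved_docs: list[str],
                  top_k: int = 5, min_score: float = 0.3) -> str:
    """Keep only the top_k docs above min_score instead of all 200."""
    ranked = sorted(retrieved_docs, key=lambda d: score(query, d), reverse=True)
    kept = [d for d in ranked if score(query, d) >= min_score][:top_k]
    return "\n\n".join(kept)

docs = ["invoice processing with AI agents", "holiday party photos",
        "token cost optimization for LLM agents", "office lunch menu"]
context = build_context("AI agent token cost", docs)
print(context)  # only the relevant doc survives the filter
```

The point is architectural, not algorithmic: whatever reranker you use, filtering happens before the model sees anything, so every token that reaches the LLM has earned its place.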

This has a direct implication: you’re currently paying 9 to 10 times more than technically necessary to run your agents. This isn’t an inherent AI limitation—it’s an implementation flaw that an SMB can fix itself or demand from its integrator.
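The 9-to-10x figure follows directly from the article's own numbers. A quick back-of-the-envelope check (volumes taken from the article; no price assumptions needed, since the multiple is independent of the per-token rate):

```python
# Back-of-the-envelope check using the article's figures.
requests_per_day = 100
tokens_per_request = 50_000      # indiscriminate 200-document context
days_per_month = 30

baseline_tokens = requests_per_day * tokens_per_request * days_per_month
optimized_tokens = baseline_tokens // 10   # 90% reduction via retrieval fixes

print(f"baseline:  {baseline_tokens:,} tokens/month")    # 150,000,000
print(f"optimized: {optimized_tokens:,} tokens/month")   # 15,000,000
print(f"cost multiple vs. necessary: {baseline_tokens / optimized_tokens:.0f}x")  # 10x
```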

What this means for your business

Where to act immediately: If you’ve deployed AI agents (customer chatbots, sales assistance, search automation), check how you’re retrieving and filtering data before sending it to the model. An SMB processing 500 requests per day can save €10,000 to €20,000 per month just by optimizing this step. Before switching to a “cheaper” model, audit your retrieval pipeline. With a competent integrator, this optimization takes 2-3 weeks. That’s immediate ROI, no technology change required.


In brief

Granola raises $125M, hits $1.5B valuation

The meeting-notes tool is becoming an enterprise AI agent platform. What this changes for SMBs: vertical AI tools are gaining funding credibility. If meetings or project management are central to your business, waiting for Granola's upcoming agents becomes a viable option rather than building everything yourself on ChatGPT.

Read source

Apple opens Siri to third parties (Claude, Gemini, etc.) on iOS 27

Users will choose which chatbot powers Siri. For an SMB with a mobile app, this means you can now integrate your AI agent directly into Apple’s system interface without going through ChatGPT’s silo. Free distribution guaranteed if your solution is solid.

Read source

Meta consolidates AI with 4 acquisitions in 4 months

Manus (autonomous web agents), Moltbook, and Alexandr Wang (Scale AI) join Meta. The signal: big tech sees autonomous AI agents as the next frontier. For SMBs, this confirms that investing in AI agents isn’t a trend—it’s an industrial direction.

Read source

Google Gemini offers memory import from other chatbots

You can migrate your ChatGPT history and customized data to Gemini effortlessly. Implication: “data lock-in” is decreasing. If you feared OpenAI dependency, major alternatives now offer viable migration bridges.

Read source

AI Commerce: executing itineraries, not just recommendations

The AI agent shifts from "advisor" (you still have to click) to "executor" (it's done for you). You tell an agent "book me a trip to Italy within my budget," and it buys the ticket. For e-commerce and services, that's the real change: your customers will expect this within 12 months.

Read source

Get The AI Brief in your inbox

3x per week, the essentials of AI decoded for business leaders.

Subscribe

Take action

Ready to automate your repetitive tasks?

Discover what AI can concretely change in your business. In 2 hours, we identify your automation opportunities.

Free AI Checklist

10 processes to automate in your business

Download PDF