The mirage of ‘tokenmaxxing’
As Australian organisations set their AI budgets for the new financial year, they are recognising this challenge and shifting to measuring success based on AI output and impact, rather than simply consumption, says Stu Scotis, Deloitte Australia chief technology officer and Deloitte Global Agentic AI leader.
Sometimes tokenmaxxing is quite intentional, Scotis explains, especially when organisations use leaderboards or gamified usage. Other times it can be more subtle, as users start to use background agents to perform unnecessary tasks or provide complex prompting and context that result in high token use.
“The business impact goes well beyond the cost of tokens, as poorly managed AI spend can erode value and distract teams on false measures of AI adoption,” he says.
“Well-managed usage gives organisations visibility of the AI value chain which is where, what and how much AI is executing across the organisation and the costs. This transparency enables forecasting and stronger decisions about where AI is delivering growth, efficiency, and measurable returns, or intervene if needed.”
The impact of optimising AI usage for appearance over actual business value is amplified by the nature of AI billing models. The more data processed by an AI query, the more AI tokens consumed and the greater the cost.
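To make the billing point concrete, here is a minimal sketch of per-query cost under a token-based pricing model. The prices, token counts and the names `Pricing` and `query_cost` are assumptions chosen for illustration; they do not come from any particular vendor's rate card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # Hypothetical prices in dollars per million tokens (illustrative only).
    input_per_million: float
    output_per_million: float


def query_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Return the dollar cost of one query under simple per-token billing."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Assumed prices: $3 per million input tokens, $15 per million output tokens.
pricing = Pricing(input_per_million=3.0, output_per_million=15.0)

# Same question and same answer length; only the amount of context differs.
scoped = query_cost(input_tokens=2_000, output_tokens=500, pricing=pricing)
bloated = query_cost(input_tokens=60_000, output_tokens=500, pricing=pricing)
```

With these made-up figures the scoped query costs about $0.0135 and the bloated one about $0.1875, roughly 14 times more for the same answer length. The extra spend comes entirely from the additional context sent to the model.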
This cost pressure is also reshaping the broader IT budget conversation. AI spend is variable and unbudgeted in ways that legacy SaaS licensing never was, leading CFOs to scrutinise the entire technology stack and accelerate rationalisation of overlapping point solutions.
Moving beyond the compute problem
When business AI costs rise, the instinct is to treat them purely as an unavoidable compute problem – blaming GPU time, model licensing and inference costs. While these compute pressures are real for CFOs, treating excessive AI token consumption as only a compute problem misses the point, says Elastic ANZ country manager Jeremy Pell.
Rather than relying on complex AI queries that draw on vast amounts of business data, the key to curbing token consumption is to present large language models (LLMs) with refined “hyper-contextualised data”, Pell says.
“The compute explosion is happening further upstream at the retrieval layer,” he explains. “If an enterprise AI system relies on low-quality, redundant or poorly scoped data retrieval, it forces the LLM to consume exponentially more compute power and tokens to reach a usable answer.
“At this point you aren’t just paying to compute, you are paying to compute junk data.”
Building a superior data foundation
Addressing spiralling AI token costs requires harmonised and integrated enterprise data, as well as a robust semantic layer that retrieves data for LLMs efficiently and effectively. This ensures AI consumes only the tokens needed to process the right contextual data.
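One way to picture what a tighter retrieval layer does is the sketch below. It keeps only relevant, non-duplicate chunks and stops adding context once a token budget is reached. It is an illustration under stated assumptions, not Elastic's method: the `Chunk` type, the relevance scores, the `min_score` threshold and the word-count token estimate are all hypothetical stand-ins for a real retriever and tokenizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    score: float  # hypothetical relevance score from a retriever; higher is better


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    # A real system would use the model's own tokenizer.
    return len(text.split())


def build_context(chunks: list[Chunk], budget: int, min_score: float = 0.5):
    """Pick the most relevant, non-duplicate chunks that fit the token budget."""
    selected: list[Chunk] = []
    seen: set[str] = set()
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if chunk.score < min_score:
            break  # everything after this point is less relevant still
        if chunk.text in seen:
            continue  # drop redundant copies of the same passage
        cost = estimate_tokens(chunk.text)
        if used + cost > budget:
            continue  # too large for the remaining budget; try a smaller chunk
        selected.append(chunk)
        seen.add(chunk.text)
        used += cost
    return selected, used


candidates = [
    Chunk("refund policy for enterprise plans", 0.9),
    Chunk("refund policy for enterprise plans", 0.8),  # duplicate passage
    Chunk("enterprise plans renew every year", 0.7),
    Chunk("office party schedule", 0.2),               # irrelevant
]
context, tokens_used = build_context(candidates, budget=12)
```

Here only two of the four candidate chunks reach the model. The duplicate and the low-relevance passage are filtered out before any tokens are billed for them.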
“Australian organisations don’t need a massive flood of model tokens for every query; they just need the exact, right drop of hyper-contextualised data,” Pell says.
“That’s why true AI visibility is an observability problem before it’s a budget problem.”
Distilling this hyper-contextualised data requires a single source of truth, which is especially challenging when unstructured data comprises about 90 per cent of enterprise data and grows three times faster than structured data.
Monitoring all data sources, including cloud, on-premises and endpoints, is critical for performance management and cost control. By consolidating onto a single platform where search, retrieval, observability and security run on a single data store, enterprises gain the transparency required to ensure that AI tokens are not wasted.
“Every organisation obviously needs to manage costs, but you can’t let this hold you back from being innovative,” Pell says. “You’ve still got to be out there trialling AI and using it across your business to unlock value.
“It’s not simply about addressing tokenmaxxing to regulate AI use, it’s about building a vastly superior data and retrieval foundation underneath AI to make it highly efficient and ensure your AI spend leads to business outcomes.”
For more information, visit www.elastic.co