Tokenmaxxing: Is AI Token Consumption a Productivity Metric or Vanity Trap?
In Silicon Valley, a new term has become a subject of controversy: “tokenmaxxing.” It describes the practice of driving a company’s consumption of AI tokens, the quantifiable units of information that AI models process, as high as possible. OpenAI estimates that one token corresponds to roughly four characters. What began as a technical metric has become the creed of a new work culture, and a point of friction in an industry wrestling with how AI productivity can actually be measured.
Tokens: The Basic Unit of the AI Industry
Tokens are the smallest units of information into which AI language models break down text in order to process it. A token can be a short word, a syllable, part of a word, or even a single punctuation mark. The German word “Haus” (house) is typically processed as a single token, whereas longer compounds such as “Donaukraftwerk” (Danube power plant) break down into several sub-tokens. OpenAI’s rule of thumb is that one token corresponds to approximately four characters in English; German texts tend to need more tokens because of longer words and umlauts, so a token there often covers only one syllable. An average A4 page of running text can thus quickly amount to 500 to 800 tokens.
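How such splits look in practice can be reproduced with OpenAI’s open-source tokenizer library tiktoken (the article itself names no tool; the library and encoding are chosen here purely for illustration, and the exact splits vary by model):

```python
# Tokenize sample words with tiktoken, OpenAI's open-source tokenizer.
# "cl100k_base" is one of tiktoken's published encodings; counts and
# splits differ between encodings and models.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["Haus", "Donaukraftwerk", "The house stands by the river."]:
    token_ids = enc.encode(text)
    # Decode each token id individually to see the text fragments:
    pieces = [enc.decode([t]) for t in token_ids]
    print(f"{text!r}: {len(token_ids)} token(s) -> {pieces}")
```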
Commercial AI providers such as OpenAI, Anthropic, and Google bill by the token, and they do so on both sides of a request: input tokens encompass everything the user or an agent sends to the model, including prompts, system instructions, uploaded documents, and the conversation history so far; output tokens are what the model generates in response, whether answers, code, or text. Both are billed separately, with output tokens typically significantly more expensive than input tokens. Prices range, depending on the model class, from a few cents to several dollars per million tokens; more capable models such as Claude Opus or GPT-5 cost many times more than simpler variants.
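In code, the billing model is a simple two-term sum. The prices in the following sketch are placeholders, not any provider’s actual rates:

```python
# Back-of-the-envelope per-request cost under token-based billing.
# Prices are assumed placeholders; real rates vary by provider and model.
PRICE_PER_M_INPUT = 3.00    # USD per million input tokens (assumption)
PRICE_PER_M_OUTPUT = 15.00  # USD per million output tokens (assumption)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call: input and output are billed separately."""
    return (input_tokens / 1_000_000) * PRICE_PER_M_INPUT + \
           (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT

# A prompt containing a large pasted document plus a medium-length answer:
print(f"${request_cost(input_tokens=40_000, output_tokens=2_000):.2f}")  # $0.15
```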
With agentic systems that work autonomously in the background, reading documents and calling tools, consumption adds up quickly: if a coding agent runs unattended for hours, repeatedly re-processing the same code context, monthly bills can reach five- or six-figure sums, as the examples of Cleo and Anthropic below illustrate.
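Why the bills climb so fast becomes visible in a rough model of an agent loop: because each turn typically re-sends the entire context as input, input tokens dominate the total. Every figure below is an assumption chosen for illustration; real workloads, prices, and caching discounts vary widely:

```python
# Rough monthly cost of parallel coding agents that re-send their full
# code context on every turn. All figures are illustrative assumptions.
PRICE_IN, PRICE_OUT = 3.00, 15.00   # USD per million tokens (assumed)
AGENTS = 5                          # agents running in parallel
CONTEXT = 200_000                   # context tokens re-sent each turn
TURNS_PER_HOUR, HOURS, DAYS = 30, 8, 22

turns = AGENTS * TURNS_PER_HOUR * HOURS * DAYS   # 26,400 turns per month
input_tokens = turns * CONTEXT                   # ~5.3 billion tokens
output_tokens = turns * 1_500                    # modest response per turn

cost = input_tokens / 1e6 * PRICE_IN + output_tokens / 1e6 * PRICE_OUT
print(f"{input_tokens / 1e9:.1f}B input tokens -> about ${cost:,.0f}/month")
```

Even these conservative assumptions land in the five figures; larger contexts, pricier models, or more agents push the total toward six, while prompt caching and trimmed contexts pull it back down.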
The Trigger: Meta’s “Claudenomics” Leaderboard
The debate gained momentum when the industry publication The Information reported in early April on an internal dashboard at Meta Platforms. An employee had set it up on his own initiative: a leaderboard, internally called “Claudenomics,” that ranked colleagues by their individual token consumption and awarded titles such as “Token Legend.” According to The Information, Meta employees consumed a total of 60 trillion tokens in 30 days, a volume that at list prices can translate into costs running into the millions. The top user among Meta’s 250 power users reportedly averaged 281 million tokens per month.
The employee responsible has since taken the dashboard offline. Meta points to a separate AI Insights dashboard that captures AI usage more holistically. By then, however, the public attention had already set off a fundamental debate.
The Proponents: “Existential” for Competitiveness
For a number of AI-first companies, tokenmaxxing is not a gimmick but a survival strategy. May Habib, co-founder and CEO of enterprise AI startup Writer, describes internal AI usage as existential for her company. Writer itself operates a token leaderboard; its top two performers in March consumed just under 11 billion and just over 6 billion tokens, respectively. On Writer’s internal platform, 10 billion tokens cost slightly more than $50,000, or roughly $5 per million. Habib openly acknowledges that the metric is susceptible to manipulation and that not every token generates business value, a trade-off she consciously accepts.
Even more outspoken is Barney Hussey-Yeo, founder and CEO of fintech app Cleo, which is currently valued at one billion US dollars. At Cleo, non-engineers are permitted to spend up to $1,000 per month on tokens, engineers up to $2,000. Hussey-Yeo himself reportedly spent the equivalent of over $36,000 on tokens in a single month by running several agents in parallel. His credo: anyone who does not use Claude Code to improve productivity and ways of working will not make it. Within his 178-strong engineering department, he observes a growing divide between “AI-native” employees and “laggards.”
Nvidia founder Jensen Huang also made a pointed remark on the All-In Podcast: he said he would be concerned if an engineer earning $500,000 in salary did not consume at least $250,000 worth of tokens. Reports by the New York Times about individual power users — an OpenAI engineer processed 210 billion tokens in a week, an Anthropic employee ran up a Claude Code bill of $150,000 in a month — lend credence to the thesis that high token consumption has come to be seen as a badge of engagement.
The Critics: “Outcome Maxxing” Instead of Token Maxxing
On the other side, broad opposition is forming. Yamini Rangan, CEO of HubSpot, distilled the counter-position into a formula on LinkedIn: “Outcome maxxing >> token maxxing.” Andrew Lau, co-founder and CEO of Jellyfish, warns that one can tokenmax all day and still produce undesirable results. Brian Elliott, CEO of enterprise AI firm Blitzy, likens the metric to measuring a company’s revenue by the number of cold calls its salespeople make.
Matt Calkins, CEO of Appian, drew the most drastic comparison, equating tokenmaxxing with the Soviet practice of judging the quality of chandeliers by their weight. Jim Rowan, Principal and U.S. Head of AI at Deloitte Consulting, takes a more nuanced view: the approach reflects a legitimate desire to incentivize AI usage, but it risks turning tokens into a vanity metric because it does not distinguish between mere usage and actual value contribution.
Stefan Camilleri, VP Engineering at Typeform, emphasizes that what matters is not token volume but the value generated per token. Jitender Aswani, VP Engineering at data platform provider Starburst, valued at $3.35 billion, pursues a middle path: no limits, but no forced maximization either. Internally, the principle is referred to as “let a thousand flowers bloom.” The hard metrics used instead are the DORA metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service), developer velocity, code quality, and incident resolution times. Since December, time-to-production at Starburst has reportedly fallen by 60 percent, and a third of the company’s code is now generated by Claude.
Salesforce Pushes Ahead with Its Own Alternative
Salesforce is entering this contested landscape with its own proposal: Agentic Work Units (AWUs). The metric is intended to translate AI inputs such as tokens and compute into concrete outputs, that is, work actually completed. According to Salesforce, Singapore Airlines uses AWUs to measure the processing time of customer service requests, while Williams-Sonoma uses them to track how AI agents derive product recommendations. According to company figures, 2.4 billion AWUs had been generated on the platform by the fourth quarter, with triple-digit year-on-year growth.
Madhav Thattai, Executive Vice President and GM of Salesforce AI, sums up the logic: endless Claude Code loops without customer benefit are worthless. The goal, in his view, should not be “agentic transformation” for its own sake but customer satisfaction.
AWUs have drawn criticism of their own: Salesforce defines the formula, determines what counts as a unit, and controls the benchmarks itself. Without external verification, this metric, too, could end up as a vanity metric.
Assessment: An Industry in Search of Its KPIs
The tokenmaxxing debate is, at its core, a debate about measurability. In an economic environment marked by greater board scrutiny, pure token consumption without demonstrable business benefit quickly resembles conspicuous consumption. At the same time, proponents argue, not implausibly, that barriers to AI usage must be kept deliberately low in the early adoption phase in order to foster a spirit of experimentation.
The personal anecdotes entering the scene’s folklore show how much of a cultural dimension the topic has taken on within the tech industry: Imbue co-founder Josh Albrecht declared at an Axios event that he had not shaved because he was shipping so much code with Claude, and Brian Alvey, CTO of WordPress VIP, admitted that thinking about token consumption left him “breathless.”
What remains is the open question of which metric will prevail: raw token consumption, output-oriented metrics such as AWUs, or classic engineering KPIs. One thing is clear: one year after the breakthrough of agentic AI systems, the industry has yet to reach a consensus on how to cleanly quantify the economic value of these tools.