AI subscriptions are becoming less attractive

Maximilian Schwarzmüller

42,791 views 2 days ago

Video Summary

AI companies like Anthropic and GitHub are tightening usage limits and adjusting subscription plans, moving away from unlimited access and heavy subsidies. This shift is driven by the rising costs of AI inference, which have outpaced the revenue generated by current subscription models. As users engage in more complex and prolonged "agentic workflows," token consumption has drastically increased, making it unsustainable for companies to offer generous usage for fixed prices. The current API pricing reveals a significant gap between the cost of running AI models and the subscription fees, with many users consuming far more value than they pay for. This economic pressure, coupled with increasing compute scarcity and rising hardware costs, indicates a future of stricter limits and higher prices for AI subscriptions, particularly for professional and enterprise use.

An interesting fact is that OpenAI, despite its recent $122 billion funding, reportedly has only an 18-month runway, highlighting the urgent need for profitability and the unsustainability of current subsidy models.

Short Highlights

  • Anthropic is testing limiting Claude Code access to more expensive subscription tiers, affecting 2% of new consumer signups.
  • GitHub is pausing new signups for Copilot Pro, Pro Plus, and student plans and tightening limits for individual plans, with Opus models no longer available in Pro plans.
  • The API pricing for Anthropic's Claude Opus 4.5 is $5 per million input tokens and $25 per million output tokens, while OpenAI's GPT-4.5 costs $2.50 per million input tokens and $22.50 per million output tokens.
  • The cost of AI models is driven by training (a one-time cost) and inference (an ongoing, per-request cost), with inference costs being the current economic challenge for subscription models.
  • Agentic workflows and long-running coding sessions consume significantly more tokens than occasional chat sessions, drastically increasing inference demands and making current subscription prices unsustainable.

Key Details

Anthropic's Claude Code Test and Usage Restrictions [0:00]

  • Anthropic is reportedly testing a restriction on Claude Code, requiring users to be on more expensive subscription plans to access it.
  • This test affects only 2% of new consumer signups.
  • The company's response indicates it's a small test, but the move aligns with a broader trend of decreasing usage allowances and potentially degrading model performance across AI subscriptions.
  • Historically, Anthropic has also restricted use of its subscriptions outside of Claude Code, for example in third-party coding tools.

This morning I woke up seeing this post here on X, which mentions that Anthropic seemingly pulled the Claude Code plug from the Pro plan.

GitHub's Shifting Copilot Strategy [0:49]

  • GitHub has published news indicating a shift in its Copilot strategy, including pausing new signups for Pro, Pro Plus, and student plans.
  • Usage limits for individual plans are being tightened.
  • Crucially, the Opus models are no longer available in Pro plans.
  • This change fits a larger narrative of AI companies re-evaluating their subscription economics.

And most importantly, that the Opus models are no longer available in the Pro plans.

The Economics of AI Subscriptions and Token Usage [01:59]

  • The current economic model for AI subscriptions relies on the majority of users not consuming their full allotted usage, similar to other subscription services like Netflix.
  • API pricing serves as a closer indicator of the true cost of AI requests. For instance, Anthropic's Claude Opus 4.5 has input token costs of $5 per million and output token costs of $25 per million.
  • OpenAI's GPT-4.5 has lower API prices: $2.50 per million input tokens and $22.50 per million output tokens.
  • It's assumed these API prices likely represent a break-even point or a small profit margin for the companies, focusing on inference costs.

Now, it's probably fair to assume that these API prices leave these companies at a break-even point or with a small profit regarding their gross margin.
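The per-token arithmetic behind API pricing can be sketched in a few lines. The rates below are the ones quoted in the video; the token counts in the example are made up for illustration:

```python
# Hypothetical per-request cost calculator using the API rates quoted
# in the video. Rates are $ per 1M tokens; actual prices may change.

OPUS_INPUT_PER_M = 5.00    # $ per 1M input tokens (Claude Opus, as quoted)
OPUS_OUTPUT_PER_M = 25.00  # $ per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float = OPUS_INPUT_PER_M,
                 out_rate: float = OPUS_OUTPUT_PER_M) -> float:
    """Dollar cost of a single API request."""
    return (input_tokens / 1_000_000 * in_rate
            + output_tokens / 1_000_000 * out_rate)

# One long agentic coding turn: 200k tokens of context in, 10k tokens out.
print(round(request_cost(200_000, 10_000), 2))  # 1.25
```

A single heavy turn costing over a dollar shows why thousands of such turns per month quickly outgrow a fixed subscription fee.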

AI Cost Factors: Training vs. Inference [04:39]

  • The cost of running AI models is primarily influenced by two factors: training and inference.
  • Training is a significant, but largely one-time, cost for each model.
  • Inference, however, is an ongoing, per-request cost that occurs every time a user interacts with the AI.
  • Companies must at least break even on inference costs to avoid losing money on each request.
  • Long-term viability requires the gross margin from inference costs to cover training costs, staff, and other operational expenses.

Now for inference naturally that's different. This is an ongoing cost.
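The relationship described above can be made concrete with a toy unit-economics sketch. All numbers here are made up purely for illustration, not figures from the video:

```python
# Toy unit economics (entirely made-up numbers): a model's one-time
# training cost must be recovered from the gross margin on inference.

training_cost = 500_000_000        # one-time training cost, $ (assumed)
price_per_request = 0.05           # what the API charges per request (assumed)
inference_cost_per_request = 0.04  # compute cost to serve one request (assumed)

gross_margin = price_per_request - inference_cost_per_request
if gross_margin <= 0:
    print("Losing money on every single request")
else:
    requests_to_recoup = training_cost / gross_margin
    print(f"{requests_to_recoup:,.0f} requests just to pay back training")
```

Even with a positive margin per request, tens of billions of requests are needed before training, staff, and other fixed costs are covered, which is the long-term viability problem the section describes.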

Subscription Value vs. API Costs [07:03]

  • For consumers, especially those using services like Claude Code with high-end models like Opus, the subscription plans offered significant value compared to on-demand API pricing.
  • A $200 subscription plan, for example, could offer millions of tokens, which would cost significantly more if paid for at API rates (e.g., over 8 million output tokens at $25/million).
  • Past usage patterns, especially with prolonged sessions, often exceeded the value provided by subscription plans at their current pricing, making it unsustainable for providers.

Because of course with the Max subscription, for example, for only 200 bucks, you're getting lots of usage out of this plan.
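The break-even point between a flat plan and API rates follows directly from the numbers above. This sketch uses the $200/month plan and the $25 per million output token rate quoted in the video, and ignores input tokens to keep it simple; the daily-usage figure is an assumption:

```python
# Rough value comparison: flat subscription vs on-demand API pricing.
# Plan price and output rate are as quoted in the video; input tokens
# are ignored for simplicity.

plan_price = 200.0   # $/month
api_out_rate = 25.0  # $ per 1M output tokens

breakeven_tokens = plan_price / api_out_rate * 1_000_000
print(f"Break-even at {breakeven_tokens:,.0f} output tokens/month")

# A heavy agentic user burning ~1M output tokens/day (assumed) blows past it:
monthly_api_equivalent = 30 * 1.0 * api_out_rate
print(f"Same usage at API rates: ${monthly_api_equivalent:,.0f}/month")
```

Anyone consuming more than 8 million output tokens a month is, at API rates, being subsidized by the provider.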

Market Share Competition and Financial Runway [08:27]

  • AI companies face a dilemma: increasing prices too early could cede market share to competitors, but operating at a loss is unsustainable.
  • OpenAI's reported $122 billion funding provides only an estimated 18 months of runway, indicating a critical need to generate revenue and reduce subsidies.
  • The need to remain in business forces companies to eventually adjust their pricing and limit unsustainable usage.

So clearly you can't continue subsidizing all that usage forever because if you go out of business then all your customers are going to your competition anyway.
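The runway claim can be sanity-checked with one division. The $122 billion and 18 months are the figures reported in the video; the implied burn rate simply follows from them:

```python
# Back-of-the-envelope runway arithmetic: runway = cash / monthly burn,
# so monthly burn = cash / runway. Figures are as reported in the video.

funding = 122_000_000_000  # $ (reported)
runway_months = 18         # (reported)

implied_monthly_burn = funding / runway_months
print(f"Implied burn: ${implied_monthly_burn / 1e9:.1f}B per month")
```

A burn rate of several billion dollars per month makes clear why heavily subsidized subscriptions cannot continue indefinitely.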

Compute Scarcity and Rising Hardware Costs [09:44]

  • The surge in AI development has led to a compute scarcity crisis, driving up prices for memory, networking gear, and energy needed for data centers.
  • Inference requires substantial memory, contributing to increased memory costs.
  • High demand for networking gear connecting clusters of chips for training and inference also drives up prices.
  • Data centers require significant energy, and the current grid infrastructure is often insufficient, leading to a need for off-grid energy solutions which are complex and costly to implement.
  • This overall scarcity limits the available compute power for both training and inference.

So, memory is expensive because inference needs lots of memory.
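The "inference needs lots of memory" point can be illustrated with a rough KV-cache estimate for a hypothetical transformer. Every parameter below (layer count, head count, head dimension, context length) is an illustrative assumption, not a figure for any specific model:

```python
# Why inference is memory-hungry: rough KV-cache size for one session
# of a hypothetical transformer. All parameters are assumptions.

layers = 80              # assumed
kv_heads = 8             # assumed (grouped-query attention)
head_dim = 128           # assumed
bytes_per_value = 2      # fp16/bf16
context_tokens = 200_000 # one long agentic session (assumed)

# Keys + values, per token, across all layers:
kv_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * context_tokens
print(f"KV cache for one session: {kv_bytes / 2**30:.0f} GiB")
```

Tens of gigabytes of accelerator memory held by a single long-context session, on top of the model weights themselves, is what ties inference demand directly to memory prices.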

Shifting Incentives: Training vs. Inference Importance [11:51]

  • Historically, the primary incentive was to dedicate compute resources to training to develop better models and stay competitive.
  • Currently, there's a heightened incentive and importance placed on inference, as it directly drives customer acquisition, market visibility, and revenue generation.
  • Companies must now balance scarce compute and data center resources between training new models and serving existing inference demands.

But of course, nowadays there is also a bigger incentive and higher importance on the inference part, because it's the inference part that gives you customers and gives you visibility in the market.

Changed User Behavior: Agentic Workflows [12:40]

  • Since the beginning of the year, customer usage behavior has changed significantly, with "agentic workflows" fundamentally altering compute demands.
  • Long-running, parallel sessions now consume far more resources than initial plan structures were designed for.
  • In contrast to a year ago, when AI usage primarily consisted of occasional chat sessions, current workflows involve continuous interaction, generating hundreds of thousands to millions of tokens rapidly.
  • This shift towards intensive agentic workflows is a primary driver for increased inference demand and higher token consumption.

Agentic workflows have fundamentally changed Copilot's compute demands.
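The difference between occasional chat and an agentic loop comes from the loop re-processing its ever-growing context on every tool step. This sketch uses made-up numbers purely to show the shape of the growth:

```python
# Illustrative comparison (made-up numbers): occasional chat vs an
# agentic loop that re-sends its growing context on every tool step.

def chat_session(turns: int = 10, tokens_per_turn: int = 500) -> int:
    """Short Q&A exchanges: token use grows linearly with turns."""
    return turns * tokens_per_turn

def agentic_session(steps: int = 100, context_start: int = 5_000,
                    growth_per_step: int = 2_000) -> int:
    """Each step re-processes the whole accumulated context."""
    total = 0
    context = context_start
    for _ in range(steps):
        total += context          # whole context re-processed this step
        context += growth_per_step
    return total

print(chat_session())     # a few thousand tokens
print(agentic_session())  # 10,400,000 tokens for this run
```

Because total consumption grows roughly quadratically with session length, plans designed around chat-style usage were off by orders of magnitude.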

The Challenge for Anthropic and OpenAI [15:04]

  • Anthropic may be feeling the economic pressure more acutely than OpenAI due to potentially higher model running costs (based on API pricing) and a historical focus on enterprise/business customers.
  • Enterprise customers are more likely to utilize resource-intensive agentic workflows, leading to higher token consumption and a greater strain on subscription economics.
  • OpenAI, historically more consumer-focused, may still have a larger base of "normie" users who engage in less token-intensive interactions.

Especially for Anthropic, for example, I could imagine they are feeling the pain a bit more than OpenAI.

Future Outlook: Stricter Limits and Higher Prices [16:23]

  • The trend indicates increasingly strict usage limits for AI subscriptions.
  • Subscriptions may reach a point where they no longer offer perceived value, leading to higher prices.
  • Professional and agentic usage subscriptions could cost thousands of dollars per month in the future, potentially being compared against employee costs.
  • New subscription tiers for individual users will likely emerge with much stricter limits, suitable for basic chat but not for agentic workflows.

I think it's pretty obvious. We'll see even stricter limits in the future.
