• pyr0ball@reddthat.com · 1 day ago

    The pricing question assumes the current model (cloud inference, centralized compute, hyperscaler margins) is the only model.

    Local inference flips that math entirely. If the model runs on your hardware, the marginal cost to the provider is close to zero. The pricing problem is a distribution problem, not a compute problem.

    What I think actually happens: cloud AI settles at $20-50/month for power users who need the latest frontier models and don’t want to manage hardware. That’s sustainable. The “free tier” disappears or gets severely throttled.

    But for a large chunk of use cases (summarization, classification, drafting, local assistants) models small enough to run on a consumer GPU are already good enough. That market doesn’t need to pay $50/month to Anthropic. It needs a good local runner and a one-time hardware investment.
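
    To make "a good local runner" concrete, here's a minimal sketch of the kind of workflow I mean, assuming Ollama is running locally with a small model already pulled (the model name, prompt, and port are Ollama defaults / example values, not a recommendation):

    ```python
    # Minimal local-inference sketch: summarize text via a local Ollama
    # server (default port 11434). Assumes `ollama pull llama3.2` was run.
    import requests

    def summarize_locally(text: str, model: str = "llama3.2") -> str:
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={
                "model": model,
                "prompt": f"Summarize in two sentences:\n\n{text}",
                "stream": False,  # one JSON object instead of a token stream
            },
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["response"]

    print(summarize_locally("Cloud inference has hyperscaler margins baked in..."))
    ```

    Once that's running, the marginal cost per call is basically electricity. Rough back-of-envelope (my assumed numbers, not hard data): a ~$600 used consumer GPU breaks even against a $50/month subscription in about a year.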

    The companies that will survive the pricing correction are the ones who either have genuinely differentiated frontier capability, or who make local deployment easy enough that users own their own stack.

• HobbitFoot @thelemmy.club · 18 hours ago

      There are also going to be issues with how bleeding-edge AI gets sold. If an AI that can reliably detect security exploits is real, its owner isn't going to sell open access to that model.

      I suspect that, if the AI is really that good at certain tasks, it won't be sold on a per-token model but on something closer to how human work is priced.
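
      To illustrate the gap (toy numbers, purely hypothetical): per-token pricing scales with compute consumed, while pricing it like human work means charging per outcome, the way you'd pay a security consultant per engagement.

      ```python
      # Toy comparison of two pricing models for a hypothetical
      # exploit-finding AI. All numbers are made up for illustration.

      TOKENS_PER_AUDIT = 2_000_000   # assumed tokens consumed per codebase audit
      PRICE_PER_MTOK = 15.00         # $/million tokens, frontier-API ballpark
      PRICE_PER_AUDIT = 5_000.00     # flat, consultant-style engagement fee

      token_billing = TOKENS_PER_AUDIT / 1_000_000 * PRICE_PER_MTOK
      outcome_billing = PRICE_PER_AUDIT

      print(f"per-token billing:   ${token_billing:,.2f} per audit")
      print(f"per-outcome billing: ${outcome_billing:,.2f} per audit")
      # The gap is the point: a scarce capability lets the owner price
      # per outcome (like labor) rather than per unit of compute.
      ```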