AI's Affordability Crisis

(blog.dshr.org)

245 points by ilreb 10 hours ago|330 comments

•

steveBK123 9 hours ago

I think the biggest problem is not necessarily the cost to develop & serve the models, but how quickly user behavior changed with token based pricing.

I know a lot of people at companies where the marching orders changed on a dime end of Q1/start of Q2. These are shops that were fully on the "use AI or die (because we will fire you)" train.

Now there's monitoring, reporting, alerting not just on overall cost but on "over-use" of best/priciest models based on total-or-percent tokens/dollars, etc. All of this comes with direct developer engagement & standardized management escalation for holding it wrong.

To me this customer behavior does not smell like a product you can 10x the pricing on to get profitable. We have exited the exploration phase and now ROI matters.

•

burningChrome 8 hours ago

I can give you some additional anecdotal evidence to support your comment.

I work at a Fortune 200 company. At first, it was the Wild West. Need an LLM? You got it. Need to or want to build an army of agents? Done and done. We literally had everything at our fingertips for about 3 months. Teams were building their own internal tools, and the team I work on canceled contracts with several software vendors because teams were building the same tools for what they thought was nothing.

Then they signed contracts with Anthropic and Google because I would assume they saw the token usage was through the roof. One month later? They completely cut off access to everybody for both Claude and Gemini. If you wanted access? Suddenly it was several forms, along with several approvals and a rock solid business case why you needed it. And before you got to the forms? You were added to a waiting list that was thousands of people long.

The entire company is now in damage control after trying to get the genie back in the bottle. I'm guessing someone saw how much we would be paying for the tokens we'd been using and decided to shut the party down so to speak.

•

luke5441 2 hours ago

Maybe it was your company with the 500 million Claude bill.

•

i_idiot 2 hours ago

Were there any managers fired for incompetence?

•

sdesol 8 hours ago

Was there at least performance gains to be measured?

•

burningChrome 7 hours ago

AFAIK nobody was collecting analytics. The one team I was working on had put out a goal of "30% more efficient" using AI tools. It's about as subjective as you can get. We never got around to what exactly that meant before everything got shut down.

Myself and several other devs were laughing about the whole thing. The company was so amped about what AI could do they never even bothered collecting any analytics that would affirm or deny any of this had a positive impact. Even some of my team members were talking about the placebo effect AI has had on a lot of C-Suite folks.

•

Quarrelsome 2 hours ago

30% is my new favourite stat. It seems to come up a lot, like the old "this work will take 6 months" that I was equally suspicious of back in the day.

•

stingraycharles 2 hours ago

> Even some of my team members were talking about the placebo effect AI has had on a lot of C-Suite folks.

What do you mean placebo effect? They thought things were created with AI while they actually weren’t?

•

steveBK123 16 minutes ago

For a lot of C-suite there has been more of a fast-follow approach to AI. CEO hears 3rd hand what competitors are doing, tells his CTO to "do AI" and "hire a head of AI".

So a lot of motion (do AI) was created without a destination (product outcome). The motion was what was being measured (we are doing AI).

•

gazebo2 2 hours ago

Probably meaning C-Suite thought productivity was up because of AI, either because A) metrics showed high AI usage or more commits/LOC or B) we're mandating AI usage, why wouldn't it increase?

•

qurren 2 hours ago

What I don't understand is why some companies are so stingy on AI token usage.

If you're paying an engineer $X and they're getting 3x the amount of work done you should be happy paying up to $2X in AI tool usage.

In reality many companies start complaining at their employees when they hit $0.1*X or less.
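
A rough sketch of the break-even arithmetic behind that claim. It assumes every extra unit of output is worth as much as the baseline output, which is exactly the assumption that is in dispute; the function name and the sample multipliers are hypothetical, not figures from any company.

```python
def max_tool_spend(engineer_cost: float, output_multiplier: float) -> float:
    """Break-even tooling budget for one engineer.

    Without tools, one unit of work costs `engineer_cost`.
    With tools, `output_multiplier` units cost engineer_cost + tool_spend.
    Setting the per-unit costs equal gives:
        tool_spend = engineer_cost * (output_multiplier - 1)
    """
    if engineer_cost < 0 or output_multiplier <= 0:
        raise ValueError("cost must be >= 0 and multiplier must be > 0")
    return engineer_cost * (output_multiplier - 1.0)


# Treat X as 100 so the result reads as a percentage of X.
# The multipliers below are illustrative assumptions only.
engineer_cost = 100.0
for multiplier in (3.0, 2.0, 1.3, 1.0):
    budget = max_tool_spend(engineer_cost, multiplier)
    print(f"{multiplier:.1f}x output: break-even tool spend is {budget:.0f}% of X")
```

The $2X figure only holds at a true 3x multiplier. If the real multiplier is much smaller, the break-even budget shrinks with it.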

•

thisoneisreal 2 hours ago

I hesitate to say this because I think the AI hype is generally overdone, but I was contemplating the other day how much I would recommend my employer spend on AI tooling for me based on my salary and it's got to be in the tens of thousands of dollars per year if I'm being honest. I'm a contractor. On my most recent client, I was able to learn a completely new domain and business, design a solution and build a full stack POC (a complicated one with technologies I had never used before) nearly singlehandedly in less than 2 months. That includes all of the onboarding time, getting access to the right people, repositories, databases etc. What made it possible was 1) the AI helped me understand technologies I had never used before, from a 30 year old Java stack to graph databases, 2) it could analyze DDLs and make correct inferences about what tables and columns meant in both business and technical terms, 3) it generated mostly-correct code in multiple languages to achieve what I needed it to do, and 4) it referenced libraries I didn't know existed to accomplish tasks I needed done. I could cycle between these activities and iterate FAR more rapidly than I ever have before. To me this is the thing the AI companies should really be emphasizing. I'm not convinced autonomous agents are really working out, but I'm 1000% sold on their ability to empower developers. And I say this as a pretty skeptical late adopter who only knows the most basic AI tooling.

•

steveBK123 2 hours ago

The reason is that most companies are not seeing 2x or 3x the value produced from developers with AI.

•

brandensilva 58 minutes ago

Yeah the cost doesn't justify the value. Spending x3 on tokens is meaningless if they aren't seeing the profit side scale with it. Going faster on product features doesn't mean it translates to more money. In fact I'm not even sure some customers can handle the speed at which a team could move with feature delivery.

So scaling horizontally in different markets seems like an advantage if a product is already mature. Which is exactly what Anthropic and Open AI are doing because they want to put their tentacles in everything.

•

steveBK123 22 minutes ago

> Going faster on product features doesn't mean it translates to more money.

Exactly this. Unless the new features directly drive new users or new revenue from existing users, for many products iterating 2x faster does not mean 2x ROI.

Additionally, from anecdotes here, at work, and in my network.. a lot of the unlocked developer velocity is going to fun/frivolous/extra things. I think part of it is developers have their own features they want for themselves that are the easiest and most direct thing to deploy LLMs against, in the absence of good direction.

Yes it's cool you finally achieved 100% test coverage, or you wrote a new utility that makes your job easier, or cleared the 2 year old ticket that was 100 deep in your backlog, etc. But there were ROI reasons these things were not done previously.

That's already putting aside the fact that development going 3x faster doesn't increase end to end output by 3x because <100% of a SWE's job is development.
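
That last point is essentially Amdahl's law applied to a job. A minimal sketch, assuming the share of time spent on development is known; the fractions below are hypothetical, not measured values.

```python
def end_to_end_speedup(dev_fraction: float, dev_speedup: float) -> float:
    """Amdahl's-law estimate of overall speedup.

    dev_fraction: share of total time spent on development (0..1).
    dev_speedup:  how much faster the development part becomes.
    The rest of the job (meetings, reviews, planning) is assumed unchanged.
    """
    if not 0.0 <= dev_fraction <= 1.0:
        raise ValueError("dev_fraction must be between 0 and 1")
    if dev_speedup <= 0:
        raise ValueError("dev_speedup must be positive")
    return 1.0 / ((1.0 - dev_fraction) + dev_fraction / dev_speedup)


# Hypothetical shares of a SWE's time spent writing code.
for fraction in (0.3, 0.5, 1.0):
    overall = end_to_end_speedup(fraction, 3.0)
    print(f"dev share {fraction:.0%}: 3x faster coding gives {overall:.2f}x overall")
```

Under these assumptions, 3x faster coding only gives 3x overall output when coding is the whole job.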

•

mkozlows 2 hours ago

10% of a developer's cost is something like $4000/month. Many companies are complaining at a point that's well, well below that.

(I think they are being irrational, and that the mental model they have of AI costs -- "how much are we spending on tooling for this developer?" -- is going to shift over time to something more sensible, but those kinds of short-sighted companies are the ones that are having cost panics.)

•

bluefirebrand 38 minutes ago

> 10% of a developer's cost is something like $4000/month

What company pays developers $40000/month and are they hiring?

•

vjvjvjvjghv 52 minutes ago

Yeah. It’s weird. I just booked 2 flights to different meetings with very questionable value in the next few weeks. The whole travel costs around 3000. I would much rather spend this money on tokens but even 300 is too much per month.

•

Gigachad 2 hours ago

Our company went from “AI AI AI” to “GitHub Copilot has been suspended due to exceeding the budget” with this month’s price increase.

•

mkozlows 2 hours ago

If your company was giving you Copilot, they were never that far on board the AI train anyway.

•

Gigachad 10 minutes ago

Github Copilot is a different product than the Windows / Office one. Horrible naming I know but it's basically just a UI and router to pick whatever model you want.

•

dranudin 8 hours ago

I can second this. Our company and department was all-in on AI. And since the token-based pricing came in, we got an email from IT that tried to explain that most developers don't know how to choose models and that the cheap models should be good enough for most of our work ..

•

verdverm 8 hours ago

Have they built an internal ai enablement team?

•

dranudin 8 hours ago

Yes :D

•

piker 9 hours ago

I.e., the demand for programming tokens turns out to be quite elastic.

•

steveBK123 9 hours ago

I would imagine it only gets worse in the face of good-enough open/chinese/local models too right?

Microsoft adding Deepseek support already as I recall?

That is - for any definition of "they are behind X months" then eventually they get to the point Claude was in January when the world freaked out, but at 1/10th the cost. A lot of firms are going to mandate that is good enough for their developers.

•

michaelchisari 2 hours ago

I'm set up to use Qwen 3.6 locally if needed. It's solid, it does what I need, it runs on my laptop and it's free.

But that's because I never got on the "run three dozen agents in a ralph loop" trend or other high-token usage methods. The way I use AI is discrete and targeted and it seems that's how it will be for everyone once the economics settle.

•

sdesol 9 hours ago

> Microsoft adding Deepseek support

I believe this hasn't been confirmed yet but I think it speaks to a bigger problem for the AI companies which is, if you give capable developers a good reasoning LLM, they can make it work like it was a really expensive model.

I believe we are 100% at the stage of good enough for the vast majority of tech companies. Fable and others will be more valuable for non-traditional tech companies.

I read somewhere that the chinese AI companies are sharing knowledge and it would not surprise me if the government is applying pressure by saying work together or else. If they work together, they can truly commoditize LLMs and with China ramping up hardware support for AI, I see the future being inference speed and hardware being the moat.

•

thewebguyd 8 hours ago

If hardware becomes the moat, the US frontier labs are screwed. We have AWS, Azure, GCP. All three have or are making inference silicon. LLMs become just another service in the public cloud's large service catalog, and open weight wins.

Which makes sense to me. Selling a chatbot interface/model access to the general public was never going to be a viable long term play. You still need developers to wrap the models into specialized tools. Cue the Jobs quote "It's a feature, not a product."

•

KolibriFly 6 hours ago

The funniest thing would be if in a couple years LLMs just end up being another checkbox next to PostgreSQL and Kubernetes

•

thewebguyd 4 hours ago

I don't think that's far fetched at all either and is probably the end game ultimately. No one wants to buy a chatbot, they want to automate something with it. Intelligence is just another PaaS offering right next to storage, compute.

The only hiccup in that happening is will the US Gov let Anthropic and/or OpenAI fail when that time comes.

•

sdesol 8 hours ago

The big thing is, the western world has moved so much of the manufacturing to China and I think a lot of people will not forgive Samsung and others, so I can see China owning a good portion of the supply chain.

•

bluGill 26 minutes ago

While China makes a lot and a lot was moved, don't let that fool you. The western world is still making a lot of things. Manufacturing is bigger than ever in the US by any useful measure of manufacturing except the measure of number of people working in that field. Very few people work in a factory anymore because automation has replaced most of them. A factory that had 2000 employees in 1950 should have under 200 today to make the same things. If it can be that automated it moved to China.

•

johnvanommen 8 hours ago

> The big thing is, the western world has moved so much of the manufacturing to China

I built my career on Solaris and it got rugpulled by Linux.

That wasn’t because of software, it was because of hardware. Linux’s cost advantage existed because Sun hardware had huge margins, because their software was basically free.

AI will probably be a repeat of this. Whoever can come up with the hardware solution that minimizes the cost per token will win.

I believe the 5090 still holds this crown, but someone certainly knows better than I do.

•

rescbr 7 hours ago

While people fly to the US to buy Macs at a lower price and bring them back in their backpacks, I guess I'll be flying to HK to buy a Chinese GPU rather sooner than later...

•

trollbridge 6 hours ago

Fortunately, Solaris skills map to Linux pretty cleanly.

•

fragmede 7 hours ago

but not all tokens are equal and vertical integration is the name of the game. Solaris did not lose to Linux, it lost to the LAMP stack on commodity x86 hardware. without the "AMP" part, Linux would've been dead in the water.

•

fluoridation 47 minutes ago

The L part of LAMP is kinda irrelevant at the application level, though. From the point of view of running the software, SAMP or LAMP doesn't make any difference. So yes, Solaris lost the competition with Linux to be the first letter of the acronym.

•

CuriouslyC 9 hours ago

100%. There will be strict quotas on the expensive models and day to day work will be done on the cheap models that are "good enough" with escalation to the metered models when the cheaper options are spinning their wheels. Eventually the US frontier lab APIs will only get the most heavily triaged work that multiple tiers of cheaper Chinese open weight models have failed on.

And of course the C-suite will have unlimited access to Mythos tier models, which they'll use to summarize reports, while passing down mandates to rank and file to increase usage of less expensive models.
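
The tiered escalation described above could look roughly like the sketch below. Everything in it is hypothetical: the tier names, the stub "models", and the convention that a model returns `None` when it gives up. A real setup would call actual model APIs and would need its own definition of failure.

```python
from typing import Callable, Optional, Sequence, Tuple

# A tier is a (name, attempt) pair; attempt returns None when it fails.
Tier = Tuple[str, Callable[[str], Optional[str]]]


def route(task: str, tiers: Sequence[Tier]) -> Tuple[str, str]:
    """Try each tier in order, cheapest first, and stop at the first success."""
    for name, attempt in tiers:
        result = attempt(task)
        if result is not None:
            return name, result
    raise RuntimeError("every tier failed on this task")


# Stub models for illustration only: the cheap one "spins its wheels"
# on anything labeled hard, the frontier one always answers.
def cheap_model(task: str) -> Optional[str]:
    return None if "hard" in task else f"cheap answer to {task!r}"


def frontier_model(task: str) -> Optional[str]:
    return f"frontier answer to {task!r}"


TIERS: Sequence[Tier] = (("cheap", cheap_model), ("frontier", frontier_model))

for task in ("easy refactor", "hard concurrency bug"):
    tier, answer = route(task, TIERS)
    print(f"{task}: handled by the {tier} tier")
```

Ordering the tiers by price means the metered model only sees work the cheaper tiers could not finish.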

•

Gigachad 2 hours ago

Our org straight up turned off our AI access after the GitHub copilot price increase blew right through the budget.

At least on a personal level I feel like I’ve been getting the same amount of work done but I have to think harder rather than sitting back and prompting and waiting.

•

verdverm 8 hours ago

Yup, we are in the process of getting access to US hosted Chinese models. I've been petitioning Google and our rep, we will see but I suspect they will cave eventually. Gemini sucks and if they don't sell what their customers want, we go shopping around.

•

bloppe 3 hours ago

What you want is already available on OpenRouter and a million other services, but sure, you can wait 18 months for it to be on GCP.

•

verdverm 3 hours ago

> We are in the process...

OpenRouter charges an extra 5.5%, Fireworks does not, Google is separate, but I doubt it will take 18 months. They are already aware they are losing business.

OpenRouter is the wrong abstraction for enterprise, we only need one model provider, not everyone in the world. Nor do we want to have to worry about failover going to providers we don't want.

•

jayd16 9 hours ago

If folks won't pay a higher price, doesn't that mean it's inelastic?

•

unholiness 9 hours ago

"Elastic" in economics happens to refer to how elastic the supply/demand is when the price changes (not vice versa, as you're describing). So e.g. an inelastic demand means the quantity demanded changes very little when the price doubles.

•

steveBK123 9 hours ago

Elastic demand means buyers are highly sensitive; a price hike causes a massive drop in purchases. Inelastic demand means buyers aren’t very sensitive; they keep buying regardless of price
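
A small worked example of that definition, using the simple percent-change formula for price elasticity of demand. The prices and quantities are made up for illustration.

```python
def price_elasticity(q_old: float, q_new: float,
                     p_old: float, p_new: float) -> float:
    """Percent change in quantity demanded divided by percent change in price."""
    pct_quantity = (q_new - q_old) / q_old
    pct_price = (p_new - p_old) / p_old
    return pct_quantity / pct_price


def classify(elasticity: float) -> str:
    """Demand is elastic when quantity moves proportionally more than price."""
    magnitude = abs(elasticity)
    if magnitude > 1.0:
        return "elastic"
    if magnitude < 1.0:
        return "inelastic"
    return "unit elastic"


# Hypothetical case 1: price rises 20%, usage is cut in half.
sensitive = price_elasticity(q_old=100, q_new=50, p_old=10, p_new=12)
# Hypothetical case 2: price doubles, usage barely moves.
insensitive = price_elasticity(q_old=100, q_new=90, p_old=10, p_new=20)

print(f"case 1: elasticity {sensitive:.2f} ({classify(sensitive)})")
print(f"case 2: elasticity {insensitive:.2f} ({classify(insensitive)})")
```

Companies slashing token usage after a price change would be the first case, which is why the demand gets called elastic.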

•

jayd16 9 hours ago

Ah alright I have it backwards then.

•

ofjcihen 7 hours ago

I do a lot of client work for fortune 100’s.

Over the last month I have seen companies scrambling to measure deliverables against cost. Most of the back room talk is to the effect of giving devs a small allowance ($500 a month) and then making them prove their own productivity increases (again, based on deliverables, not LoC) before they either take it away or give them more.

Obviously this won’t be on an individual basis but some kind of unit.

Either way, with how much I see these companies cutting back I have no idea how the big AI companies are going to be profitable.

•

energy123 2 hours ago

The AI companies will be profitable if they ship the goods. Make AGI and companies will pay for it.

•

fluoridation 55 minutes ago

Not if it costs more than a human, or if it performs worse (be that in quality or in time). After all, humans have NGI and companies don't just hire anyone with a pulse.

•

matheusmoreira 53 minutes ago

This is great to see. LLMs are great but the industry really needs to experience a correction.

•

whimsicalism 2 hours ago

I think this panic around overuse is largely just managers talking amongst themselves and everyone trying to see what everyone else is doing. It is frankly hard to interpret in any other way

•

bluGill 24 minutes ago

Also talking to accountants who will sound alerts if the bottom line isn't looking good enough.

•

woeirua 7 hours ago

It's not an affordability crisis, it's a financial crisis. The models get cheaper super fast. By this time next year Fable 5 will cost less than Sonnet does today. That's not the problem. The problem is that many companies are going to realize that they don't get any ROI from AI. Generating code faster != more profit. Most of the Fortune 500 will likely realize this and then the token budgets will come crashing down. Most of their ideas are _bad_ ideas. Implementing bad ideas faster, won't lead to more profit.

Sure, you can use AI to potentially replace software engineers, but the F500 are also terrified of not having accountability or making mistakes. They won't be firing any engineers. In that scenario, there's just no room for AI usage. If you have to be responsible for all the code, then... AI has to either manage it completely autonomously (which even Fable can't) or... humans have to be in the loop which means they still have to understand the code. The best way to understand the code is to write the code yourself. So there's no productivity gain to be had.

I'm pro-AI, but I think we're due for a big crash next year.

•

Supermancho 4 hours ago

> The models get cheaper super fast. By this time next year Fable 5 will cost less than Sonnet does today.

I'm not sure that's something to rely on. I would bet Fable 5 will be phased out and the bleeding edge will be priced up.

•

jerf 34 minutes ago

I don't know about everyone else, but if you told me that for the next couple of years I, and everyone else, would have only Opus 4.5, I wouldn't exactly cry about it. Especially if in the meantime it got cheaper and maybe a bit faster.

My desire for the latest and greatest continues on, but my need for it in order to get any value out of it at all is much, much smaller. The in-practice delta between all the versions since 4.5 has been much more subtle than, say, the delta between the models available a year before Opus 4.5 and Opus 4.5.

The bleeding edge is going to have to earn its price delta. They can't count on me wanting to upgrade just to get something halfway decent anymore.

•

stingraycharles 2 hours ago

I’m not entirely sure, it’s my understanding that later this year a lot of compute will go online with the latest hardware that will be significantly cheaper for inference.

The problem is rather, I think, that people always want to use the latest and greatest models. And that training is super expensive.

Potentially we’ll just see fewer new model releases.

•

whimsicalism 2 hours ago

Do they get ROI from additional workers? Honestly it is difficult for me to imagine white collar companies that have positive marginal profit of labor but not positive marginal profit of tokens, especially in the future.

•

zdragnar 2 hours ago

Considering the massive layoffs and cold market post COVID, the answer is "no" for a lot of companies.

Theoretically, there's a lot of room for marginal work where developer time isn't worth the cost for the output but tokens are cheap enough to make it worthwhile. Very little of that work ends up being customer facing though, so it isn't actually a growth opportunity for the company.

•

KolibriFly 6 hours ago

I feel like this is way too binary. I don't have to write every line of code myself to understand the system. I don't write my own compiler, HTTP stack, or database either.

It's more about the level of abstraction. If AI handles 80% of the grunt work and I spend my time on architecture and reviews that's still a win

•

asdff 5 hours ago

This works for you because you were trained in The Old Way.

Consider the people younger than you. Who are literally shutting their brain off so AI can cheat on their essays and exams. They aren't going to be good architects or code reviewers.

•

vjvjvjvjghv 45 minutes ago

That’s a problem. I work quite a bit with AI now but if I didn’t have a ton of experience writing code myself things would go off the rails quickly. I often have to steer the AI in a different direction from the initial plan it comes up with.

•

sevenzero 2 hours ago

>The models get cheaper super fast.

Weird, why didn't my subscriptions decrease in price then? Oh wait..

•

ranyume 2 hours ago

What do you mean? My deepseek subscription got really cheap.

•

simianwords 4 hours ago

It's very interesting how you are contradicting the whole article's axioms and then arriving at the same conclusion that we are in for a crash!

Rational takeaway is to step back and analyse what's really happening here.

- Are we really in for a crash?

- What does it say about the culture and people's mental models that we have two radically opposing viewpoints on AI costs and people still arrive at same conclusion?

•

fluoridation 27 minutes ago

The fact is that a lot of people think AI is overvalued to the point there's no way it can deliver on the hype, so (if they're right) there necessarily must come a crash that brings sentiment down to the same level with objective reality. There's a lot of hype surrounding AI, so there's many different ways to defend the idea that it's overhyped, the same way that if you ask ten different people what is most wrong with the country, you'll get ten different answers.

>- Are we really in for a crash?

The question you should really be asking is, is AI really overvalued, or is it so useful it justifies all the hype that surrounds it? If the former, then yes, a crash is inevitable, because we don't live in the land of make-believe. If a crash never happens then AI was not overvalued, it was valued appropriately.

•

whimsicalism 2 hours ago

It's just schadenfreude/ressentiment. Our societal values are very christian even as we secularize and you can see this reflected everywhere you turn. Same thing with people feeling there ought to be a 'catch' to GLP1s, etc. etc.

•

simianwords 2 hours ago

I don't get you because I'm too stupid to parse your thoughts

•

whimsicalism 55 minutes ago

thanks for the productive reply

•

simianwords 50 minutes ago

I thought you would explain what you meant but I asked Claude and I kinda understood. Interesting analysis.

•

827a 9 hours ago

> Zitron's numbers don't tell us the real cost of generating tokens but, subject to the assumption that the platforms are not subsidizing the token price, that means Anthropic is subsidizing their enterprise customers by up to 40 times, and OpenAI up to 70 times

Neither Anthropic nor OpenAI are subsidizing enterprise customers. Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan. Both organizations have moved to a "cheaper plan per user + API Pricing after that" (e.g. $20/mo + usage). The $100/$200/mo plans are for individuals only (of course, many individuals use these plans at work, but that's beside the point; they aren't selling this plan to enterprises).

> SemiAnalysis also analyzed the platform's gross margins, implausibly assuming that tokens were priced at 4 times the cost of generating them and: With the current subsidies, all it takes for a user to have a gross margin of at best negative 25% is for them to use as little as 25% of their rate limit.

The article's source for this claim is not SemiAnalysis; it's Zitron. But once you dig through his article, Zitron links to a SemiAnalysis tweet [1] where they, as the paragraph states, implausibly assume gross margins of 75% to come up with their weird analysis of the subscription plans. Citing this for anything is weird, because afaik that 75% number is a total shot in the dark. We have no clue what their margins are. My take is that the only reason that 75% number is implausible is because it may underestimate the inference margins of Ant/OAI's API pricing.

[1] https://x.com/SemiAnalysis_/status/2064815045767213400?ref=w...
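
For anyone wondering how "priced at 4 times the cost" and "gross margins of 75%" are the same assumption, here is the conversion as a minimal sketch. The helper name is hypothetical, and the 4x input is the assumed figure under discussion, not a known margin.

```python
def gross_margin_from_price_multiple(price_multiple: float) -> float:
    """Gross margin when price = price_multiple * cost.

    margin = (price - cost) / price = 1 - 1 / price_multiple
    """
    if price_multiple <= 0:
        raise ValueError("price_multiple must be positive")
    return 1.0 - 1.0 / price_multiple


# 4x is the assumed multiple from the quoted analysis; 2x is shown for contrast.
for multiple in (4.0, 2.0):
    margin = gross_margin_from_price_multiple(multiple)
    print(f"price = {multiple:.0f}x cost gives a gross margin of {margin:.0%}")
```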

•

bayarearefugee 9 hours ago

> it may underestimate the inference margins of Ant/OAI's API pricing.

If true then why are neither Anthropic or OpenAI dropping their API pricing to gain market share when both are clearly doing all sorts of political and PR maneuvering to compete in a cutthroat market?

Since they aren't dropping the API usage prices (and are in fact raising them in a lot of subtle ways) then one of these options almost has to be true: they are still subsidizing inference, training costs are so ridiculously high that they need to make huge profits off inference or collapse in on themselves, or they are price fixing.

•

CuriouslyC 8 hours ago

The training costs are very likely the reason. Dario has talked about how each individual model is profitable, but how the expenditure training the next generation of models makes it look like they're not profitable at any given moment in time, and I believe he's being honest about that.

The market for open weight model hosting gives you an idea of the profitable price floor, it's pretty clear there's markup baked into OAI/Anthropic's APIs.

•

simonw 3 hours ago

> If true then why are neither Anthropic or OpenAI dropping their API pricing to gain market share

Maybe because they're trying to IPO this year, and their IPO prospects will be a lot worse if their S-1s show them to be losing money on inference as opposed to making a healthy profit.

•

827a 5 hours ago

Company-wide their margins are trash (probably negative). They need as much inference margin as they can get to afford the massive training runs. It is likely that we'll see GPT-5.6 reduce API pricing to compete against Anthropic, but whether Anthropic feels they need to reduce their prices is anyone's guess.

•

orangecat 8 hours ago

If true then why are neither Anthropic or OpenAI dropping their API pricing

They are? In the before times of 2025, Opus 4.1 was $75 per million tokens. Opus 4.8 is $25, and Fable is/was $50.

•

wqaatwt 4 hours ago

Why would they? If they see the market as a duopoly for now and don’t consider open Chinese models a fully credible threat that might start eating into their share then they have the incentive to charge as much as the market can bear instead of under cutting each other in a pointless price war.

•

matheusmoreira 42 minutes ago

Nobody cares about their training costs. Their collapse is the optimal outcome for humanity. Oligarchs blowing trillions on a godlike AI, only for the model to leak so that everyone with the hardware can use it, is literally the best case scenario.

•

minraws 8 hours ago

Given my experience with hosting these models at scale, working and optimizing load, I don't think the margins are nearly as high as 75% if the models are as big as people often claim.

Only reason deepseek is so cheap is because well I don't know, but actual pricing should be around their initial price which was 4x, at that price you have a healthy 25-50% margin based on occupancy, given the deepseek v4 is a very sparse moe model.

GLM 5.2 for example doesn't have more than 30-50% margins that's assuming old pricing for GPUs, current inflated GPU pricing well I am certain the margins must be lower. Ofc you can host for cheaper with quantization, and if you have very consistent capacity/utilization, which is not the norm with AI workloads.

Overall for large models like GPT 5.5 or Opus there must be healthier margins of around 50-70% assuming GPU pricing didn't increase for these companies. Even if it did 30-40% margin should be possible, even in worst case assuming all GPU they had saw a jump in pricing.

For smaller models it's hard to say, I would guess 20% but these models might be much smaller than I suspect, then it might be double that.

Note the issue is less intelligent tokens don't linearly scale down in memory usage, which is the biggest pain point of serving models. Context sizes have fucked us all.

Also anyone claiming OAI makes less margins on APIs or stuff might be wrong given they are on much lower context size, 1M context definitely is a lot more expensive to serve especially with smaller models like sonnet.

•

andrekandre 7 hours ago

  > Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan.

they may not "allow" it, but i've seen first hand enterprises encourage employees to use these accounts personally and get reimbursed later to avoid pay-as-you-go w/limits pricing for users who do tokenmaxing as a cost control measure...

•

827a 2 hours ago

Yeah I just mean that if a business came to them and asked for fifty licenses to the $200/mo plan, OpenAI would tell them to kick dirt and basically pay API pricing. Startups should 100% just be telling their employees they can expense up-to $whatever/mo in AI-related expenses, and let software engineers go buy personal Codex/Claude subscriptions.

•

bluGill 18 minutes ago

Many large companies won't allow employees to expense such things personally. It is too easy to allow other fraud if that is allowed.

Though large companies will demand limits if API or whatever they can get is too expensive.

•

Analemma_ 2 hours ago

Do these companies know they're letting their source code (and internal documentation, chats, communications, etc.) get used as training data? On the flat-rate consumer subscriptions, everything you send is fair game to train on, which is not true of the enterprise plans.

Shows how much they value their IP, I guess.

•

andrekandre 31 minutes ago

it's completely ironic but outside of claude ignore/harness rules, i've seen this stuff go out the window where ai is concerned or just brush it under the rug

•

surgical_fire 5 hours ago

> Neither Anthropic nor OpenAI are subsidizing enterprise customers

> Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan. Both organizations have moved to a "cheaper plan per user + API Pricing after that" (e.g. $20/mo + usage).

I actually think that even the API pricing of OpenAI and Anthropic are still subsidized. I don't think they make any profit on inference when you factor in depreciation. They likely still operate that at a loss.

It's no coincidence that Anthropic only had a "profitable" EBITDA with not paying Elon for compute for a bit of time, and when EBITDA curiously ignores depreciation. Models grow stale over time, as knowledge is not static.

•

HDThoreaun 5 hours ago

I don’t see any reason to believe this. If you compare their api prices to open router you see they charge 10x as much. Sure their models are probably bigger, but they have economy of scale on their side, and I doubt their models are 10x bigger.

•

surgical_fire 5 hours ago

It's impossible to tell at this point. I have no idea how much of their compute is also subsidized through deals they have with the hyperscalers, etc.

It's irrelevant how big their models may or may not be. Depreciation needs to be taken into account, so does actual compute expenses. Training those models is not cheap, and you will never reach a point where a model is "final". You will always need to train the next one.

Eventually the bill has to be paid. Money and resources are finite still.

•

HDThoreaun 4 hours ago

Well the third party operators on open router are assuredly operating at a profit, including depreciation. The only reason they’d be profitable at 1/10 the price of the labs while the labs aren’t profitable is if inference costs the labs 10x as much per token

•

surgical_fire 2 hours ago

> Well the third party operators on open router are assuredly operating at a profit, including depreciation

They don't train new models.

They have to depreciate their GPUs, which I hope they do.

•

HDThoreaun 2 hours ago

I agree inference probably isn’t profitable yet if you include training costs. My claim is that marginal revenue from inference is higher than marginal cost. If that’s the case then if they scale enough the training costs will be amortized.

I realize I said assuredly when I meant assumedly. My mistake. I agree it’s possible that the third party open model hosters aren’t actually profitable, my claim does rest on the opposite.
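
The amortization claim above can be sketched in a few lines. This is only an illustration: the function name and every number in it are made up for the example, and none of them come from the labs' actual financials.

```python
def breakeven_mtok(training_cost, price_per_mtok, marginal_cost_per_mtok):
    """Millions of tokens that must be sold before a training run is paid off.

    All inputs are hypothetical placeholders, not real lab figures.
    """
    margin = price_per_mtok - marginal_cost_per_mtok
    if margin <= 0:
        # Selling at or below marginal cost never amortizes anything.
        return float("inf")
    return training_cost / margin


# Illustrative only: a $1B training run, $10 per million tokens, $4 to serve them.
print(breakeven_mtok(1_000_000_000, 10.0, 4.0))
```

If the margin is positive, the break-even volume is finite and scale eventually covers training; if it is zero or negative, no amount of scale helps, which is the other side of this argument.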

•

jwrallie 11 minutes ago

Not a heavy user but I got a feeling for this early since I have a Copilot subscription I get for free as an educator.

I used up in a day or two the limit that was supposed to last me a month. Downgrading from Sonnet to Gemini Flash was the only way to make the limit last longer, and who knows when cheaper models will be discontinued for something more expensive.

I don’t know if the prices will remain low, but at least with the Chinese models being open, the vendors have no control over when they are discontinued. I think learning to work with open models is a good direction, even if you are not running them on your own hardware.

•

androiddrew 2 hours ago

The over investment by VC means that yeah, they are offering all of this below market rate. It's like Enron where they have to keep the scheme going, and dumping on retail investors is the only thing they can do now.

So we are going to go through a big IPO period. Everything will fall apart because VCs already extracted the growth value, and that will show up after the bag has been passed. Things will implode. What survives afterwards is what we will have.

•

whimsicalism 2 hours ago

When we say below market rate, what do we mean? The token economics are definitely such that they are charging more than it costs to serve these models with reasonable assumptions on param/activated size.

•

segmondy 27 minutes ago

The real crisis is that we are now in a phase of ("AGI" or Bust). That's it. Unless we get a self improving autonomous AI heading towards "singularity" then it's all going to blow up. The amount of money invested is ridiculous and would need a long time to recoup. Without AGI, plenty of investors are going to start exiting their positions and the entire thing will be drained.

•

tacone 9 hours ago

My take is that Anthropic and OpenAI simply are NOT competing on price. 2 big players are often not enough to create tension on price.

Chinese models and open model providers are, indeed, competing on price, and the difference shows.

•

gizmo686 8 hours ago

1 player is enough to create tension on price when "don't buy it at all" is a competitive option. By most accounts, Anthropic and OpenAI both lose to "just don't buy" when they try charging at cost.

•

bluGill 16 minutes ago

Nobody who is talking has seen the financials. We have various rumors of costs but no real reason to believe any of them.

•

rhinoceraptor 9 hours ago

How are Anthropic and OpenAI going to compete on price when they're both already deeply unprofitable?

•

solidasparagus 8 hours ago

Serving the API is profitable. They are unprofitable because of R&D (and maybe subscription costs?). If they can continue to find access to R&D capital, there is space to reduce API costs.

•

dns_snek 8 hours ago

Nuclear energy is really cheap too... as long as you ignore CapEx, would you like to invest?

•

HDThoreaun 5 hours ago

Marginal cost of nuclear is huge. Marginal cost of inference is much smaller. Capex in nuclear isn’t a fixed cost, it is the marginal cost.

•

dominotw 8 hours ago

how do you have access to their financials? are you an insider?

Edit: to the commenter below. It was widely reported that these companies were unprofitable [1] as of last year. I am asking this question about this specific comment because it made a very specific claim about which part of the business is profitable, something only an insider would know.

[1] https://www.wsj.com/tech/ai/openai-anthropic-profitability-e...

•

mh- 8 hours ago

I'm curious why you didn't pose this question to the grandparent commenter, who first asserted the opposite?

•

zyuiop 6 hours ago

The amount of capital they need to raise, despite the claimed revenue, indicates that they spend more than they gain, which is by definition unprofitable.

•

brainwad 8 hours ago

Anthropic just announced it's on track to have its first profitable quarter: https://www.wsj.com/tech/ai/mind-blowing-growth-is-about-to-...

•

dns_snek 8 hours ago

Response: https://www.wheresyoured.at/anthropics-profitability-swindle...

•

SpicyLemonZest 9 hours ago

They may not be able to! It's pretty widely acknowledged, for example, that if there's some surprising plateau hiding around the corner they're both going to fail. But that could mean that they're overcharging for AI usage to get research money and sustainable rates are lower rather than higher.

•

guax 8 hours ago

I think that for coding we're past the plateau issue. The frontier models of today are good enough and very valuable. The expensiveness in running them will eventually be solved by cheaper faster hardware.

I do hope that a day will come where you can buy the nvidia spark thingy for 5k that can run the equivalent of Opus 4.6 or 4.5 locally and that would be a massive thing.

•

johnvanommen 7 hours ago

> The expensiveness in running them will eventually be solved by cheaper faster hardware.

How?

* Moore's Law is almost over. The 5090 improves over the 4090 mostly because of quant improvements.

* Even if the hardware improves, there’s a huge incentive to slow-roll the next generation. Nobody wants to end up like Sun Microsystems. Sun’s used hardware was faster than its new hardware, once you considered price. Sun ended up competing with its own used equipment.

The most obvious place for improvement is RAM, network and storage.

If someone can bring more RAM onto the market, that will unstick things.

•

Legend2440 7 hours ago

GPUs are not really the ideal architecture for running neural networks; they are heavily bottlenecked by memory bandwidth and struggle to keep all their tensor cores supplied with data.

There is significant room to make more specialized neural network accelerators with new compute-in-memory architectures.

If the brain can run 86 billion neurons on 30W it must be possible.

•

akomtu 3 hours ago

Our brains run 86 billion neurons the same way a waterfall runs a fluid simulation with N quadrillion particles.

•

VorpalWay 2 hours ago

There are already some companies doing specialised inference hardware, Cerebras Systems for example. Such designs are still early days and I wouldn't be surprised to see more innovation there. Though because custom silicon design takes time I expect a multi-year cycle.

For training, not sure. But even if training runs on GPUs, once you have the model the main cost is inference.

•

CuriouslyC 8 hours ago

The whole hidden plateau hypothesis is kinda bunk, because we're already pretty far in a plateau for general knowledge/question answering, but there are many subdomains where we can push model capabilities, and as we saturate one subdomain we can just shift to another economically valuable one.

There isn't one AI intelligence S curve, there are thousands of them, and they're mostly invisible in the major benchmarks, but for someone trying to do work in that specific area of capability, the progress is transformative.

•

SpicyLemonZest 8 hours ago

I'm skeptical of a hidden plateau, but I really think it's overconfident to assume there's not one. Remember that it doesn't even have to be a technical plateau; the effective plateau of e.g. car speeds is determined by regulations and road conditions, and far below what "frontier cars" are capable of on a controlled racetrack.

•

wonnage 7 hours ago

That’s the scenario where we’ll all be using Chinese models

•

intrasight 8 hours ago

There is no moat until a company achieves RSI and/or AGI, and the one that does succeed in moat-making will do so by hacking into and destroying their competitor's infrastructure.

Once moat is achieved, you don't have to compete on price. Of course it'll be academic because the AI will probably destroy all of us.

•

lenkite 8 hours ago

Chinese models are dropping in price thanks to ridiculous levels of state subsidy where companies are forced into aggressive price wars to survive and grab market share. I am guessing this will also blow up sometime next year or in 2029 at the maximum.

Btw, some Chinese corporates have already seen this and increased their price. Zhipu AI & Tencent for example. Alibaba, Baidu, and Tencent also announced multiple price increases for their AI services.

•

SwellJoe 8 hours ago

China has the benefit of vast solar power and rapidly increasing battery capacity. Yes, that's subsidized, but it pays for itself in the long run.

And, even with the price increases, Z.ai and Tencent are still much cheaper than Anthropic or OpenAI models. I think there's an efficiency focus among the Chinese models that is absent at OpenAI and Anthropic, and in the end I suspect efficiency will be the winning feature. Google seems to understand that. Gemini 3.5 Flash is pretty competitive with the big guys, and it's small enough for Google to run it profitably (I assume) for a price that's much less than the frontier models. Gemma 4 models are showing off a bunch of efficiency techniques (MTP, QAT, the 12B encoder-less vision model that soundly outperforms much larger vision models, DiffusionGemma), and I assume they have several more techniques that aren't published.

•

LPisGood 8 hours ago

This is in contrast to American models which receive _ridiculous_ levels of private subsidy.

•

wqaatwt 8 hours ago

Chinese companies like Deepseek are operating on shoestring budgets (allegedly fewer than 300 employees at Chinese wages). It’s not self-evident that anything needs to be subsidized besides compute (due to limited manufacturing capacity and access to Western chips in China).

•

fny 9 hours ago

The unit economics might be just fine. We'll know more after IPO.

The drug dealer analogy has a darker side to it, however.

Once you're dependent, they can drive up the price just because. It doesn't need to be for existential reasons.

•

onion2k 9 hours ago

> Once you're dependent, they can drive up the price just because. It doesn't need to be for existential reasons.

This is the crisis point for vibe-coders. A developer can go back to writing code by hand, as horrible as that might sound. Someone who hasn't learned to code but builds with AI can't go back. They either pay or they stop. That will be a painful choice whichever way you fall.

•

jcfrei 9 hours ago

There are already open weight models out there that are capable and cheap enough for a lot of coding tasks. Not as good as Claude but not far from it. There's no going back to pre-AI coding.

•

SpicyLemonZest 8 hours ago

I can't speak for everyone, but for most of my coding tasks, Claude is just barely good enough. There's no going all the way back, and perhaps open weight models will keep improving, but at least 50% of my work would be better done by hand than by a worse-than-Claude agent.

•

SwellJoe 8 hours ago

I consider Opus 4.5 the crossover point where coding with agents got more efficient than not coding with agents. They were too stupid before that, and wasted more time than they saved for anything beyond a basic CRUD app or HTML page.

Certainly, the best models have gotten better since then, but I wouldn't consider DeepSeek V4 Pro or GLM 5.2 to be a big enough downgrade to be worse than coding by hand. I'm willing to spend a premium for the best model for coding because it wastes less of my time with dumb stuff, so I've got a Claude subscription. But, there is a limit to how much of a premium I'll pay. 10x over Chinese models? OK, fine. Opus saves me enough time to make it worth a couple hundred bucks a month. But, 100x, or more? Nah. I'll go a little slower, review the PRs a little more carefully.

And, open weights models do keep improving. DeepSeek V4 Pro is a notable improvement over earlier DeepSeek models, and the first DeepSeek model to cross the "better to work with it than without it" threshold into Opus 4.5 (or better) territory. GLM 5.2 is somewhere in the ballpark of Opus 4.6 (though without vision, a notable limitation for anything that requires a UI).

•

energy123 2 hours ago

Using one agent at a time still won't be expensive though. Price hike will kill the agent swarm stuff.

•

jcgrillo 9 hours ago

There's a secret third option: learn. At one point, all of us were "nontechnical", but we learned. The trick is to never stop.

•

akazantsev 8 hours ago

Is it? Learning is one thing. But owning a large codebase that you are seeing for the first time is a completely different level.

•

VorpalWay 2 hours ago

When I started working on a large legacy C++ code base in the early 2010s, I learnt. It took probably close to a year until I was proficient enough that I didn't regularly need to ask where to find things. This is a skill we used to have (and some of us still have).

Though if your code base is all a vibe coded mess and you don't have a senior human colleague to ask... Good luck?

•

jcgrillo 7 hours ago

Yeah giving up is totally a viable choice, but it isn't the only option.

•

dofm 9 hours ago

All of the silent, hidden model routing OpenAI does strongly suggests that the unit economics are not just fine, at least not yet.

If apparently the only way you can make money with your product this early is to dilute and adulterate it behind the scenes, it strongly suggests you want the customer to continue to believe they are getting value that you can't afford to supply.

More prosaically: if either of these firms could prove that they were even really close to profitable on inference, they would have bloomin' said so while they were trying to raise more money.

•

JimsonYang 9 hours ago

The dependency idea is questionable: when your boss tells you not to use the most expensive models, you just don't.

I would assume when price hikes happen either 1) fewer non-technical people would vibecode, as it doesn't impact their work that much, 2) people use the cheaper Chinese models, or 3) since we're jamming AI into everything because we're exploring, we will just niche down into use cases that provide high ROI.

•

okr 9 hours ago

AI is a worker for me, one that I pay for. Basically I am now in the same game of reducing the prices I have to pay for my workers, just like the employers who seek to reduce costs for employees because we are simply too expensive. We need more competition among the workers. Let's introduce more Chinese workforce! ;)

•

nemomarx 8 hours ago

If you had a choice of maybe 3-4 contracting firms to hire workers from and you weren't large enough to negotiate on price, I think you'd be in a pretty bad spot as a business?

•

okr 7 hours ago

I would say so, yep. I just find it funny that suddenly I am in the position of looking for the cheapest option for my lovely AI workers, while usually I am the one who complains about being underpaid. I am in the same shoes as my employers now.

•

chrismarlow9 9 hours ago

I'm finding it challenging to believe they wouldn't just cannibalize anything dependent on them in that way or at minimum launch a directly competing product.

•

airstrike 9 hours ago

It's a really different market, though. New entrants can easily undercut them if they price too high

•

chermi 8 hours ago

Lol I feel like no one has any attention span here. Tech shit is expensive in the beginning when it's new. It gets cheaper with time. This is a tech forum, don't we know this? Of course people overreact in both directions on both sides of the issue. It's a very fast technology, wait for things to settle before making grand declarations.

•

dualvariable 8 hours ago

Yeah, but in the short-term there's $600B/yr of debt-financed depreciating capital investments waiting to financially blow up.

If you zoom out to the year 2100, it becomes a little pimple on the economy that is ready to pop, but in the here and now it can cause a lot of damage to real people's wages and finances over the next 3 years.

•

akazantsev 8 hours ago

> Lol I feel like no one has any attention span here. Tech shit is expensive in the beginning when it's new. It gets cheaper with time.

The funniest comment here. Have you seen the prices of the technical shit for the past two years? Dang, GPUs are not getting any cheaper, but more expensive with each year.

•

dwaltrip 2 hours ago

It’s a massive supply crunch. More production will come online.

•

VorpalWay 2 hours ago

Probably not, at least for DRAM. The demand has historically been very variable, and building production capacity takes multiple years. Also, spare capacity is really expensive. Thus the memory manufacturers don't want to expand, betting on it being yet another temporary bubble.

Also, DRAM fabs are not really usable to make compute (CPU, GPU, etc in this context) silicon. The production lines and tech have diverged some decades ago. So unlike TSMC which can relatively easily retool for another customer, no such luck for the big DRAM manufacturers.

•

mannanj 2 hours ago

How did Apple figure this out? Isn't the solution to this to sort of evolve to a unified memory architecture, which wins in speed and cost anyway?

•

VorpalWay 53 minutes ago

I haven't looked at Apple specifically, but generally the approach taken with HBM on GPUs etc is to make multiple chiplets and connect them together very close and with very high bandwidth. The chiplets are made with different processes, optimised for each specific use case.

There are a few different variations, such as having a carrier chip below (as an interconnect), having a small PCB that everything is mounted to, or even vertically stacking chips (AMD's 3D v-cache does this on some of their CPU models).

•

SwellJoe 8 hours ago

That's an artificially inflated market. OpenAI and xAI bought everything for like two years into the future, partly to inflate the AI bubble, partly to lock-in a monopoly on the kinds of compute you need for AI, and partly to scale up actual operations. They can't realistically keep buying all the RAM in the world forever, the money has to run out eventually (though the market can remain irrational for quite a long time and can keep giving OpenAI and Apartheid Clyde money well past the point of reason).

•

anthonypasq 7 hours ago

brother it's been 1 year since claude code released. how fast are you expecting these things to happen? the physical world and hardware are still constraints. someone has to dig shit out of the ground to build these things.

•

nemomarx 8 hours ago

Lots of stuff in the zirp era was cheap when it was new and increased in price over time though. Look at grubhub fees or etc.

•

knuckleheads 9 hours ago

Shouldn't we know a better answer to these questions once Anthropic's IPO materials surface publicly? I understand, and maybe even expect, SpaceX's materials to be all over the place and skate on by any discussion of unit economics, but the nerds over at Anthropic might just be forthright enough to just tell us what their margin is on tokens as part of their IPO.

•

rich_sasha 8 hours ago

To be honest, making sense of the finances of fully public companies is often hard, because in practice, accounting is hard. How you account for depreciation, cost, investment, and fixed vs marginal costs is in practice fluid. Companies have an incentive to make it look attractive, while also optimising for tax and shifting revenue around to narrowly beat analyst estimates.

Here's a concrete example. Does some random AI company make operating profit on inference? I.e. if you only kept marginal costs, would you make a profit?

Well, depends what you account as your costs. If you're using hand-me-down hardware from previous generation's training, how much do you charge yourself internally for it? Maybe you show less, so investors take solace in profitable inference, even if you're losing money overall. How exactly are you accounting for electricity costs between training and inference? Is your army of SREs mostly servicing training new models (R&D expenditure) or inference (operating cost)?

This kind of thing even has a name: it is close to what's called the "big bath" approach. If investors expect one part of your business to be a fiscal black hole, just shove all your costs there. They are accepting of it, and you make the rest of the business look better.

I'm not accusing AI companies of cooking the books, rather I'm trying to highlight you could see all the cash flows and still not know how much money is made or lost where.

•

verdverm 8 hours ago

I saw some commentary that their free cash flow is misleading because it doesn't subtract the stock compensation they are paying to attract / keep top AI talent. Their point was also that deciphering financial statements is hard

•

brainwad 8 hours ago

Why would it? Stock compensation doesn't affect cash flow, it just dilutes the shareholders.

•

verdverm 7 hours ago

Except that's the thing, they do stock buybacks so they do not dilute existing shareholders or lower stock prices.

This is the video I watched that explained the shenanigans (from the guests' perspective, not illegal, obfuscated)

https://www.youtube.com/watch?v=YrJzjC4kKCY

•

wqaatwt 2 hours ago

Yeah but that’s very standard and pretty much all pre-profit tech companies do something like that when/if they can

•

verdverm 2 hours ago

sure, the guests say as much, their point is that it is hard to determine their real free cash flow. They estimate Meta is 80% lower and Alphabet is 2/3 lower. They give more details and quarterly perspectives

•

steveBK123 9 hours ago

Well it probably doesn't help that Dario is going around on podcasts saying things like "frontier labs need $1T of revenue or they will go bankrupt" lol.

•

jimbokun 9 hours ago

Dario’s company may be creating super intelligence that will kill us all in the near future, but at least he seems to be brutally honest about all of it.

•

manapause 9 hours ago

The irony in AI triggering societal collapse due to gross economic malfeasance is just fun to think about.

If AI was around in the early 2000s Countrywide.ai would have been a thing.

•

wongarsu 9 hours ago

Which is just a flashy way to say "we have low margins and lots of overhead".

Considering how much they spend on sales, marketing and R&D that doesn't sound that absurd

•

steveBK123 8 hours ago

My point is that $1T of revenue is A LOT. Apple & Google each only did $400B revenue in 2025. Facebook did $200B. Think of how many decades it took the 3 to get there.

So depending on how literally we interpret Dario's comment, OpenAI & Anthropic need to get to Apple+Google+Meta revenue numbers in like single digit years?
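
A rough sketch of that comparison, using the rounded revenue figures quoted above. The starting revenue and growth rate passed to `years_to_target` are purely hypothetical inputs chosen for illustration, not reported numbers.

```python
import math

# Billions of USD, rounded as quoted in the comment above.
revenue_2025_busd = {"Apple": 400, "Google": 400, "Meta": 200}
target_busd = 1_000  # the "$1T of revenue" remark

combined_busd = sum(revenue_2025_busd.values())


def years_to_target(start_busd, goal_busd, annual_growth):
    """Years of compounding needed to grow start_busd to goal_busd.

    annual_growth is a fraction, e.g. 1.0 means revenue doubles every year.
    """
    return math.log(goal_busd / start_busd) / math.log(1 + annual_growth)


print(combined_busd)
# Hypothetical: starting from $50B and doubling every single year.
print(round(years_to_target(50, target_busd, 1.0), 1))
```

The three incumbents together land right at the $1T mark, so the remark amounts to matching their combined 2025 revenue.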

•

qnleigh 7 hours ago

The estimate that AI companies need to replace 27% of jobs to service their debt is interesting. But at least Anthropic and Meta seem to have their eyes on replacing software engineers.

There are ~1.6M software engineers in the US [0], earning a bit under 150k/year on average [1]. If AI companies captured all of that spend, that amounts to about 250B/year. The article assumed that they need around 300B/year to keep up with their debt.

At least based on Meta's recent behavior, forcing 30-50% of developers to switch to data labeling, it looks like that is actually their game plan.

[0] https://en.wikipedia.org/wiki/Software_engineering_demograph...

[1] https://www.indeed.com/career/software-engineer/salaries
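
For anyone who wants to check the back-of-envelope numbers, here is the same arithmetic spelled out. The inputs are the rounded figures from this comment (150k is the upper bound of "a bit under 150k"), and the variable names are just for illustration.

```python
engineers = 1_600_000          # ~1.6M US software engineers, per [0]
avg_salary_usd = 150_000       # "a bit under 150k/year", rounded up, per [1]
needed_usd = 300_000_000_000   # ~300B/year figure attributed to the article

total_spend_usd = engineers * avg_salary_usd
share_of_needed = total_spend_usd / needed_usd

print(f"total salary spend: ${total_spend_usd / 1e9:.0f}B")
print(f"share of the $300B target: {share_of_needed:.0%}")
```

With these inputs the total is $240B, about 80% of the $300B figure, so even capturing every dollar of US software engineering salaries would fall a bit short.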

•

whimsicalism 2 hours ago

obviating software engineers is effectively AGI-complete and entails obviating most labor in existence.

•

gizzlon 8 hours ago

> Sales and Marketing: $5.73 billion .. That is, OpenAI spent 44% of their revenue on sales and marketing!

Anyone know what they are spending this on? Can't remember seeing one OpenAI ad.. Is it just PR and influencers? Ads in the US?

•

9cb14c1ec0 35 minutes ago

If it actually was spent on ads, it seems to me OpenAI would have to be one of the single biggest ad buyers in the world. Almost certainly they are using a somewhat broad definition of sales and marketing to cram a bunch of expenses into it to make some other category look better.

•

shaewest 2 hours ago

I've easily gotten (low) hundreds of OpenAI youtube ads. More recently they've been pushing 'Free OpenAI Image generation' to me, in the past they pushed Codex more, but I have a sub for that now, so I guess it works.

•

zyuiop 6 hours ago

Likely free tokens to attract customers

•

arikrahman 35 minutes ago

Deepseek is practically free if you hit cache. Harnesses like reasonix help to alleviate the affordability concern with opinionated oversight.

•

jwrallie 27 minutes ago

That is an area the article did not touch: how realistic are token prices with Deepseek and the likes of it? Are they being subsidized?

•

a34729t 8 hours ago

Deepseek is 90% cheaper, and nearly as good for coding tasks as claude/codex, and as good given the right plan.

The only moat OpenAI and Anthropic have is regulation. If the Chinese really want to hammer us, they could release the full training data and pipeline.

•

thewebguyd 8 hours ago

Even without doing that the Chinese are already going to impact our labs' presence everywhere else in the world. With Fable getting pulled, any model coming out of the US is now unreliable and untrusted. No one in any other country would in their right mind choose OpenAI or Anthropic for anything.

The big push for regulation and export controls is only going to ensure OpenAI & Anthropic are more like the automakers: only in business because of protectionism, left to screw over US consumers while the rest of the world gets to enjoy cheap EVs.

•

a34729t 7 hours ago

I have to push back on this: China's cheap EVs and power prices are due to industrial policy on an epic scale which goes directly against the whole free market thing. I personally think industrial policy is a good thing, but you cannot have it both ways and not expect workers to get unhappy and vote against your interests when they have no more jobs.

•

xboxnolifes 2 hours ago

Even within the definition of free market there exist degrees of freedom. On one side, the one I'd expect most Americans to be familiar with, is laissez faire capitalism. I'm not sure how far it goes in the other direction before it stops being considered a free market, but if the existence of government incentives in the market stops it from being a free market, the US is also not a free market.

That is to say, I believe free markets can exist along side government policy.

•

thewebguyd 6 hours ago

True, and it's what they are going to do with LLMs as well. We know their playbook by now as they've repeated it over and over again across different industries.

But we can still protect domestic workers without screwing over consumers. Pure protectionism doesn't work, it'll only set us back and keep us behind. Just slapping on 100% tariffs or a complete import ban just lets domestic companies get lazy. The protectionism needs an expiry date so they can't hide behind it forever. We could also work to move supply chains out of adversarial nations and into friendly ones, but you know...that requires us to continue to have friends and allies.

A fully free market has been an illusion in the US for a very long time. We'd do well to do some of our own state-industrial planning.

•

jschveibinz 9 hours ago

I don't have a crystal ball, but based on similar historical scenarios, I think that one or two of these companies will win--probably because of some unique application, delivery or trade secret that will drive 80% of their revenue.

Consider Google, Apple, Amazon, etc.

It's still early days...

•

CuriouslyC 9 hours ago

The US govt is going to ban foreign models and foreign providers, and frontier labs are still cooked, because US companies will RLwash Chinese models to try and get in on the captive market. The frontier labs have already lost the war for coding, their next play is custom models for specific domains... Anthropic Galen for biomedical research, Anthropic Locke for legal analysis, etc, and you won't see _ANY_ intermediate work on the model, you will put in query, maybe get some questions fired back during work, and get a "final report."

Eventually the frontier labs will try to cut out the middle man once these models prove themselves and start doing partnerships with big firms in the domains, so they can take a % of the profits in perpetuity rather than just taking a one time payment. For example, after Anthropic Galen, they'll do a partnership with Pfizer to generate Ozempic-Superjacked and take 20% royalties on global sales.

•

hackingonempty 9 hours ago

> The US govt is going to ban foreign models

The people have a right to make and use whatever models they want, protected by the constitution. At a minimum, the models are described in research papers that are unquestionably protected speech. Skilled devs turn those into programs, also protected speech.

•

BigTTYGothGF 7 hours ago

> protected by the constitution

I don't see how.

•

8n4vidtmkvmk 9 hours ago

How could Trump ban tiktok then? And Fable for that matter.

Maybe you're somehow legally allowed to distribute and download the weights, but most of us can't run GLM 5.2 at home.

•

wqaatwt 2 hours ago

They “banned” the company, not the product. The US government could ban US citizens from buying tokens from certain Chinese providers but there is no precedent for banning the usage of specific software if you run it yourself.

•

hdgvhicv 2 hours ago

DeCSS

•

verdverm 8 hours ago

You won't need a frontier size model for most tasks before long. Qwen 3.6 (small) punches way above its weight. I run it at home @8bit on an OEM Spark

•

kappar 8 hours ago

Second this, I am also running qwen 3.6 35b Q8 on a 5090 liquid getting around 250 tokens / second and it is plenty capable. I actually haven't even looked at models recently because I am happy with what I have.

And.. now I feel the need to look again. Darn, there goes my afternoon

•

dualvariable 8 hours ago

And corporations could run DeepSeek models on cloud hardware.

•

verdverm 7 hours ago

You can run most open models on cloud hardware. Google Cloud gives you a click to deploy, but then you have saturation / ROI considerations, versus Google serving them up multi-tenant, per-token.

•

dualvariable 6 hours ago

For ROI though you can run 24/7 agentic-style workloads, constantly churning through all your source code looking for security bugs (or whatever) and you DON'T pay per-token costs.

A DeepSeek instance running 24/7 in a cloud provider will beat doing that with Claude which could bankrupt you with 100x more costs, even though it might find more.

And DeepSeek may find enough to keep your engineering team saturated and busy fixing things.
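
For anyone who wants to plug in their own numbers, here is a rough sketch of the flat-rate versus per-token comparison. Every figure in it (the hourly GPU price, the API price, the monthly token volume) is a made-up placeholder, and the function names are just for illustration; it also assumes the rented instance can actually produce that many tokens in a month.

```python
# All numbers are hypothetical placeholders, not quotes from any provider.

def monthly_api_cost(tokens_per_month: float, usd_per_million_tokens: float) -> float:
    """Cost of buying the tokens from a per-token API."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


def monthly_selfhost_cost(gpu_hourly_usd: float, hours: float = 24 * 30) -> float:
    """Cost of renting an instance around the clock; flat regardless of tokens."""
    return gpu_hourly_usd * hours


def breakeven_tokens(gpu_hourly_usd: float, usd_per_million_tokens: float,
                     hours: float = 24 * 30) -> float:
    """Monthly token volume above which the flat-rate instance is cheaper."""
    return monthly_selfhost_cost(gpu_hourly_usd, hours) / usd_per_million_tokens * 1_000_000


if __name__ == "__main__":
    gpu_hourly = 10.0        # assumed rental price for a multi-GPU node, USD/hour
    api_price = 15.0         # assumed blended API price, USD per million tokens
    volume = 2_000_000_000   # assumed 2B tokens/month of agent traffic

    print(f"self-host:  ${monthly_selfhost_cost(gpu_hourly):,.0f}/month")
    print(f"api:        ${monthly_api_cost(volume, api_price):,.0f}/month")
    print(f"break-even: {breakeven_tokens(gpu_hourly, api_price):,.0f} tokens/month")
```

Below the break-even volume the per-token API is the cheaper option, which is the saturation concern raised upthread. The sketch says nothing about how many of the findings are worth an engineer's time.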

•

verdverm 3 hours ago

This process works better with multiple models and not simply slinging AI at it 24/7. It's not an ROI just because you keep the GPU busy. The signal-to-noise ratio is not there yet.

•

skywhopper 9 hours ago

The US government isn’t supposed to be allowed to constrain speech, but they do have the power to constrain commerce, and they can ban the sale of AI services and AI-capable hardware if they choose.

•

andrekandre 7 hours ago

  > but they do have the power to constrain commerce

it's an interesting idea; i'd like to see someone claim buying/selling as a form of speech...

•

Supermancho 3 hours ago

Citizens United got pretty close in the Supreme Court's 5–4 decision on January 21, 2010, ruling that corporations and unions cannot be prohibited from making independent political expenditures, citing First Amendment free speech protections.

•

andrekandre 34 minutes ago

yea, that was the hint in my comment... just waiting for the other shoe to drop so to speak

•

giantrobot 9 hours ago

The current administration has repeatedly demonstrated they do not feel constrained by laws or the Constitution.

•

tmpz22 9 hours ago

Yes, our passionate defense of Academia will surely survive Techno Oligarchs' desire for a 20th vacation home

•

woeirua 7 hours ago

>The frontier labs have already lost the war for coding

This is a delusional take. Sorry, but anyone claiming this hasn't used Fable and compared it to the current best open source models. I see a lot of hype posting about GLM5.2. I see absolutely ZERO people using it in production compared to GPT 5.5 or Opus 4.8.

•

CuriouslyC 6 hours ago

Coding agents are edging into diminishing returns for common tasks. The whole Opus 4.5+ arc shows this for a large swathe of people. Chinese Fable is likely <=6 months away. US Frontier labs are structurally disadvantaged in the long term so advantage is only going to skew China.

•

Analemma_ 9 hours ago

> The frontier labs have already lost the war for coding

You are way too deep in the HN bubble.

•

CuriouslyC 9 hours ago

I'm looking at how market/human forces are going to make the game play out when extended to its logical conclusion, not the score on the scoreboard RIGHT now.

•

com2kid 9 hours ago

So long as Chinese labs keep writing white papers, trade secrets aren't going to win the day.

Having grown up in the 90s, it is weird seeing companies share their technology secrets publicly.

•

dofm 9 hours ago

Wandering around pretending to be researchers who are only just figuring out how to make money is, for the short term, an incredibly good way to attract a load of naïve money; not all sharks are smart.

And it does, nowadays, give you a bit of a veneer of mere curiosity when you're being accused of massive theft.

•

sowbug 8 hours ago

We're seeing the first 20 years of the dot-com cycle, but compressed into two years, and trying hard not to fall into the tar pit of ad-supported services.

•

dualvariable 8 hours ago

I'd guess Anthropic will probably win, and LLMs will probably still be with us and be much better in 10 years time.

But next year we could be in the middle of a massive $600B/yr capital-spending bubble deflating hard with unemployment accelerating towards 10% (or higher).

The internet never failed, but the telecom/dotcom collapse still happened in 2001.

•

kajman 7 hours ago

I'm sure investors thought one or two of the ISPs laying all that fiber would be collecting fat rents on them until the sun burnt out. I'm glad they got so much in the ground before there was a reckoning. I hope this industry ships more very expensive models, ASAP.

•

gexla 31 minutes ago

Whenever I see the cost indicator in my harness while building some probably useless thing, I'm reminded that probably everyone else is doing the same thing. Spending loads of imaginary money (according to the harness, I spent $10 worth of tokens for this single issue, but I have a $20 sub) building something of imaginary value. And then I go on social media and see a wall of slop posts, many talking about skills, systems, agents, and "Karpathy Wiki Systems" to build more useless things. Then I hear about all jobs going to AI, and I figure surely someone has to be the sane one to direct people not to build useless things. Surely you can't leave that up to the AI sales guy. Every single idea I have ever passed by GPT is BRILLIANT! I don't know where I'm going with this. At least it wasn't written by AI. ;)

Edit to add: Just use Deepseek Flash 4. You can hit those servers all day for next to nothing and still scratch the itch to build useless things. ;)

•

titzer 9 hours ago

The coming AI enshittification is going to be epic. For those of us who have been on the web for more than five minutes, we can see this a mile away.

If you think search ads are annoying, pre-roll YouTube ads are annoying, streaming ads are annoying, or basically ads-on-any-screen-anywhere-at-any-time are annoying, just wait until every stupid thing is powered by AI and is subtly trying to manipulate you to buy/watch/believe some crap all the time.

•

VorpalWay 60 minutes ago

I'm not sure how that would legally work in the EU: several countries have strict rules about ads having to be clearly distinguished from non-ad material. I know that the UK has pretty strict rules about product placement in broadcasts too, for example.

Yes they can do ads, but if they try to be subtle they will likely (eventually) be hit with fines.

Though, do the current rules apply to AI? Likely unclear. But if this becomes a problem I would expect new consumer protection regulation to be introduced aimed at this specific issue.

•

asdff 4 hours ago

It is already ruining forums including this one. People posting AI slop articles. Probably a good deal of commenters using AI slop to write their comments. Can't run away from it. Eventually you will have to step away entirely because you will just be interfacing mainly with AI content. I'm about fed up with forums at this point and I doubt I will be around them in the coming months at this rate of ever increasing slop peddling. I suspect more and more human contributors will feel the same and start slowly walking away, snowballing the slopification effect.

There is going to be a point soon where HN is just ai models posting ai articles to be filled with ai comments and for what reason exactly? I guess to try and train new ai slop company products into the datasets of various ai models to capture the budget spend of some ai middle manager model.

•

mannanj 52 minutes ago

Wasn't this the end goal of the totalitarian state in the novels by Orwell, and sort of described in the movie Ready Player One?

The gist of it as I understand it is in a society where things are fake and incredibly extractive (where a select few, the bourgeoisie or the rich, prioritize their interests over others like we see accelerating today) they limit the forums available for people to question them and peddle their interests on the select few that remain. If you sufficiently isolate the people, it's hard for them to tell what's real and eventually they come to accept the fake narratives as truth.

In an odd way, this sort of fits in with the theories about winners writing history, and all those weird, sort of conspiracy-laden accounts of human history having these odd unexplainable gaps or stories around it. I don't know about you, I think we are simply seeing those forces of the past working at preserving their interests and using the latest technology to do it.

•

bryanlarsen 8 hours ago

Jeopardizing a $200/month subscription in return for $1/month in ad revenue seems insane. Using ads on a $20/month subscription to entice you into a $200/month one, OTOH...

•

titzer 8 hours ago

They will almost certainly get more than $1/month in ad revenue for someone interacting with the AI for hours a day.

•

wqaatwt 8 hours ago

This article seems to be struggling to tell R&D apart from operating expenses. The fact that AI companies are extremely unprofitable doesn’t mean they are subsidizing token costs; they can still have very decent gross margins on them.

•

Gigachad 2 hours ago

The accounting is all out of whack with these companies. If a GPU has a useful lifespan of around 3 years, then it’s more a constant expense than R&D. They want you to believe they can cut the R&D at any time and be profitable but it’s likely they would completely collapse without that spend.

•

zhivota 25 minutes ago

I don't see why 3 years is the right number there. I'm using a 10 year old model GPU to generate tokens locally, and given the bottlenecks, a commercial model focused on RAM and transfer speeds should have a longer depreciation curve than 3 years.

•

Gigachad 11 minutes ago

Because these GPUs in data centers chew power and take up space. If in 3 years there is a new model that processes far more tokens with the same power and time the economics quickly say the hardware is cheaper to replace than to continue running.

As a hobbyist at home the numbers are different and you can afford to do something inefficient.
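
A rough way to see when replacement wins is to compare cost per million tokens for the already-paid-for card against a newer one that still has to be bought. This is only a sketch: the throughput, power draw, electricity price, hosting cost and purchase price below are all invented, and the function name is mine.

```python
HOURS_PER_YEAR = 24 * 365


def cost_per_million_tokens(tokens_per_sec: float, power_kw: float, usd_per_kwh: float,
                            hosting_usd_per_year: float, capex_usd: float = 0.0,
                            amortize_years: float = 1.0) -> float:
    """Yearly running cost (power + rack space + amortized purchase) per million tokens."""
    yearly_tokens = tokens_per_sec * 3600 * HOURS_PER_YEAR
    yearly_cost = (power_kw * HOURS_PER_YEAR * usd_per_kwh
                   + hosting_usd_per_year
                   + capex_usd / amortize_years)
    return yearly_cost / yearly_tokens * 1_000_000


if __name__ == "__main__":
    # Old card: purchase price is sunk, so only power and space count (all values assumed).
    old = cost_per_million_tokens(1000, 1.0, 0.10, 2000)
    # New card, same power and space, 5x the throughput, $30k amortized over 3 years (assumed).
    new_5x = cost_per_million_tokens(5000, 1.0, 0.10, 2000, capex_usd=30_000, amortize_years=3)
    # Same purchase but only a 2x throughput gain (assumed).
    new_2x = cost_per_million_tokens(2000, 1.0, 0.10, 2000, capex_usd=30_000, amortize_years=3)

    print(f"old card:    ${old:.3f} per 1M tokens")
    print(f"new card 5x: ${new_5x:.3f} per 1M tokens")
    print(f"new card 2x: ${new_2x:.3f} per 1M tokens")
```

With these placeholder inputs the 5x card comes out cheaper per token than the sunk-cost card and the 2x card does not, so the answer hinges on how big the generational jump is relative to the purchase price.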

•

avereveard 9 hours ago

> Anthropic is subsidizing their enterprise customers by up to 40 times, and OpenAI up to 70 times

might as well be the other way around with non subscribed token being 50x overpriced, or any combination thereof

also uber was unprofitable for the longest time, racking up 31b in losses, on the bet of capturing the market worldwide. scale here is different, but it's also 10 years later, with a lot more volatility and floating cash in the market (voo grew 327% over that period, not unreasonable that round size grew on the same trajectory)

•

mattas 8 hours ago

I can't wrap my head around how revenue > COGS but at the same time AI is being subsidized and the real cost is not affordable.

You don't price based on cost, you price based on willingness-to-pay.

So maybe labs are "overcharging" enterprises on inference (because, up till now, enterprises have seemingly had unlimited budget for tokens) and "undercharging" individuals and SMBs (because they don't have an unlimited budget).

•

raincole 9 hours ago

> OpenAI Had $13.07 Billion In Revenue, $34 Billion In Costs and Expenses, and $20.92 Billion In Losses, with a net loss attributable to the company of $38.53 Billion

This is going to be the new most misquoted/misunderstood data of the year, isn't it? The cost is mostly from a one-time accounting situation due to their pivot from a non-profit organization.[0] If we trust the leak [1] OpenAI is likely turning profitable this year.

[0]: $30Bn of it is the one-time cost. https://www.ft.com/content/e15b0d7e-ff6b-4f16-ba7a-4068feddb...

[1]: I suspect OpenAI itself leaked that financial report. It's almost unbelievably healthy.

•

yalogin 8 hours ago

The issue is the cost is not going to be a hindrance for companies that have gone all in on AI development. They may still find it cheaper than hiring engineers and if needed they will lay off a few more.

The companies that did not yet jump on this bandwagon and are still evaluating will have a decision to make.

No matter what the AI companies are going to change their pricing strategy and it’s going to become a lot lot more expensive to use. I am just hoping the price stays like this until I am done with my big chunk of work

•

travisb 8 hours ago

I think a lot of the cost comparisons to employees are off by a factor of 2 or more. AI is the ultimate contractor. Available instantly. Doesn't charge during idle periods. Pre-vetted and pre-trained. No contract negotiations or complex accounting.

That is worth a small multiple of the fully-loaded employee cost. So AI might be easily worth more than $200 per human-equivalent hour. With high utilization, that might be $8000-10000 a month.

With that kind of spend, AI provider financials look less frightening.

•

lbreakjai 35 minutes ago

AI is more like a union that controls the entire labour pool. I'm one out of a few million developers. I've got very limited bargaining power. I can't withdraw my labour because I need it to make rent and buy food. My cost has a very predictable ceiling.

On the other hand, there are two AI labs that could afford to eat your profit, because what are you gonna do? They're your entire labour force.

•

ilovecake1984 2 hours ago

Agree. But if that’s the thinking then you need to compare vs offshore contract rates, not onshore contract rates.

•

jdw64 8 hours ago

I can't go back to a life without AI, and I don't want to. But if AI were billed by token instead of subscription, my monthly cost would probably be ten times what it is now. I could switch to a Chinese model, but I'm not sure how things will look by then.

What makes AI so convenient is how good it is at doing red-team code reviews on my work. I used to need all this unnecessary communication just to get a review, but now I only have to reach out to the people I actually want to talk to.

•

recursivedoubts 8 hours ago

Once locals get to Opus levels I think we may see a phase change because that + a reasonably competent programmer is going to be a very powerful combination for most practical programming problems.

Frontier models may eventually achieve super-intelligence (no opinion beyond mild skepticism) but super-intelligence isn't necessary for most practical day-to-day programming. The problems, as always, become communication, understanding what users really need, etc. that is, softer skills.

•

wqaatwt 2 hours ago

> Once locals get to Opus levels

I find it hard to imagine it would ever be cost efficient vs hosted/cloud, i.e. you should always be able to run faster and/or better models remotely at a comparable price since it's just way more efficient due to batching
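
The batching point can be sketched with a very simplified model of token-by-token decoding: each step has to stream the model weights from memory once no matter how many requests share the step, so a single local user pays the full memory cost per token while a hosted service spreads it over the whole batch. The model size and hardware figures are invented, and the model ignores KV-cache traffic, MoE sparsity and scheduling overhead.

```python
def decode_tokens_per_sec(batch_size: int,
                          weight_bytes: float,
                          mem_bandwidth_bytes_per_sec: float,
                          flops_per_token: float,
                          compute_flops_per_sec: float) -> float:
    """Aggregate decode throughput for one forward step shared by `batch_size` requests."""
    # Every step streams the full weights once, however many requests share it.
    memory_time = weight_bytes / mem_bandwidth_bytes_per_sec
    # Arithmetic work does scale with the number of requests in the batch.
    compute_time = batch_size * flops_per_token / compute_flops_per_sec
    step_time = max(memory_time, compute_time)
    return batch_size / step_time


if __name__ == "__main__":
    # Hypothetical dense 35B-parameter model at 8 bits per weight on made-up hardware.
    params = 35e9
    weight_bytes = params * 1.0      # 1 byte per parameter
    flops_per_token = 2 * params     # rule of thumb: roughly 2 FLOPs per parameter
    bandwidth = 1e12                 # 1 TB/s memory bandwidth, assumed
    compute = 100e12                 # 100 TFLOP/s, assumed

    for batch in (1, 8, 32, 64):
        tps = decode_tokens_per_sec(batch, weight_bytes, bandwidth, flops_per_token, compute)
        print(f"batch={batch:>3}  ~{tps:,.0f} tokens/sec total")
```

In this toy model throughput grows linearly with batch size until the step becomes compute-bound, which is why the same hardware yields far more tokens per dollar when many users share it.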

•

cheonic52749 8 hours ago

> Frontier models may eventually achieve super-intelligence but super-intelligence isn't necessary for most practical day-to-day programming

I think you forgot what super-intelligence means…

•

recursivedoubts 8 hours ago

tbh, not sure i ever understood it

•

sdenton4 5 hours ago

In discussions of super intelligence and ai takeoff and such, I find it helpful to ask why the smartest humans usually aren't heads of state...

•

LoganDark 8 hours ago

A superintelligence is one that exceeds human intelligence in all areas. Which roughly translates to learning, adapting, and performing more quickly and efficiently than even the best humans. This is closely related to "the singularity", which is when technological growth becomes uncontrollable by humanity.

•

bryanlarsen 8 hours ago

That sounds like a definition designed for goal shifting. This AI is better than human at 99.9% of things, but humans are better at pooping. Therefore we don't yet have a superintelligence.

•

wqaatwt 2 hours ago

A basic calculator is 100% better than humans at all the things its designed to do.

•

LoganDark 8 hours ago

If that AI were given an identical human body (and interface to that body) to someone who had not yet learned how to do that, and it outperformed them in figuring it out, then that would settle it.

Otherwise I don't see the comparison.

If I'm intelligent enough to use a tool, but I don't have the tool, that doesn't mean anyone who does have the tool is automatically more intelligent than me.

Likewise, comparing my performance without the tool against someone's performance with the tool wouldn't be benchmarking their performance, only benchmarking them with the tool's performance. The fairer comparison would be against me also with the tool.

•

GodelNumbering 9 hours ago

I don't see any real point being made in (or point of) the article. The author sort of just...dumped a bunch of links with the noise that is so incredibly mainstream at the moment that I doubt any of it is news to anyone even somewhat tracking the AI cycle. Most of it (except for maybe the BLS[1] stat) is just regurgitation.

[1]: And this too is incorrect, should be "the number of jobs displaced would be around 32.5M" (the post says 32.5K)

•

KolibriFly 6 hours ago

I feel like the author is jumping way too fast from "OpenAI is losing money" to "the whole AI economy is broken." A company being in the red during aggressive scaling doesn't automatically mean the unit economics don't work.

•

atleastoptimal 8 hours ago

These companies biggest source of revenue is per-token pricing though, not subscriptions. On tokens they make a good margin.

•

Quarrelsome 8 hours ago

Is it not also possible that some of the shift is a consequence of increase of use? While we can be extremely cynical at the finances at play, the lock down and increase of token pricing might be demonstrating a burgeoning demand, which would be a positive indicator.

•

cmiles8 9 hours ago

The math doesn’t add up and the wheels are starting to come off the bus.

The conversation in a lot of wealth management offices has shifted dramatically in the last few months from “how do I get in on this AI thing?” to “how do I protect my assets when this AI stuff blows up.”

There’s little question now if this will all implode, just when and who’s going to lose their shirt and be left without chairs when the music stops.

What’s playing out now is the scene from The Big Short where the banks wouldn’t mark down the value of bonds until they secured a short position. Once the big money has their helmets on it will stop providing fuel for the bubble and then look out below!

•

Kotlopou 8 hours ago

With these confident comments I would appreciate some kind of origin of the information. Not even necessarily a source accessible to me, just: are you in any wealth management offices? Or are you reporting other people's opinions? Or does it just sound right given the spirit of our time?

•

surgical_fire 5 hours ago

> There’s little question now if this will all implode, just when and who’s going to lose their shirt and be left without chairs when the music stops.

Well, OpenAI and Anthropic are racing to IPO for a reason.

They will need every bagholder they can get their hands on.

•

jcgrillo 9 hours ago

Assuming the analysis is right, and most (or all) of these AI companies will default on their debts, what consequences might that have?

•

cmiles8 8 hours ago

If that happens the AI companies will first try to negotiate with their creditors and after that likely declare bankruptcy with the creditors taking over what’s left of the assets. Shareholders will be wiped out and employees will be left with nothing. Various franken-companies will emerge from the bankruptcy ashes and the world will move on with AI sans the present irrational exuberance.

•

johnvanommen 7 hours ago

> If that happens the AI companies will first try to negotiate with their creditors and after that likely declare bankruptcy with the creditors taking over what’s left of the assets.

Due to the fact that we’ve already done this before (Enron, Global Crossing) -

I’m willing to bet that there are contracts in place ALREADY, that define what happens in the event of a default.

In particular, I’ll bet that the buildings, the GPUs, the patents, etc…

All of these have probably been accounted for.

I worked at a data center that closed during the WorldCom era, and when they put the padlocks on the door, there were still websites “hosted” from the building.

I don’t know if they killed the power or what. I’d cleared out my desk long before they locked it all up. I wouldn’t be surprised to learn that these websites couldn’t get their own servers, since ownership was tied up in the courts.

In the Bay Area during that time, there were row upon row of empty office buildings.

•

devin 8 hours ago

And then the US government will say that these companies' futures are in the national interest, and they will be bailed out with taxpayer dollars.

•

bryanlarsen 8 hours ago

Usually (but not always) such bailouts wipe out the shareholders.

•

cmiles8 8 hours ago

That could happen, but the shareholders still get wiped out.

•

thephyber 8 hours ago

All of those poor agents will be laid off from their support chat jobs and their roles will get outsourced to India and Philippines.

•

NickC25 8 hours ago

>what consequences might that have?

All depends on who is holding the bag, and how big the bag is.

•

thewebguyd 8 hours ago

Hence the IPO. Push the risk on to retail and index funds, away from private credit. Plus Microsoft, Google, and Amazon will also be holding the bag and have huge balance sheet write-downs; the compute commitments have not yet been paid for, it's all just promises.

The banks aren't as exposed this time as in 2008; most of it is tied up in private credit. It's more akin to the fiber buildout in the 90s.

•

cmiles8 8 hours ago

Yes, and private credit investors are rushing for the exits but can’t get out because of withdrawal limits. It’s starting to get ugly. Folks owning something they think is going to tank and they can’t sell.

•

SwellJoe 8 hours ago

Yep, if they make it to IPO, as SpaceX has, and if they manage to get into several indexes (as SpaceX is already doing, I assume it's already in the Russell indexes, and will soon be in the Nasdaq 100 index), it'll be a bunch of working class people's retirement accounts holding the bag. And, those same companies might be deemed Too Big To Fail, and they'll get even more working class folks' money in the form of tax-funded bailouts.

A wealth transfer from the working class to a handful of billionaires bigger than any the world has ever seen (and the world has seen a lot of wealth transfer from the working class to billionaires).

•

LPisGood 8 hours ago

Probably 401(k) plans for the most part.

•

LastTrain 8 hours ago

Yes. If we spend more on building AI infrastructure than current total global gross software sales, the only way the math works is if we create and sell much more software or if we start charging more for it.

•

largbae 7 hours ago

This article gave me an amusing thought: the only jobs with a high enough salary to be profitably replaced by AI might be software engineers.

•

evrydayhustling 8 hours ago

The willingness to throw capital at AI is definitely doing some crazy things, but this article has some bad takes on the data.

> [Ratio of per-token cost to subscription cost] means Anthropic is subsidizing their enterprise customers by up to 40 times, and OpenAI up to 70 times

Actually, they could be subsidizing by more (if they are taking a loss on API), or not at all (if they are soaking API customers by a massive margin).

Separately, these subscriptions get sold to large groups with varying usage, so it's crazy to model assuming every subscription is maxed out. Banks, gyms, and many other businesses work this way, offering consumers flexible access to services that they will realistically use in bursts. It's not always worth the complexity to prevent overuse by a small minority. You can feel like this kind of business model isn't as transparent, but it's silly to pretend it can't work.
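
To make both points concrete, here is a toy calculation. The API-to-subscription price ratio alone does not pin down a subsidy; you also need the API markup over true serving cost and the average utilization, neither of which is public. The $200 plan price, the 40x ratio applied to it, the markups and the utilization figures are all assumptions for illustration, and the function name is hypothetical.

```python
def subscription_margin(sub_price: float, api_value_at_max: float,
                        api_markup: float, utilization: float) -> float:
    """
    Monthly margin on one subscriber.

    sub_price:        subscription price in USD/month
    api_value_at_max: what maxed-out usage would cost at API list prices
    api_markup:       API list price divided by true serving cost (unknown, assumed)
    utilization:      fraction of the maximum allowance the average subscriber uses
    """
    serving_cost = api_value_at_max / api_markup * utilization
    return sub_price - serving_cost


if __name__ == "__main__":
    sub_price = 200.0                 # assumed plan price
    api_value = 40 * sub_price        # the "up to 40 times" ratio applied to that plan

    scenarios = {
        "API at cost, everyone maxed out": (1.0, 1.0),
        "4x API markup, 10% utilization":  (4.0, 0.10),
        "5x API markup, 5% utilization":   (5.0, 0.05),
    }
    for label, (markup, util) in scenarios.items():
        margin = subscription_margin(sub_price, api_value, markup, util)
        print(f"{label:<34} margin per subscriber: ${margin:,.0f}")
```

The same 40x headline ratio yields anything from a large loss to a profit per subscriber depending on the two unknowns, which is why the ratio by itself is weak evidence either way.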

> OpenAI spent 44% of their revenue [$5.3B] on sales and marketing! The hype needed to keep the AI bubble inflated is incredibly expensive.

Over that same period (2025), OpenAI added $10B in realized revenue and $14B in run-rate. Sounds like they're getting >2X return within 12 months of those go-to-market dollars. Compare that to like, any other business.

> Thus in recent weeks the idea that Generative AI (LLMs for short) is too expensive has been all over mainstream business media.

Would it be smarter for these companies never to test customers' price tolerance? The quotes following this make it seem like the companies are getting important information about the nature of that price tolerance, and preparing to react. This is the work markets do on both sides to understand the value of a new product.

There are lots of good arguments about AI overinflation, but in order for them to be useful, they have to be rigorous and targeted.

•

Catloafdev 9 hours ago

Affordability is not the current goal.

Vendor lock-in is the current goal. Consumer prices are a drop in the bucket comparatively.

•

woeirua 7 hours ago

How can you lock in when the harnesses are basically thin clients around the APIs and you can replicate them using agents in a short period of time? I haven't seen a compelling thesis yet for how you achieve vendor lock in for LLMs. Claude Code is a bit sticky, but if we're being honest it's just because Codex doesn't have all the same features yet.

•

Catloafdev 3 hours ago

Because it's not about consumer lock-in, it's about enterprise lock-in. That's what they're chasing. Regular users are basically marketing.

•

dofm 9 hours ago

Luckily the industry is much too wise, after a couple of decades of cloud infrastructure, to willingly opt to make itself entirely dependent on one of two platforms with opaque and complicated pricing. We've learned our lessons, oh yes

•

rconti 8 hours ago

Maybe they just need the competition to run out of funding first?

•

hk__2 9 hours ago

That’s an impossible goal; it’s too easy to switch models.

•

downrightmike 8 hours ago

And Microsoft forced M365 subscriptions to include AI for +$30/license.

Cheap, but gave them a massive user base they can claim is using AI

•

dzogchen 6 hours ago

> To generate the $309 billion needed to service their debt, the AI industry will need to replace 46.8 million jobs, equivalent to around 27% of the current number of jobs in the US.

Lump of labour fallacy spotted.
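
For reference, the quoted figures can be unpacked with plain division. Nothing is assumed beyond the three numbers in the quote; how the article actually derived them is not something this checks.

```python
debt_service_usd = 309e9     # from the quoted passage
jobs_replaced = 46.8e6       # from the quoted passage
share_of_us_jobs = 0.27      # from the quoted passage

# Revenue the quote implicitly expects to capture per replaced job, per year.
revenue_per_job = debt_service_usd / jobs_replaced
# Total US job count implied by "46.8 million is around 27%".
implied_total_jobs = jobs_replaced / share_of_us_jobs

print(f"implied revenue per replaced job: ${revenue_per_job:,.0f}/year")
print(f"implied total US jobs: {implied_total_jobs / 1e6:.1f}M")
```

So the 46.8 million figure only follows if AI revenue comes from capturing a fixed amount of roughly $6.6k per displaced job, with the total amount of work held constant.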

•

jongjong 2 hours ago

The current situation reminds me of how far we've come from old ideals of delaying gratification today in order to have more later.

It seems like this ideology has been corrupted into a short-sighted "Establish a monopoly position as soon as possible at all costs, don't worry about tomorrow."

It's ironic because monopolizing a sector by investing heavily and suppressing profits used to be a long term move but it seems to have become a short term move as investors are racing each other.

•

zytoon 8 hours ago

This summarizes half of the entire AI scene, as these guys generate content to paint the entire world the way they like to: “US equity markets are facing three IPOs .. each led by a world-class bullshitter”.

•

zoobab 9 hours ago

Spelling mistake:

"a return on these invetment"

•

netdevphoenix 9 hours ago

It's Proof of (human) Work. Much more useful than having a sticker saying "Done by a Human".

•

pluralmonad 9 hours ago

Is deleting a letter after an LLM generated the article an insurmountable task? These quaint signals only screen out the lowest of effort slop writers. Better than absolutely nothing, but barely.

It does remind me of the time a chef told me when he puts lemon juice over a dish, he would intentionally not remove any seeds that went on it because it was a signal of quality. I wonder if future slop chefs will intentionally place seeds on dishes that came from a box...

•

Insanity 9 hours ago

Ask your LLM to 'write like a phishing email' to have it seem more human.

I'm actually curious if this works, haven't tried but I assume it would.

•

erikschoster 9 hours ago

...and "maiinstream" -- seeing glaring typos (easily caught by spellcheck) now makes me wonder: did they decide to leave them in (or add them explicitly) to signal they didn't use AI to write, or (the more paranoid option) did they tell the LLM to add a few typos...

I didn't get the sense this was LLM-written, but typo-signalling is... I donno a bit weird. Firefox is underlining some of the words as I write. I'm leaving "donno" unchanged even though it's flagging it as a misspelling but I suppose I'd still opt to fix something like "maiinstream" even at the risk of potentially seeming more LLM-ish!

•

sleepybrett 9 hours ago

It's funny when you watch the doomscroll all these anthropic guys talking about how you should be writing self-improving loops and that's all they do. Of course that's all they do, they don't have to pay for their tokens.

•

manapause 8 hours ago

Can confirm, my experience in “loop engineering” was “this is neat” for 45 minutes until a daily ration of tokens was evaporated. The quadratic cost trap is prohibitive to experimentation.
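
For anyone wondering where the quadratic part comes from: if every turn of an agent loop resends the whole conversation so far, billed input tokens grow with the square of the number of turns. A minimal sketch, with made-up token counts and assuming no prompt caching or context trimming:

```python
def total_input_tokens(turns: int, system_tokens: int, tokens_added_per_turn: int) -> int:
    """Sum of input tokens billed across all turns when the full history is resent each time."""
    total = 0
    context = system_tokens
    for _ in range(turns):
        total += context                  # the whole context is sent as input on every call
        context += tokens_added_per_turn  # this turn's output and tool results get appended
    return total


if __name__ == "__main__":
    # Hypothetical loop: 1,000-token system prompt, 2,000 tokens appended per turn.
    for turns in (10, 20, 100):
        print(f"{turns:>3} turns -> {total_input_tokens(turns, 1000, 2000):,} input tokens")
```

Doubling the loop from 10 to 20 turns roughly quadruples the input tokens in this model (100,000 versus 400,000), which matches how quickly a daily ration disappears.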

As a localLLM evangelist, I am hopeful this will bring more attention to the joys of rolling your own sovereign AI.

•

sleepybrett 8 hours ago

Yeah, i'm hoping that gets smoother. I've been experimenting with omlx and opencode on my m5x64gb and keep running into issues w/ Qwen3.6-35B-A3B-MLX-8bit exceeding its memory limit at the most inopportune times. Playing with 12B gemma4 (8bit) more today.

Maybe I should be aiming for something targeting 48gb of memory?

•

manapause 7 hours ago

It depends what your goals are and what you are using it for. This space is fluid and my answer last week would be different than my answer today! That said there’s no substitute for hard work, here are some resources to get you up to speed:

https://carteakey.dev/blog/local-inference/local-llm-optimiz...

https://botmonster.com/ai/self-hosted-ai-agent-frameworks-20...

Personally I find myself swapping models depending if I am engaged in “trad-development” vs building agentic probes or apps involving imagery. Tailscale the LLM to your deployments and ta-da!

•

deweywsu 8 hours ago

I know a lot of level-headed engineers here may not side with me, but I say let the companies who abandoned their people at the drop of a hat, with CEOs who waved their flag around on social media, proudly declaring how they'd now run their companies with 75% fewer employees wither and die. If I had been let go, there's no way I'd go back to a company like that, and there should be a black list of CEOs who acted this way established and kept public. These CEOs are not holistic thinkers, and are too susceptible to mass hysteria and too irresponsible to real people and their lives to be trusted with the vision for any company ever again.

•

oreally 8 hours ago

Someone should keep track of a public database of CEOs who cut workforce while making huge profits. Name, context, situation and all.

•

deadmutex 8 hours ago

Unfortunately, maintaining an opposite list would probably be easier.

•

CamperBob2 7 hours ago

That's basically what you'll see if you open a newspaper to the stock page. That's the idea behind business. It's why you have what you have.

•

knollimar 7 hours ago

Depends on if it's shortsighted by giving up your intellectual capital for short term profit.

•

SamuelAdams 8 hours ago

GM just did this in the last 30 days [1], and their sales are likely going to be just fine. In fact the auto industry has repeatedly automated jobs over the last 100 years, and they still make decent sales numbers.

If you decided to boycott every company that replaced staff with automation, you would be forced to exit the economy. Every company does this to some degree and the customers who vote with their wallet do not seem to care about a reduction in force.

[1]: https://arstechnica.com/ai/2026/06/gm-installs-robots-at-fla...

•

MadxX79 8 hours ago

Robots that replace auto industry factory workers exist; the CEO of GM didn't imagine them as part of some sort of business media induced psychotic episode.

The same is not true for the software industry execs.

•

pragmatic 8 hours ago

GM is running 0% interest, no payments until n deals right now.

That’s usually a sign that sales are not “just fine”.

•

coryrc 8 hours ago

They always are.

•

smahs 8 hours ago

The above comment, to which you responded, wrote about CEOs who responded to mass hysteria, not those who automated anything.

•

johnvanommen 8 hours ago

> GM just did this in the last 30 days [1], and their sales are likely going to be just fine. In fact the auto industry has repeatedly automated jobs over the last 100 years, and they still make decent sales numbers.

I worked at Verizon during their layoffs last year. Biggest layoffs in the USA.

As someone who’s been laid off before, I knew that it generally boosts the stock price.

I bought VZ because of that. It’s up 15% since the layoffs.

Microsoft, an AI stock, is down 30% in the same timeframe.

•

deweywsu 8 hours ago

This is true, and I'm sure AI cuts will continue, but it's obvious that the ones who went "all in" at AI's mass introduction were drinking a special kind of Kool-Aid reserved for the truly sycophantic Wall Street lap dogs, not the CEOs who think about risk and are cautious about betting the farm on a relatively new and mostly untested technology. GM is over 100 years old, and no doubt released improvements that were well-tested and predictable, because you don't take massive chances with a company that well established. It was a couple years into the mass AI deployment that studies on the minimal overall productivity gains of AI even started to come out(!) This was "get on the bandwagon" thinking at a massive scale, which shows you how many CEOs are not independent thinkers at all, but are really just followers. Yes, use AI, but do it responsibly, never forgetting that your investors aren't your only stakeholders - so are your people.

•

caconym_ 7 hours ago

I'll believe it when I see it, but I would love to see it.

•

HDThoreaun 9 hours ago

I really can’t stand when writers point to the difference in price per token on the api and subscription and use that as evidence that inference loses money. This author even says it’s implausible that the api charges 4x marginal cost when I think it’s very likely even higher than that. The entire rest of the post sits on this faulty assumption. Fixed costs don’t matter when marginal revenue is profitable and growing rapidly. The ai labs only have 2 questions. Can they prevent users from switching to open source models? Can they scale the number of users on enterprise plans the way they did for coding but in a more general way for all knowledge jobs?

•

jimbokun 9 hours ago

Then what are the real costs?

•

martinald 9 hours ago

Wrote this a while back. https://martinalderson.com/posts/no-it-doesnt-cost-anthropic...

OpenRouter is the best guide to real costs.

•

jimbokun 8 hours ago

Thanks, that’s exactly what I was looking for!

And much more informative than the speculation and guessing in the article.

•

anthonypasq 7 hours ago

agreed, this doesn't even account for prompt caching or the fact that anthropic has substantial proprietary efficiencies on their inference stack specific to their models and scale.

•

bcjdjsndon 9 hours ago

> Can they scale the number of users on enterprise plans the way they did for coding but in a more general way for all knowledge jobs?

Do these knowledge jobs have a significant corpus of not only knowledge but discussion and problem solving, all conveniently labelled for the AI to train on? Probably not. Coding has stack overflow, what does, say, advertising use?

•

HDThoreaun 9 hours ago

I agree this is a hard problem for the labs. I would be hesitant about “probably not” though. There is just as much marketing copy floating around as there is coding training data. I struggle a bit in this question because I’ve only ever worked as a software engineer, so I can’t exactly make claims about all the work other jobs do. But, one example is I was talking to a doctor friend of mine the other day. He was talking about how he had to take his recertification exam recently and put the questions into chatGPT and thought it gave answers that were generally more thoughtful and correct than his own. Does that mean doctors are done? Of course not, but he’s now pushing hard for more ai tool use in his practice.

•

warkdarrior 9 hours ago

> Coding has stack overflow, what does, say, advertising use?

Advertising has centuries of print ads, 100 years of radio advertising, 70 years of TV commercials, etc. And modern AI does not necessarily need labeling.

•

trollbridge 9 hours ago

The article fails to mention DeepSeek, Alibaba, Qwen, Xiaomi, MiMo, z.ai, or GLM. It's hard to take such an article seriously that doesn't do this. (Our monthly total spend is around $180 with a team of 6, about half technical; our biggest line items are for American models or subscriptions which we probably will be planning to get rid of.)

And then remarks like this:

  Anthropic, OpenAI and Microsoft have all now transitioned customers from subscriptions to token-based pricing.

Huh? I use OpenAI via a subscription, as is anyone else using GPT-5.5-Pro who isn't a multimillionaire.

•

jwolfe 9 hours ago

They're referring to Enterprise customers, though should have been clear about it. Enterprise plans on Claude for example no longer include any baseline tokens. It's 100% usage based pricing.

•

trollbridge 7 hours ago

True, but my friends in Enterprise still just purchase Claude Code subs and expense them. They basically get an allowance of $500 or so per month to buy various tools, and of course are banned from Chinese models. (Claude, Codex, Antigravity allowed, basically.)

•

junior44660 9 hours ago

> Our monthly total spend is around $180 with a team of 6, about half technical; our biggest line items are for American models or subscriptions which we probably will be planning to get rid of.

Please tell more :). Do you pay per token from bedrock / openrouter / somewhere else? How many tokens do you use over the month, and how many for each task? Which harnesses?

•

trollbridge 7 hours ago

Pay for DeepSeek directly. One developer insists on having his own account and in theory expenses it, but he forgets to turn in $10 expense reports. (Total spend in last two months = about $45.)

Pay for OpenAI Pro directly, but I’m the only guy that uses Codex. $100 a month. My nontechnical partner likes to talk to ChatGPT 5.5 Pro for image related tasks (think generating interior decorating pics).

The nontechnical staff use a Gemini account on a Google family AI Pro sub. I use Antigravity when working on Android or Google Cloud API codebases.

Everyone gets OpenCode Go. The cost is trivial. $10 a month per person.

Pay for MiMo directly. We use it during Chinese off peak hours though. Total spend so far $25 in last month.

We run a few Qwen models locally and pretty much have them pegged all day. RTX 5090 on a PC and a Mac Studio.

There’s also Grok which is used for Imagine for artistic / graphic design related work. I also use the subscription for a vision model in my oh-my-pi harness.

We’re having discussions about how to pull in GLM-5.2 cost effectively. We compete with third world development shops so we can’t really pass on inference costs, but we can benefit from getting jobs done for customers faster. But ⅔ of our work is either internal or open source projects we can’t bill for.

•

stavros 9 hours ago

Not the GP, but I use Opus for planning, Deepseek for actual coding (implementing the plan) and GPT for review. GPT is inexhaustible on the $20/mo plan, Deepseek is dirt cheap (maybe $10/mo) and Claude is Claude.

•

junior44660 9 hours ago

GP is talking about API / token-based prices, that's why I asked.

•

stavros 9 hours ago

I don't know, he said "subscriptions" in the line items, but eg I use Deepseek via the API.

•

junior44660 9 hours ago

Ah maybe you're right.

I can manage this budget with the Chinese models in AWS Bedrock. However, in my experience, they aren't as good as Claude today.

•

cdata 9 hours ago

I think the author is referring to enterprise customers. You aren't the "customer" in this case; you're the bait.

How do you know that the other models you are referring to aren't subsidized?

•

skeledrew 8 hours ago

Subsidizing makes no sense when there's no - possibility of a - moat. Although it's very possible that China in general subsidizes Chinese labs in some way so they maintain pressure on US labs. But you only have to look at proxies such as OpenRouter to see that the individuals aren't doing any subsidizing on per token costs.

•

SirFatty 9 hours ago

"Crisis"

•

NitpickLawyer 9 hours ago

The Token Tension :)

•

holyknight 9 hours ago

Most of the "affordability" and "pricing" discussion is pointless because we don't have any real numbers on their margins per token. So, yes, they are subsidizing their subscription plans compared to the API prices, but the API prices could already be stupidly inflated, so the relative price comparison is a nothing burger. Until we know (or at least get a hint) on their margins on API prices, any pricing discussion is pointless.

•

kingstnap 8 hours ago

I don't understand this line of reasoning at all.

We have a pretty good idea of how much it costs to serve these models. You can pencil out the economics and guess at the model sizes and we know pretty decently how expensive the hardware is.

This is like claiming it's meaningless to guess the margins of a restaurant without going into their books and seeing the exact receipts and recipes.

They ain't doing dark arts in the back. You can guess at what goes into the food based on similar recipes and how much that costs based on what you pay at the grocery store.
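
A rough sketch of that kind of pencil-out, in Python. Every number here is a made-up placeholder (GPU rental price, throughput, utilization, API price), not a real figure for any lab, and the function names are hypothetical. It only shows the shape of the estimate: hourly hardware cost divided by tokens served per hour.

```python
def cost_per_million_tokens(gpu_hourly_cost_usd, tokens_per_second_per_gpu, utilization):
    """Estimated serving cost in USD per 1M tokens on a single GPU.

    All inputs are guesses the reader supplies; nothing here is measured.
    """
    tokens_per_hour = tokens_per_second_per_gpu * 3600 * utilization
    return gpu_hourly_cost_usd / tokens_per_hour * 1_000_000


def gross_margin(price_per_million_usd, cost_per_million_usd):
    """Fraction of the API price left over after the estimated serving cost."""
    return (price_per_million_usd - cost_per_million_usd) / price_per_million_usd


if __name__ == "__main__":
    # Placeholder assumptions: $3/hour GPU, 1000 tokens/s aggregate, 50% utilization.
    cost = cost_per_million_tokens(3.0, 1000, 0.5)
    # Placeholder API price of $10 per 1M tokens.
    print(f"estimated cost per 1M tokens: ${cost:.2f}")
    print(f"estimated gross margin: {gross_margin(10.0, cost):.0%}")
```

Swap in your own guesses for model size and hardware; the point is that the estimate is sensitive to throughput and utilization, which is exactly where the disagreement in this thread lives.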

•

1vuio0pswjnm7 2 hours ago

Here are all the references in this blog post

https://sequoiacap.com/article/follow-the-gpus-perspective/

https://sequoiacap.com/article/ais-600b-question/

https://www.wheresyoured.at/brokenomics/

https://www.wheresyoured.at/exclusive-openai-financials/

https://www.wheresyoured.at/news-microsoft-to-shift-github-c...

https://archive.is/m5MHe#selection-1483.0-1483.74

https://www.youtube.com/watch?v=MNQDrF0HjtI

https://www.youtube.com/watch?v=VBHSjzHW-C8

https://www.derekthompson.org/p/the-great-ai-cost-panic-of-2...

https://www.tomshardware.com/tech-industry/artificial-intell...

https://blog.dshr.org/2025/10/depreciation.html

https://x.com/ThierryBorgeat/status/2060069195975422281

https://wlockett.medium.com/the-ai-industry-is-panicking-db5...

https://www.sofi.com/learn/content/average-salary-in-us/

https://www.theglobalstatistics.com/united-states-labor-stat...

https://www.bls.gov/news.release/pdf/ecec.pdf

https://www.businessinsider.com/ai-bubble-heads-doomers-sam-...

https://www.wsj.com/tech/ai/openai-considers-drastic-price-c...

https://www.bloomberg.com/opinion/articles/2026-06-11/anthro...

https://arstechnica.com/ai/2026/06/anthropic-pauses-token-ba...

https://x.com/bcherny/status/2040206441756471399?lang=en

https://code.claude.com/docs/en/agent-sdk/overview

https://windowsforum.com/threads/microsoft-plans-june-30-202...

https://www.datacenterdynamics.com/en/news/anthropic-to-use-...

https://techcrunch.com/2026/06/05/google-will-pay-spacex-920...

https://backofmind.substack.com/p/tokenalysis-and-john-henry

HN commenters quickly attack anything from Ed Zitron these days

But this seems to be flying under the radar

•

simianwords 9 hours ago

This is basically bunk because AI costs (api costs) have gone down by 50x or more over the last 3 years.

•

mikgp 9 hours ago

This doesn’t solve the problem because (tautologically) the more AI prices go down the less money the companies make. If right now today the companies are operating at a profit and a price war causes the API costs to sink 90% next year while their capex amortization costs stay fixed, that profit is gone.

The math doesn’t math.
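
A toy version of that arithmetic in Python, assuming invented numbers throughout (volume, prices, marginal cost, and amortization are all placeholders, and the function names are hypothetical):

```python
def annual_profit(volume_millions, price_per_million, marginal_cost_per_million, fixed_amortization):
    """Profit = volume * unit margin - fixed capex amortization."""
    return volume_millions * (price_per_million - marginal_cost_per_million) - fixed_amortization


def breakeven_volume(price_per_million, marginal_cost_per_million, fixed_amortization):
    """Volume (in millions of tokens) needed to cover fixed amortization.

    Returns None when the unit margin is zero or negative, since no volume can break even.
    """
    unit_margin = price_per_million - marginal_cost_per_million
    if unit_margin <= 0:
        return None
    return fixed_amortization / unit_margin


if __name__ == "__main__":
    # Placeholder assumptions: 1000M tokens sold, $0.50 marginal cost, $5000 fixed amortization.
    before = annual_profit(1000, 10.0, 0.5, 5000)  # price of $10 per 1M tokens
    after = annual_profit(1000, 1.0, 0.5, 5000)    # same volume after a 90% price cut
    print(before, after)
    print(breakeven_volume(10.0, 0.5, 5000), breakeven_volume(1.0, 0.5, 5000))
```

With these placeholders the same volume goes from a profit to a loss, and the break-even volume rises by far more than the price fell, because the fixed amortization has to be covered by a much thinner unit margin.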

•

skeledrew 9 hours ago

AI prices going down means the models are improving, particularly from the efficiency angle (which is inevitable, given the nature of tech). That means all they have to do is maintain a large enough customer base at a rate high enough to ensure loss decreases continuously over time, until eventually they pass the point where they're just gaining. Healthy competition ensures that improvement savings are actually passed on to users in a measured manner, so they don't become too greedy in trying to get to and increase gains.

•

mikgp 8 hours ago

But now you’re describing a commodity, and the competition will erode profits, and their valuations are bananas, unless someone can find a business model that truly differentiates and creates a moat.

•

simianwords 8 hours ago

Models are not commodities and are famously non fungible. Each model has its quirks and strengths, weaknesses and idiosyncrasies.

I know because I see how people went over the 4o model. I can see opus behaving clearly differently enough that I pick it for certain tasks.

•

mikgp 8 hours ago

Is this really for comparable models though? Will folks at scale continue to choose Anthropic's frontier AI model if OpenAI releases a similar generation at a 90% discount with comparable capabilities? It feels like the fungibility assumes delineation by capability _and_ cost. No one is choosing sonnet over opus at similar price points.

•

simianwords 2 hours ago

I could get into this line of debate that I find interesting but it doesn't contradict my point that the article itself is wrong and written on false assumptions.

•

Etheryte 9 hours ago

This doesn't really tell you anything useful. AI companies have both built huge datacenters and raised a colossal amount of money. Add in caching, quantization, etc. All of those would allow them to undercut on price considerably, even more so if you count in all the users who don't actually cap out their plans. Prices going down doesn't really tell you anything about the production cost, especially in a market where every major participant is happy to burn money just for the marketshare.

•

Npovview 9 hours ago

There are many open research avenues that could reduce cost dramatically: smaller task-specific, language-specific, or domain-specific models, which could even be better. The earliest computers were the size of a building, so predicting the unknown future from the current state is a mistake. The hardware will be all the more valuable if cheaper ways to run models become possible. The hardware gets cornered in a sense.

•

bcjdjsndon 9 hours ago

Because of its unpredictability and massive dependence on the training data, when LLMs start hallucinating, most of the time the only fix these "engineers" have is to feed it another LLM... The genius was the transformer architecture, and evidently none of us have a damn clue how it works

•

esseph 9 hours ago

Every 6-12 months or so we get an increase in one or more of things like: compute power, compute efficiency, GPU power, GPU efficiency, network bandwidth increase, memory speed increase, component density increase in the same form factor, etc.

For a while it was every 2-3 years you'd start a hardware refresh. As companies moved into more and more training, this timeframe started to shrink. It went from 36 months to 24 months. From 24 months to around 16-18 months. Last I checked last year, it was at 12 months. I think things may have slowed because of component availability, but otherwise whole data centers would be 6-12 months into full operations before they would start a refresh cycle.

Not to mention the massive increase in power density demand and cooling demand per rack that entails.

So no, "AI costs" have not gone down, in fact they are more expensive on training AND inference than ever.

This is why many are concerned about the heroin drip of api costs into orgs. For the companies that are public, look into their financials. It's gonna hit companies and high volume users like a ton of bricks.

•

beepbooptheory 9 hours ago

I'm no economist but if true don't you have the opposite problem? How do you get people to need X many tokens per day such that you can sell enough to make money? Wouldn't you need an absence of competition for that to be ok?

•

anthonypasq 7 hours ago

the demand for intelligence is infinite. you sound like someone in 1960 wondering what the hell we would even do with the functionally infinite cpu cycles we have available to us now.

•

simianwords 8 hours ago

If you are an AI bear you have multiple techniques with you

- if AI costs go down you can ask how the companies will make profit and then suggest the bubble popping

- if AI costs go up you can ask how people will afford it and then suggest the bubble popping

- if companies actually do make profit then you can say the companies are getting too big and powerful so it’s a bad thing for consumers

Essentially you have left zero to a small narrow path where you are happy with the outcomes.

•

beepbooptheory 6 hours ago

I get your point but it still like begs the question right? If you are optimistic about it all, what is the good narrative? What does it all look like? Billions upon billions of prompts to the finest models every millisecond? And then we have to scale on top of that? To like what end? How many apps do we need to code? How many questions can a single person even ask on any given day? Do I lack some imagination here?

Like what if they don't necessarily have to be super duper money making machines to legitimate how useful and nice they are for you? Is that even conceivable? What if tomorrow we all decided they are more like utilities? Would that change anything intrinsic about them for you?

•

josefritzishere 9 hours ago

Can you cite a source? Everything I've read describes the costing as linear with growth.

•

trollbridge 9 hours ago

The quality of what you can get from DeepSeek V4 Pro for $10 is light years ahead of what you could get for $20 a year ago.

Likewise, the quality of what I can get from a local model like Qwen 3.6 on an RTX 5090 is light years ahead of what I could get a year ago on the same hardware.

•

simianwords 9 hours ago

https://simianwords.bearblog.dev/conclusive-proofs-that-llm-...

•

josefritzishere 9 hours ago

That article seems a bit bogus. Cost per capability is a soft, non-predictive model unlike cost per token which has been trending up.

•

simianwords 8 hours ago

This is just hand waving on the obvious consensus that cost per capability is going down. There’s no doubt about it. Hell you can run a Gemma 4 model on your laptop that mogs GPT 4. But yeah you can use fuzziness as an excuse and ignore the trend.

•

worldsavior 9 hours ago

What?

•

claaams 9 hours ago

He's saying output for 1M tokens on the latest models is $50 now when it used to be $2500.

•

whateveracct 9 hours ago

so how are these labs going to recoup the insane training costs at those prices? even if there is still a fat margin leftover afterwards

•

mynameisbilly 9 hours ago

They also have to continuously train, forever, to avoid model drift. It's not a one and done thing as far as I'm aware.