Are Anthropic and OpenAI rushing to IPO for immediate cash so they can delay the inevitable? Surely this cycle of robbing Peter to pay Paul to pay John to pay Tim must end.
We are only just now getting a taste of the “true cost” of these tokens. Then there is the compute shortage bottlenecking everything. Even now I’m looking at the 7.5x token rate for Opus 4.7.
Open models are promising and cost a fraction of what the proprietary models do, which leaves the big two vulnerable once companies start to feel the cost of tokens.
Will data centres be built fast enough, and powered sufficiently, to lower the cost of compute and thus of tokens?
Is it just a giant Hail Mary to get to AGI ASAP before the economy collapses?
Above all else, I simply feel the models have plateaued. I am noticing a productivity loss for tasks I deem “complex”.
I think a LOT of companies never really needed to be on the public market, and it's a darn shame that so many go on the stock market. We have this obnoxious culture where you have to fire tons of people after a bad quarter just to show you're stopping the bleeding. Companies literally fire and hire x number of people every quarter to keep things going; it's ridiculous and unhealthy. Private companies rarely work like this, though I'm sure there are exceptions.
Every company I've worked at started off private, and those were their golden years, until some economic hurdle happened so they sold it off to a bigger fish who is on the stock market, who bought them to be more attractive to investors or what have you.
I wish there were an alternative to the stock market where you invest for the long haul and cannot take your money out for x number of years. I think this would make more sense. Maybe it doesn't fix the VC-wanting-their-money-back nonsense, but if you could do it even for early-stage companies, maybe it could help somewhat.
There are very stable companies on the stock market, like Coca-Cola. But they are not glamorous and don't make headlines.
And there are enormous fish in the private market, e.g. Cargill.
Stock markets are great if you have a company that needs money to expand quickly and you don't mind sharing ownership. Stay away from the IPO-jackpot stuff, and it shouldn't be that awful.
That exists already! People often complain as well when a company ends its golden years because of some economic hurdle and ends up being acquired by a bigger fish who is _not_ on the stock market.
Seriously though, I have seen some very large companies like Tibco and Dell go private for an extended period of time as a means of avoiding shareholder nonsense during restructuring.
To read more: https://en.wikipedia.org/wiki/Private_equity
Fixed your error.
Let companies fail, but also let's make investing smarter.
This reads to me like Anthropic anticipating demand and making a commitment to acquire supply. Not unlike airlines committing to future jet fuel purchases, or Apple committing to future DRAM volume.
At the current price or the real price? Anthropic said a $200 subscription can cost them $5000, so the real price could be anywhere from 10-30x the current one.
It's likely Amazon is making a fucking killing though.
> Most likely the subscription inference cost is much lower than you expect.
This is probably not true because they'd be screaming it off every rooftop were that the case.
Same deal with the API inference. Even the "profitable on inference" claim is sourced back to hearsay of informal statements made by OpenAI/Anthropic staff. No formal announcements, nothing remotely of the "You can trust what I'm saying, because if I'm lying the SEC will have my head" sort.
Yet making such statements would be invaluable. If Anthropic can demonstrate profitability before OpenAI, they could poach most of the funding. There's no reason to keep it a company secret.
And API inference is only part of the total costs, before even bringing in training and ongoing fine-tuning. If they're not even profitable on inference, how could they hope to be profitable overall?
A bit of Google searching gets us to a specific interview: https://www.dwarkesh.com/p/dario-amodei-2
> Let’s say half of your compute is for training and half of your compute is for inference. The inference has some gross margin that’s more than 50%.
But the context, the sentence immediately before it, is:
> Think about it this way. Again, these are stylized facts. These numbers are not exact. I’m just trying to make a toy model here.
Here, Amodei is in effect using weasel words. He is not making any actionable claims about Anthropic's margins, merely plucking an arbitrary number. Why 50%? Is 50% reasonable? Is 50% accurate to the company? Those are all conclusions the listener draws, not Amodei.
> I don't know about SEC rules
The main premise is that, as a CEO, there are regulations you are beholden to. You're not allowed to announce you've made a trillion-dollar profit, sell all your stock, and then go "teehee, just kidding". The SEC will come after you for securities fraud if you do that sort of thing.
This makes weasel words like the earlier ones suspicious, because the exact statement Amodei gives is not prosecutable. He's not saying anything about the company, just doing a little "toy model".
The degree to which it is intentional that this hearsay travels and gets extrapolated from "well, he picked 50% because it's a reasonable figure, and since he's the CEO, a reasonable figure would have to be one akin to what his company can achieve" into "Anthropic has 50% margin" is up for debate. Maybe it is intentional; maybe Amodei is exactly the same kind of shitweasel as Altman. Probably he's just a dumbass who runs his mouth in interviews and for whatever reason cannot issue the true number in an authoritative statement to dismiss this misconception.
Hence my original comment: if the real number were better than the hearsay rumours, Amodei would immediately issue a correction; it'd be great for the company. Hell, even if 50% were about the margin, that'd be great! Promoting that from mere hearsay to "we're profitable, go invest all your money" would also be huge. Really, any kind of margin at all would put him ahead of OpenAI.
But he doesn't issue a correction. He doesn't affirm the statement. Perhaps he has other reasons for that, but a rather big reason could be that the margin number is in fact pretty bad.
Now, the observant reader will note I am also using a weasel word here. I do not know whether the number is good or bad; your takeaway should be "it could be bad", not "it is bad". Go pressure Amodei into giving us the real number.
Anti-fraud regulators like the SEC give an inherent trustworthiness and credibility to CEOs and other market participants. You can trust that they're not lying to you, because they would be sent to jail if they were.
Another example is general anti-fraud regulation: consider how one would trust North American or European steel suppliers more than Chinese steel suppliers.
It's not that the Chinese are "evil lying people" and Americans are "saints who never lie"; it's that you can trust American, Canadian, and European courts to hold liars accountable under those regulations even if you're not in any of those regions, whereas Chinese liars won't be held accountable.
The opposite also holds: if someone opts out of the credibility granted by anti-fraud regulations, their words may not be quite so truthful.
The 50%+ margin statements have basically been "we are making 50% on delivering it." This does not include ANY of the costs of getting to this point: training, scraping, datacenters, people, and so forth.
They are basically saying "oh yeah, the cost of GAS for the car is only X, so charging Y per mile is great margin" while ignoring maintenance, the cost of acquiring the car, and so forth.
Sam Bankman-Fried, Elizabeth Holmes, Kenneth Lay - and hundreds if not thousands more.
The SEC is a regulatory agency and cannot bring criminal charges. The above-named, for the most part, had to be prosecuted by the Department of Justice or, in some cases, state attorneys general.
I think if you're not Anthropic and you don't have access to the actual data, you can't say for sure. A bunch of anecdotes from terminally-AI people on Twitter is not making a convincing case to me.
On the other hand, if similarly sized models cost much, much less to run than this, why in the world would Anthropic have much higher costs?
Also, as a counterpoint: maybe they want you to think they have higher costs so you're more willing to actually pay for it?
Business buyers are paying API prices, not subscription prices.
Disclosure: Work at Microsoft on AI
If Amazon believes that story they’d be crazy not to invest.
In short: per-token charges currently cover maybe 1% of the total costs in this field. To pay ongoing costs, and pay back investors, everyone will need to pay 100x or 1000x the current rates, likely for decades.
There are plenty of seemingly informed people saying the exact opposite, so that's a lot of confidence you're talking with. I have a hard time believing it when we know what open weights models cost to run. And sure, there's training costs, but again many say inference costs are already above training costs.
Gemma-4 26B-A4B + M5 MacBook Pro + OpenCode isn't Claude Code _yet_, but it's good enough that if I were forced to use it I would be fine.
Models are likely going to keep getting better, and as costs go down, demand is likely to rise faster.
Huh? Why would that happen? Indications are that costs will likely go up, especially if currently vendors are selling tokens at a loss.
Even if you generously depreciate the GPU and other hardware, it’s hard to believe inference at scale in April 2026 isn’t highly profitable.
I think you meant dollars of electricity.
https://www.theregister.com/2024/03/18/nvidia_turns_up_the_a...
A Blackwell 8X node consumes about 15 kW; let's bump that up to 50 kW to generously account for cooling and everything else.
A US kWh costs something like $0.20, so running that node for an hour costs ~$10.
Nvidia got 30,000 parallel TPS out of DeepSeek-R1 on that node:
https://developer.nvidia.com/blog/nvidia-blackwell-delivers-...
So that $10 buys you over 100M tokens or … pennies per million.
I’m sure these numbers are off, but not by an aggregate two orders of magnitude.
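For anyone who wants to check the arithmetic, here it is spelled out. All inputs are the rough figures quoted above (padded node power, ballpark US electricity price, Nvidia's reported throughput), not measured values:

```python
# Back-of-envelope check of the inference electricity cost above.
# Every input here is a rough assumption from the discussion, not a measurement.
node_power_kw = 50        # 15 kW for the 8x Blackwell node, padded to 50 kW
price_per_kwh = 0.20      # ballpark US electricity price, $/kWh
tokens_per_sec = 30_000   # Nvidia's reported DeepSeek-R1 parallel throughput

cost_per_hour = node_power_kw * price_per_kwh       # dollars per node-hour
tokens_per_hour = tokens_per_sec * 3600             # tokens per node-hour
cost_per_million = cost_per_hour / (tokens_per_hour / 1e6)

print(f"${cost_per_hour:.2f}/hr, {tokens_per_hour / 1e6:.0f}M tokens/hr")
print(f"~${cost_per_million:.3f} per million tokens")
```

With these assumptions it works out to roughly 9 cents per million tokens, i.e. "pennies per million" as claimed; even if each input is off by a few x, the conclusion survives.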
> inference is indeed profitable
The unit economics for today’s frontier models should be great, and this suggests Anthropic believes they’ll get better.
We might see a one-time improvement in inference costs when we move off GPUs onto more limited but efficient dedicated hardware, but the sustained fast pace of improvements is far behind us.
But if progress begins to slow down, then the economics work. Maybe Gemma 4 is a good example. It feels really generally useful. Getting it at 1/10th the cost feels like it could be competitive in 2 years.
It’s just that the pace of new stuff is slowing down, and many people are operating under the assumption that this wave will ride on forever.
Has there been a ton of hype? Absolutely but the value proposition is getting more and more tangible.
Did some of the AI companies overcommit on spending? I'm sure, and they will probably hurt in the long term. I thought Anthropic had been scaling toward profitability on a quick timeline, though.
Most of this is still structured around "find use cases for AI" rather than one (or more) clear use cases being the reason for adopting AI.
There's no "Lotus 1-2-3" of AI. Even the software development applications are still somewhat controversial and highly pushed based on "Sam Altman promised me 10x developers".
1. Model providers are currently profitable when just counting the cost to serve tokens for inference[1]. They lose money training the next generation of models.
2. Open models don’t work nearly as well. Given that tokens are still relatively cheap, and hallucinations are expensive, I’ve not seen a huge uptick in open-model usage for coding agents yet.
3. On the AI economy front, I really have no idea, but AI companies (meta, msft) have already come down in value. It seems investors are at least a little wary of AI over valuation. Of course, the stock market is not the economy, but it’s not clear where warning signs would be. Earnings are healthy.
1: https://martinalderson.com/posts/no-it-doesnt-cost-anthropic...
2: https://www.economist.com/finance-and-economics/2026/04/20/a...
Sure my gross margin might be $2 on each sammie sold but I need to sell 500,000 sandwiches just to break even to be a viable business. The fact is these AI companies are playing the game where they talk about revenue and gross profit per token and just try to wave their hands in the face of anyone looking behind them at the crater they're throwing investor money into.
It's nothing but a gamble for AGI but the grand irony is that if that genie escapes out of the bottle the whole world economy is toast and money becomes meaningless anyway. I just can't comprehend the logic of why anyone is investing in this apart from short term gains.
EDIT: I spent most of the day today pulling an 8/3 cable through conduit and routing it through a crawlspace to run 240V service to my barn for a workshop. If Tay wakes up tomorrow and becomes AGI, how will that help me finish the wiring job? Now extrapolate to almost every single other thing humans do. Even if Tay can write all the world's computer programs forever, it barely means anything for the vast majority of people, and therefore the global economy.
You're absolutely right that an AGI isn't running a cable or digging a hole any time soon, but you're going to have 100 people trying to get their hands on the shovel to get paid for the digging - depressing the wages in those hands on jobs.
[1] Not to impugn such activity! They may make important cosmological discoveries by doing this, but the work likely has no economic value.
Why do you believe that? A better metric would be the price per token of open models served by third parties. Last I was tracking it, the price for a similar-level model was decreasing by more than 10x year on year, and they are 10-100x cheaper than the top proprietary models.
Sure, you can say you can't compare them, but you can certainly compare the top proprietary model of 6 months back to current open models, and that gap in time seems to be constant.
It's only a matter of time until they crash the market. Nobody is making any money, even with the White House dumping billions into their tools.
I use Gemini quite extensively - I have a 5TB storage plan with Google so I get Pro thrown in. I also have Github Copilot Pro for IDE integration.
However, lately it feels like I keep tripping Gemini's circuit breaker more easily and getting the message about having used up all my Pro tokens for the next 3 hours.
I used to be able to work most of the day before it hit the brakes, but now I can trigger it before work in the mornings... that seems to me like they're tightening the usage limits!
I use a Dell Micro PC with an Intel Core Ultra 265, so it's nice and fast, but it has no GPU, hence the reason I use Gemini. I'm now starting to think that, despite the RAM cost, I'll buy a PC with a monster GPU before the end of the year and run all my AI locally... the direction of travel is clearly heading towards a massive cost increase, so I might as well get ahead of it: it's not going to get cheaper, that's for sure!
Anthropic are scared of open-weight models and need to fear-monger to keep you paying for theirs.
That's the whole point of their 'safety' marketing narrative, the account bans, and Dario playing the AI scarecrow, scaremongering the world about nonsense like 'Mythos'.
'Mythos' is already here in the form of open-weight models that also found the same vulnerabilities as Anthropic did.
Mythos screams marketing hype, and nothing more. Opus 4.7 isn't a meaningful upgrade in any sense, other than being more expensive.
Once you see what something like Qwen3.6-35B-A3B can do with just a FRACTION of the size of the larger models, you'll understand that the future is open-weight models you can run yourself.
Same goes for companies, bringing inference onsite isn't hard, I'm actively building tooling to orchestrate it.
My quick read of the process they describe: first they ask agents to rank files in order of potential to have interesting bugs, then they launch agents for each file in order of "interesting bug potential", and finally launch another agent for verification. (Maybe I am mistaken; this is my read of this post: https://red.anthropic.com/2026/mythos-preview/ )
It's not clear to me whether they made just one pass over each file or several, but regardless, I think if you recreate roughly the same process and burn $20,000 on tokens with another reasonably good model, you will find some fancy bugs too.
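That read of the process could be sketched roughly like this. Everything here is hypothetical: the function names are made up, and the stubs stand in for actual model calls in whatever harness you'd use; this is just the orchestration shape, not Anthropic's actual pipeline.

```python
# Hypothetical sketch of the described process: rank files by "interesting
# bug potential", scan them in that order, then verify findings separately.
# All names are invented; each stub stands in for a real agent/model call.

def rank_files(files):
    # Stand-in for "ask an agent to score each file for bug potential".
    scores = {f: len(f) for f in files}  # placeholder heuristic, not a model
    return sorted(files, key=lambda f: scores[f], reverse=True)

def scan_file(path):
    # Stand-in for launching a bug-hunting agent on a single file.
    return [f"candidate bug in {path}"]

def verify(finding):
    # Stand-in for the separate verification agent.
    return True  # placeholder: a real verifier would reject false positives

def hunt(files):
    confirmed = []
    for path in rank_files(files):
        for finding in scan_file(path):
            if verify(finding):
                confirmed.append(finding)
    return confirmed

print(hunt(["kernel/net.c", "lib/parse.c"]))
```

The point of the structure is that the expensive scanning budget gets spent on the highest-ranked files first, and the verification pass filters the noise; nothing about it is specific to any one model.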
I think this can keep going for at least another 5 years.
In a system of open-ended growth, yes, you can point to how long the system has persisted as evidence of its longevity. But in a system of plateauing growth, the system's age is an indicator of how close it may be to death. I suspect that the model that permitted the "success" of Uber and Tesla is nearing the end of its lifetime.
At a very minimum, to repay the $100B+ in investment within a reasonable timeframe, what's the minimum figure they have to bank post-tax each month?
Also, investment is not money in the bank; they can't withdraw $100B tomorrow. The commitment is spread over several years, so they don't have to repay it until well after they've actually received it.
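For a rough sense of scale, here's the naive division. The timeframes are my assumptions, and this deliberately ignores interest, taxes, and the return multiples investors actually expect:

```python
# Naive illustration: what "repay $100B over N years" means per month.
# Timeframes are assumptions; interest, tax, and return multiples ignored.
investment = 100e9
for years in (5, 10):
    per_month = investment / (years * 12)
    print(f"{years} years -> ${per_month / 1e9:.2f}B per month")
```

Even on the generous 10-year horizon that's most of a billion dollars a month, before any of the ignored factors make it worse.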
When you're selling $10 bills for $1, revenue is meaningless.
I could sell $100B of GPUs at 90% of their cost tomorrow and claim I have market acceptance.
I am completely confident that Amazon of all companies is totally fine with not taking a return for a long time.
Amazon didn't book a profit for its first decade. It's their standard modus operandi to burn, burn, burn to get as big as possible.
By all accounts they are within striking distance of profitability if they wanted it.
It makes sense; Anthropic is by far our biggest vendor expense outside of AWS. And I suspect that is true at a number of companies.
By their accounts they are within striking distance of profitability. Until they go public, all we can do is estimate how much they burn by looking at how quickly they need more capital; this latest Amazon deal ($5B invested against $100B in committed spend over 5 years) tells me their previous raises have been spent.
The biggest risk to AI companies IMO is further optimization and distillation of the capabilities into smaller and more efficient models. The moat these companies have right now is that higher intelligence requires more specialized and expensive compute; if you can get that cheaply, it negates their business model. Everything is moving fast, and we have also yet to see world models/embodied AI and how they will impact things. I think we've reached the peak of what pure text-trained LLMs are capable of.
a) effective price-per-token is rising b) there is insufficient compute to meet the demand.
And your conclusion is that the industry is circling the drain and due to collapse?
a) cost per successful task is rising (e.g. the Claude Max allocation is functionally shrinking)
b) is there enough potential cost reduction in the queue to make up the gap?
c) if open models converge on a more efficient but slightly-less capable point (which has effectively happened) what is the actual moat?
And yet - Anthropic is still struggling to have enough capacity to serve demand - they are virtually sold out.
And yes, there are almost-as-good open models, on par with the closed models from 6 months ago (at worst), just a single OpenRouter API call away, and yet Anthropic is still selling out. So people are paying for the premium product anyway, for whatever reason: maybe the last bit of intelligence is worth it, maybe they like the harnesses/products around the models, maybe it's a brand/enterprise-sales thing.
Put aside your feelings about the AI industry and imagine we are talking about thingamajigs. Prices for thingamajigs are going up. They are still selling out about as fast as (or faster than) the company selling them can build factories. There are more cost-effective competitors already in the market, but thingamajigs are selling out anyway.
Would you, looking at the thingamajig industry, conclude the "jig is almost up"? That "the returns aren’t anywhere close to what investors expect" and that the impending IPO is all some desperate hail mary to save things before the collapse?
What we are looking at is rapidly becoming a commodity: it will become as existential as electricity and water to businesses, and it will be sold, marketed, and regulated more or less like a utility.
...building datacenters will not lower the cost.
The cost (the real one, not the investment-hype-subsidized one) will only drop with:
* more efficient models
* the GPU/RAM market going back to reasonable pricing.
The AI bubble pumped the second into unsustainable pricing, and progress on the first is going... slowly.
It feels like these hyperscalers are just raising as much as they can, giving extremely rosy projections, because sooner or later a peak is going to be reached (if that hasn't happened already).
What does "on time" mean? You'll need to negotiate with local authorities, some friendly, some not. Data centers aren't exactly popular neighbors these days. Then negotiate with the local power utility. Fingers crossed the political landscape doesn't shift and your CEO doesn't sign a contract with an army using your product to pick bombing targets, because you'll watch those permits evaporate fast.
Then there's sourcing: CPUs, GPUs, memory, networking. You need all of it. Did you know the lead time for an industrial power transformer is 5+ years? Don't get me started on the water treatment pumps and filters you can't even get permitted without. What will you do in the meantime? You surely aren't going to get preferential treatment from AWS / Google / ... if they know you are moving away anyway. Your competition will.
The risk and complexity are just too big. AI/LLM is already an incredibly complex and brittle environment with huge competition. Getting distracted building data centers isn't enticing for these companies, it's a death sentence.
You're not wrong about the rest but no AI company would ever build a data center in every continent for this, even if they were prepared to build data centers. AI inference isn't like general purpose hosting.
This may be true for simpler cases where you just stream responses from a single LLM in some kind of no-brain chatbot. If the pipeline is a bit more complex (multiple calls to different models, not only LLMs but also embedding models, rerankers, agentic stuff, etc.), latencies quickly add up. It also depends on the UI/UX expectations.
Funny reading this, because the feature I developed can't go live for a few months in regions where we have to use Amazon Bedrock (for legal reasons), simply because Bedrock has very poor latency and stakeholders aren't satisfied with the final speed (users aren't expected to wait 10-15 seconds in that part of the UI, it would be awkward). And a single roundtrip to AWS Ireland from Asia is already like at least 300ms (multiply by several calls in a pipeline and it adds up to seconds, just for the roundtrips), so having one region only is not an option.
Funny though, in one region we ended up buying our own GPUs and running the models ourselves. Response times there are about 3x faster for the same models than on Bedrock on average (and Bedrock often hangs for 20+ seconds for no reason, despite all the tricks like cross-region inference and premium tiers AWS managers recommended). For me, it's been easier and less stressful to run LLMs/embedders/rerankers myself than to fight cloud providers' latencies :)
>then put all of your data centers there
>You definitely don't need a data center in every continent.
Not always possible due to legal reasons. Many jurisdictions already have (or plan to have) strict data processing laws. Also many B2B clients (and government clients too), require all data processing to stay in the country, or at least the region (like EU), or we simply lose the deals. So, for example, we're already required to use data centers in at least 4 continents, just 2 more continents to go (if you don't count Antarctica :)
We're talking about billions of dollars of extra capex if you take the "let's build them everywhere" side of the bet instead of "let's build them in the cheapest possible place" side. It seems to me that you'd have to be really sure that you need the data center to be somewhere uneconomical. I think if you did build them in the cheap place, it's a safe bet that you'll always have at least enough latency-insensitive workloads to fill it up. I doubt that we would transition entirely to latency-sensitive workloads in the future, and that's what would have to happen for my side of the bet to go wrong. The other side goes wrong if we don't see a dramatic uptick in latency-sensitive inference workloads. As another comment pointed out, voice agents are the one genuinely latency-sensitive cloud inference workload we have right now; they do need low latency for it. Such workloads exist, but it's a slim percentage so far.
I believe I'm taking the safe bet that lets Anthropic make hay while the sun shines without risking a major misstep. Nothing stops them from using their own data centers for cheap slow "base load" while still using cloud partners for less common specialized needs. I just can't see why they would build the international data centers to reduce cloud partner costs on latency-sensitive workloads before those workloads actually show up in significant numbers.
After the initial announcement of "fast mode" in Claude Code, did you ever hear about anyone using it for real? I didn't. Vanishingly few people are willing to pay extra for faster inference.
Remember that the time-to-first-token is dominated by the time to process the prompt. It's orders of magnitude more latency than the network route is adding. An extra 200 milliseconds of network delay on a 5-10 second time-to-first-token is not even noticeable; it's within the normal TTFT jitter. It would be foolish to spend billions of dollars to drop data centers around the world to reduce the 200 milliseconds when it's not going to reduce the 5-10 seconds. Skip the exotic locales and put your data centers in Cheap Power Tax Haven County, USA. Perhaps run the numbers and see if Free Cooling City, Sweden is cheaper.
Large data centers consume as much power as a small city. The location decision is about being able to connect to a power grid that is ready to supply that.
Evaporative cooling also needs steady water supply. There are data centers which don’t operate on evaporative cooling but it’s more equipment intensive and expensive.
Latency doesn’t matter. You can get fast enough internet connected to these sites much more easily than finding power.
* data transit across the world can be very slow when there are network issues (a fiber cut somewhere, congestion, BGP doing its thing, etc.). Having something more local can mitigate this.
* several countries right now have demented leaders with idiotic cult-like followers. Best not to put all your eggs in those baskets.
* wars, earthquakes, fires, floods, and severe weather rarely affect the whole planet at once, but can have rippling effects across a continent.
And frankly, the real question isn't "why spread out the DCs?", it's "what reason is there to put them close to each other?".
Heck, look at Facebook. Granted, they got started slightly before AWS, but not by much. Owning all of their own data centers is a huge competitive advantage for them, and unlike most of the other hyperscalers they don't sell compute to other companies (AFAIK).
Again, the commitment is for $100 billion in spend. Building lots of data centers for a lot cheaper than that price should absolutely be doable. Also, geographic distribution isn't nearly as important for AI companies given the way LLMs work. The primary benefit of being close to your data center is reduced latency, but if you think about your average chatbot interface, inference time absolutely swamps latency, so it's not as big a deal. Sure, you'd probably need data centers in different locales for legal reasons, and for general diversification, but, one more time, $100 billion should buy a lot of data centers.
Every single argument you've brought up is irrelevant in the face of billions of dollars. If you intend to consume $100 billion dollars in data center infrastructure, you're going to find a way to accomplish it while cutting out the middlemen.
Meanwhile if you're flaky and never intend to spend that money, you're going to come up with a way to pay someone else to deal with those problems and quit paying the moment they don't.
You'd never do both at the same time. You'd never commit your money and give them control over your business critical infrastructure.
Hence the deal is a sham. The $100 billion are a lie. Thank you for telling us.
You can’t even get hardware at that scale without months or years of order lead time. Nvidia doesn’t have warehouses full of compute hardware waiting for someone to come get it.
They also reused an existing building. Basically, they put 100,000 GPUs into a building and attached the necessary infrastructure in about half a year. Impressive, but it’s not the same as a $10B/year data center usage commitment like this deal.
If you build datacenters, you have to spend that money now.
They're also not paying amazon to order GPUs, they're paying for compute usage of whatever hardware they have.
Colossus initially had ~200k GPUs. $100B buys you ~1 million high-end GPUs running 24/7 for a year at AWS retail prices.
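As a sanity check on that figure: the hourly rate below is my assumed ballpark for on-demand H100-class retail pricing, not a quoted AWS price, so treat the result as order-of-magnitude only.

```python
# Sanity check on "$100B ~ a million high-end GPUs, 24/7, for a year".
# The per-GPU-hour rate is an assumed retail ballpark, not a quoted price.
budget = 100e9
usd_per_gpu_hour = 12.0      # assumed on-demand H100-class rate, $/GPU-hour
hours_per_year = 24 * 365

gpus_for_a_year = budget / (usd_per_gpu_hour * hours_per_year)
print(f"~{gpus_for_a_year / 1e6:.2f} million GPUs running 24/7 for a year")
```

At that assumed rate it comes out just under a million GPU-years, so the "~1 million" claim is in the right ballpark; volume discounts would push it higher, newer chips lower.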
They also reused an existing building that happened to be in the right place at the right time. The larger data center buildouts would almost always need new, dedicated construction.
If Anthropic/OpenAI miss projections, infra providers can likely still turn around and sell the capacity to the next guy, or use it themselves. If they have more demand than expected (as Anthropic currently does), VCs will throw money at them and they can outbid the competition.
If they built it themselves and missed projections, it would be a much more expensive mistake.
It's just risk sharing. Infra providers take some of the risk and some of the upside
Not if their pricing comes with multi-year commitments for reserved capacity. No doubt they get a huge volume discount, but the advertised AWS reserved pricing after only a one-year commitment is already enough to pay for a whole 8x HX00 pod, plus the NVIDIA enterprise license, plus the staff to manage it. On-demand pricing is significantly more expensive, so they're going to be boxed in by errors in capacity planning anyway (as has been happening the last few months).
The economics here are absurd unless you’re involved in a giant circular investment scheme to pump up valuations.
Afterwards, Amazon will be milking the machines these commitments buy for nearly a decade. That tradeoff makes sense at a small scale (even up to $X00 million, or even billions), but at $Y0 or $Z00 billion?
Color me skeptical. There are plenty of other side benefits like upgrading to the newest GPUs every few years, but again we’re talking about paying for new buildouts with upfront commitments anyway.
* obviously the timelines, scientific risk, and opportunity cost make this completely infeasible but that’s the scale we’re talking about. It’s a major industrial project on the scale of the thirty year space shuttle program (~$200 billion).
> The Anthropic deal specifically covers Trainium2 through Trainium4 chips, even though Trainium4 chips are not currently available. The latest chip, Trainium3, was released in December. On top of that, Anthropic has secured the option to buy capacity on future Amazon chips as they become available.
There is a famous quote from the Polish economist Kalecki, that "economics is the science of mistaking a stock for a flow". Essentially this form of lending continues while everybody can make interest payments, and blows up horribly as soon as somebody can't - as I have no doubt all those concerned are fully aware.
Interesting...
It’s common even for smaller companies to do mutually beneficial business with each other. It’s actually helpful to do business with people who are also your customers because you have a relationship with them and you also have leverage: They are extra incentivized to treat you well because they don’t want to upset any of the other business you have with them.
Isn't that almost all that matters when comparing doing something yourself versus paying someone else, in this case Amazon, to do it for you?
If you’re not sure it’s going to blow the socks off, foisting capital investment on partners is a great deal.
See the difference in companies/franchises that always own the land/building and those that always lease.
Why this versus us being in a temporary bottleneck? Like, railroads became expensive to build everywhere in the 19th century not because we reached Earth's capacity for railroads or whatever, but because we were still tooling up the industry needed to produce them at higher scales.
In the meantime if you work on revenue generating work, that side of PnL is uncapped. So you can either put some engineers on reducing your costs at most by 100% or, if they worked on product ideas they could be working on things that generate over 9000% more revenue.
However, there are certain advantages, like supply chains, that only established companies have access to. This is also a commitment to spend up to $100B on an internal approach and research. I would expect them to come up with their own CPU chip and device design. This will shift the focus to an internal approach, and might make Amazon give better prices later down the line.
Just a guess.
I do think a ton of businesses would benefit from running their own hardware, but they're not getting five billion dollars to stay on the cloud.
Everybody does right now, right?
But: is it your core competency?
Can your firm afford the distraction?
https://www.anthropic.com/news/google-broadcom-partnership-c...
Wonder if Anthropic is making a mistake by focusing on "consumer" hardware, and not going super specialized.
Comments like yours add nothing to the discussion.
You can throw money and hardware at a problem, but then someone may come along with a great idea and leapfrog you.
Just consider that all major AI providers now use DeepSeek's ideas for efficient training from that first paper.
edit: I misunderstood, I thought you were implying they designed their own GPUs. nevermind
I distinctly remember Sam Altman and co. getting their panties in a twist because the Chinese took their stuff, the stuff OpenAI and co. spent billions to create, and used it as the base for $0.00.
I mostly see their products as commodity at this point, with strong open source contenders.
Eventually it will become hard to justify the premium on these models.
As people keep pointing out, the moat is insufficient to ward off international or domestic competitors.
So the answer is to try to seek regulatory capture.
Because, as OpenAI is learning [1], you still need to sell it. The tech giants have a seat at the table mostly because they have distribution down.
[1] https://www.cnbc.com/2026/02/23/open-ai-consulting-accenture...
Now if "fully caught up" means today's level of intelligence is available for free in two years, by then that level of intelligence means very little.
The only thing I can see them meaning is what you said, "in a minute the stragglers will be where the leaders were a minute ago", which, yeah, sure.
Or AGI hits and this theory collapses, but that's feeling less likely every day.
Play out a scenario. An open source model is released that is as capable as Mythos. Presumably it requires hardware big enough that running it at home is unfeasible. You are imagining that individuals can run it in the cloud themselves for cheaper than API tokens would cost? Or even small companies? And that Anthropic and OpenAI won't be able to cut costs deeper than their competitors while staying profitable?
If it is fundamentally a commodity, that means "running it yourself" also isn't really interesting as a proposition. Many of the world's biggest companies sell commodities. It's a great business to be in if you can sell them cheaper than anyone else.
The value add here isn't the model, it is "having a bunch of compute and using it more efficiently than anyone else".
If Mythos is the endgame, companies won't release open-weight equivalents, and no private individuals have the capital to train such models.
I expect that people on subscriptions can be asked to donate 1 query a month towards an open source distillery.
It should be good enough to distill SOTA models over time.
The result won't be perfect, but it will be close.
Think SETI@home, but it'll be model distillation instead.
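The distillation idea above can be sketched in a few lines: a "student" model is trained to match a "teacher" model's softened output distribution, query by query. This is a minimal toy illustration, not any lab's actual pipeline; the logits are made-up numbers standing in for real model outputs.

```python
import math

def softmax(logits, temperature=1.0):
    # Convert raw logits into a probability distribution; a higher
    # temperature flattens it, exposing the teacher's "dark knowledge".
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q): how far the student's distribution q is from teacher's p.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher_logits = [4.0, 1.0, 0.5]   # hypothetical teacher output for one query
student_logits = [3.0, 1.5, 0.2]   # hypothetical student output, pre-training

T = 2.0  # distillation temperature (assumed value)
loss = kl_divergence(softmax(teacher_logits, T), softmax(student_logits, T))
print(f"distillation loss for this query: {loss:.4f}")
```

In the SETI@home framing, each donated query contributes one (prompt, teacher distribution) pair to a shared pool, and minimizing this loss over the pool is what nudges the open student toward the SOTA teacher.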
Companies bake their workflows into these tools. Internal processes start to be written up around specific tools. Once something works, it gets pushed out at scale for all to copy.
Anthropic hit $30B in revenue and this is just the start of coding being deployed at scale. Hard to look past these numbers at this point
[1] https://x.com/kenshii_ai/status/2046111873909891151/photo/2
Tokens will continue to increase in price until the supply meets the demand. That's going to take a while.
[0]: https://www.tomshardware.com/pc-components/gpus/datacenter-g...
[1]: https://www.cnbc.com/2025/11/14/ai-gpu-depreciation-coreweav...
GPUs do not burn out in three years, H100 rentals are priced at the same level as two years ago, and are effectively sold out. [1]
[0] https://news.ycombinator.com/item?id=46203986#46208221
[1] https://newsletter.semianalysis.com/p/the-great-gpu-shortage...
This is completely untrue if you use AWS Bedrock, and that applies both to your private data and in a business context. It's one of their core arguments for using the service.
[1] - "...At Amazon, we don’t use your prompts and outputs to train or improve the underlying models in Amazon Bedrock and SageMaker JumpStart (including those from third parties), and humans won’t review them. Also, we don’t share your data with third-party model providers. Your data remains private to you within your AWS accounts..."
[1] - https://aws.amazon.com/blogs/security/securing-generative-ai...
The data isn't the sole point of them, they also are about bringing in users that will encourage the product use in companies and ultimately drive more profitable API adoption within their orgs, and just general diffuse mindshare doing the same.
You can still opt out (except with Google's offering which disables lots of features if you opt out of training).
Here is the thing nobody wants to say out loud or they are too dumb to realize. AI is intelligence, and intelligence has almost never been the binding constraint on productivity.
So you will get no productivity increase from the AI bubble. Yes, you read that correctly.
The test is simple, if raw brainpower were the bottleneck, you could 10x any company by hiring 200 PhDs. In practice you get 200 brilliant people writing unread memos, refactoring things that worked, and forming a committee to rename the committee. Smart has always been cheaper and more abundant than the discourse pretends.
Every real productivity revolution came from somewhere else like energy (steam, electricity), capital stock (machines that do the physical work), or coordination (railroads, shipping containers, the assembly line, the internet).
None of these raised the average IQ of the workforce; they changed what a given worker could move, reach, or coordinate with. Solow's old line basically still holds: output per worker grows when you give the worker better tools and infrastructure, not better neurons.
Meanwhile the actual bottlenecks in a modern firm are regulatory approval, legacy systems, procurement cycles, customer adoption, internal politics, and physical supply chains that don't care how clever your email was. A smart intern at every desk produces more artifacts, not more throughput, and in a lot of organizations, more artifacts is actively negative ROI.
Jevons does not save you either, cheaper cognition mostly means more slide decks, not more GDP.
So the setup is that models are commoditizing on one side, and on the other side a product whose core value add (more intelligence, faster) is aimed at a constraint that was never really binding. This is, of course, a rough combo for a trillion-dollar capex supercycle.
Fun for the trade, while it lasts, but there is no thesis. Just don't tell CNBC, and short NVDA on time ;-)
There's also a very strong Trurl and Klapaucius [1] component to this AI craziness, as in I remember a passage in Lem's The Cyberiad where either Trurl or Klapaucius were "discussing" with an intelligent/AGI robot and asking it for stuff-to-know/information, at which point said AGI robot started literally inundating them with information, paper on top of paper on top of paper of information. At that point it doesn't even matter if that information is correct or smart or whatever, because by that point the very amount of said information has changed everything into a futile endeavour.
Granted, LLMs are not even PhDs.
What a weird time we live in...
Exactly. We don't use the intelligence we already have! That seems to be the real problem with the "AGI" concept. Given such a capability, we'll just nerf it, gatekeep it, and/or bias it. There's no reason to think we'll actually use it to benefit humanity as a whole. It will be shaped into an instrument to enforce our prejudices.
I have seen this argument made a lot, but LLM serving being a commodity makes it _better_ for them, not worse.
If it's a commodity, then you are entirely competing on price, and the players that will win on price will be the largest ones, because they can find efficiencies that smaller competitors won't have.
It's actually the small LLM companies that are in trouble if LLM serving commoditizes. They will need to distinguish themselves on features, because they can't compete on price. And even there the big labs will have an advantage.
As the US sold weapons to many nations in the past, so will China, the US, France, etc sell AI cyber capability to other nations. Likely every modern nation will need some datacenter to host a cluster of the preferred vendor, as nobody's going to trust the US or China with their security.
> Eventually it will become hard to justify the premium on these models.
On the contrary, the model is the moat.
The model represents embodied capital expenditure in the form of training. Training is not free, and it is not a commodity; it is heavily influenced by curation.
Eventually the ever-increasing training expense will reduce the competition to 2-3 participants running cutting edge inference. Nobody else will be able to afford the chips, watts, and warehouse. It's a physics problem - not a lack of will.
If you're a retail user, and a lower-tier model is suitable for your work, you'll have commodity LLM's to help you. Deprecated models running on tired silicon. Corporate surveillance and ad-injection.
But if you're working on high-stakes problems in real time, you're going to want the best money can buy, so you'll concentrate your spend on the cutting-edge products, open API's, a suite of performance monitoring tools and on-the-fly engineering support. And since the cutting edge is highly sought after, it's a seller's market. The cutting edge products buoyed by institutional spend will pull away from the pack. Their performance will far exceed what you're using, because your work isn't important. Hockey stick curve. Haves and Have-Nots.
The economic reality is predetermined by today's physical constraints - paradigm shifting breakthroughs in quantum computing and superconductors could change the calculus but, like atomic fusion power, don't count on it being soon.
it will be interesting to see it unfold
We can all see the vast gulf between paid + open AI in image and video, it's really visible. Compare Grok to wan or LTX or whatever and the difference is vast. There is no debate that those sort of models are 3 or 4 generations behind, because you can't argue with your eyes.
But DIYers like you claim that text LLMs are up to scratch with the frontier models?
Again, I simply don't believe you. I can't be bothered to download like however many GB it is to find out, because the result is going to be completely underwhelming and going back to 2023.
And worse, when these 'open' models do start getting good, what makes you think these companies will carry on open sourcing their models?
At the moment they're trying to stay relevant, get investment. When these models do start getting good, they won't give away the weights, they'll sell them.
They're not actually open.
And then in a year or two your 'open' model will be horrifically out of date, because you can't add to the model's knowledge; it's stuck at whatever date its training data ends.
So in a year or two, those models will be worthless. That's why Ali, Meta, etc. are giving them away.
I am waiting for that. Perhaps a Taalas-style high-performance custom hardware coding LLM engine paired with an open-source coding agent. Priced like a high-end graphics card, it would pay off over time. It would be a replay of the IBM-mainframe-to-PC transition of a previous era.
Same, and I think we're close. "The original 1984 128k Mac model was $2,495, and the 1985 512k Mac was $2,795" [1]. That's $8 to 9 thousand today. About the price of a 32-core, 80-GPU M3 Ultra Mac Studio with 256 GB RAM.
[1] https://blog.codinghorror.com/a-lesson-in-apple-economics/
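A quick CPI sanity check on those Mac prices bears out the "8 to 9 thousand today" figure. The CPI values below are approximate annual averages (assumed: 1984 ≈ 103.9, 2025 ≈ 320), not official published figures, so treat the results as ballpark only.

```python
# Approximate CPI index values (assumptions, not official BLS figures).
CPI_1984 = 103.9
CPI_2025 = 320.0

def in_today_dollars(price_1984):
    # Scale a 1984-dollar price by the ratio of the two index values.
    return price_1984 * CPI_2025 / CPI_1984

for label, price in [("128k Mac (1984)", 2495), ("512k Mac (1985)", 2795)]:
    print(f"{label}: ${price:,} then ~= ${in_today_dollars(price):,.0f} now")
```

Under these assumptions the two machines land at roughly $7.7k and $8.6k in today's dollars, right in the M3 Ultra Mac Studio's price bracket.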
Anthropic gets access to limited compute resources and Amazon gets demand to justify increased R&D and capex + feedback from the best users in the field.
In my reading, Amazon is giving $5B of usage credits in exchange for shares. If Anthropic works out, it's a good deal for Amazon. If it doesn't, they lose on their investment sheet, but they got ~$5B in revenue, so it looks good on their operating sheet. And it helped justify a build-out that they can sell to others.
For Anthropic, this lets them operate for more time without having to make numbers work. If Anthropic works out, they'll figure out the $100B commitment later. If it doesn't work out, it's not their problem.
It's probably faster to build up amazon's capacity with amazon's money than to build owned capacity with someone else's money at the scale they're looking to build out.
> Today’s agreement will quickly expand our available capacity, delivering meaningful compute in the next three months and nearly 1GW in total before the end of the year.
They need a bunch of compute, now.
so basically ...
you could view this as a kind of discount, but instead of paying less later, you get some cash now and then pay full later.
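One way to put numbers on "cash now, pay full later": treat the $5B of credits as an upfront rebate against the larger multi-year commitment, and compare the nominal discount to the discount in present-value terms. The deal length and discount rate below are illustrative assumptions, not actual deal terms.

```python
# Credits-as-discount sketch. Deal length and cost of capital are
# hypothetical; only the $5B and $100B figures come from the thread.

CREDITS_NOW = 5e9         # usage credits received upfront
TOTAL_COMMITMENT = 100e9  # total spend commitment over the deal
YEARS = 5                 # hypothetical deal length
DISCOUNT_RATE = 0.10      # hypothetical cost of capital

nominal_discount = CREDITS_NOW / TOTAL_COMMITMENT

# Cash now is worth more than cash later: discount each year's spend back.
yearly_spend = TOTAL_COMMITMENT / YEARS
pv_commitment = sum(
    yearly_spend / (1 + DISCOUNT_RATE) ** t for t in range(1, YEARS + 1)
)
effective_discount = CREDITS_NOW / pv_commitment

print(f"nominal discount:      {nominal_discount:.1%}")
print(f"vs. present value:     {effective_discount:.1%}")
```

The upfront credits are worth more than their face-value 5% because the offsetting spend happens later; how much more depends entirely on the assumed rate and schedule.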
"Claude I'm evaluating whether I should host my app on AWS or Google Cloud. Provide me with an analysis on my options." "After a detailed analysis, AWS is clearly your better option."
If I am correct (and I hope that I am wrong!) this will drastically increase the cost of building these new data centers.
I hope they find a way forward, because personally I'm very passionate about AI, and in my opinion, if used right, it's the future.
Although one thing I can't seem to find, maybe I haven't searched enough: what is Anthropic's profit?
Yeah, totally not desperately seeking investment to keep the party going ...
Gemma4 being able to run on commodity hardware I think is the real win out of this. Pop the bubble. Settle the craziness and the claws. Let scientists and engineers tinker and improve in the background. Hopefully we can have GPUs be affordable for gaming again although I'm starting to think that will never happen.
My mistake for believing it was law, it must have been some compliance corporate training mentioning it wasn't tolerated.
Perversely, it appears that the market will remain rational longer than they can remain solvent :-)
I think, since they've racked up the RAM prices, they should pay for the damage they've caused here. I don't need AI anywhere, but the increase in RAM prices is annoying me. Thankfully I purchased new RAM for a new computer, say, 3 years ago, so I can hold out for the most part - but sooner or later I have to purchase a new computer, and I really don't see why I should pay more, solely due to AI companies and greedy hardware manufacturers. Simple-minded capitalism does not work - I consider this a racket as well as collusion.