The Extreme Cost of Training AI Models.

101 · 3 months ago

@breadsmasher · 3 months ago

The source didn’t have this detail - google training gemini “cloud” vs “own hardware”. Does Google Cloud not count as “own hardware” for google?

bjorney · 3 months ago

Does Google Cloud not count as “own hardware” for google?

That’s why the bars are so different. The “cloud” price is MSRP

@[email protected] · edit-2 3 months ago

This is an accounting trick as well, a way to shed profit, and maximize deductions, by having different units within a parent company purchase services from each other.

I realize that my sentence long explainer doesn’t shed any light on how it gets done, but funnily enough, you can ask an LLM for an explainer and I bet it’d give a mostly accurate response.

Edit: Fuck it, I asked an LLM myself and just converted my first sentence into a prompt, by asking what that was called, and how it’s done. Here’s the reply:

This practice is commonly referred to as “transfer pricing.” Transfer pricing involves the pricing of goods, services, and intangible assets that are transferred between related parties, such as a parent company and its subsidiaries.

Transfer pricing can be used to shift profits from one subsidiary to another, often to minimize taxes or maximize deductions. This can be done by setting prices for goods and services that are not at arm’s length, meaning they are not the same prices that would be charged to unrelated parties.

For example, a parent company might have a subsidiary in a low-tax country purchase goods from another subsidiary in a high-tax country at an artificially low price. This would reduce the profits of the high-tax subsidiary and increase the profits of the low-tax subsidiary, resulting in lower overall taxes.

However, it’s worth noting that transfer pricing must be done in accordance with the arm’s length principle, which requires that the prices charged between related parties be the same as those that would be charged to unrelated parties. Many countries have laws and regulations in place to prevent abusive transfer pricing practices and ensure that companies pay their fair share of taxes.
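The example in the reply above can be put into numbers. This is only a rough sketch: the function name, prices, unit counts, and tax rates are all hypothetical, picked to show the mechanism rather than to describe any real company.

```python
# All figures below are hypothetical, chosen only to illustrate the mechanism.

def group_tax(transfer_price, units=1_000, unit_cost=60.0, resale_price=100.0,
              high_tax_rate=0.30, low_tax_rate=0.10):
    """Total tax paid by a two-subsidiary group for a given internal price.

    The high-tax subsidiary makes the goods at `unit_cost` and sells them to
    the low-tax subsidiary at `transfer_price`. The low-tax subsidiary then
    resells them to outside customers at `resale_price`.
    """
    high_tax_profit = (transfer_price - unit_cost) * units
    low_tax_profit = (resale_price - transfer_price) * units
    return high_tax_profit * high_tax_rate + low_tax_profit * low_tax_rate


# A price an unrelated buyer might plausibly pay (assumed).
arms_length = group_tax(transfer_price=90.0)
# An artificially low internal price (assumed).
shifted = group_tax(transfer_price=65.0)

print(f"Tax at arm's-length price: {arms_length:,.0f}")
print(f"Tax at shifted price:      {shifted:,.0f}")
```

With these placeholder numbers the group's combined pre-tax profit is the same in both cases. Only the place where it is booked changes, which is exactly what the arm's length principle is meant to police.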

@General_Effort · 3 months ago

From the source:

Our primary approach calculates training costs based on hardware depreciation and energy consumption over the duration of model training. Hardware costs include AI accelerator chips (GPUs or TPUs), servers, and interconnection hardware. We use either disclosures from the developer or credible third-party reporting to identify or estimate the hardware type and quantity and training run duration for a given model. We also estimate the energy consumption of the hardware during the final training run of each model.

As an alternative approach, we also calculate the cost to train these models in the cloud using rented hardware. This method is very simple to calculate because cloud providers charge a flat rate per chip-hour, and energy and interconnection costs are factored into the prices. However, it overestimates the cost of many frontier models, which are often trained on hardware owned by the developer rather than on rented cloud hardware.

https://epochai.org/blog/how-much-does-it-cost-to-train-frontier-ai-models
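The two approaches quoted above can be sketched roughly as follows. Everything here is an assumption for illustration: the function names, the straight-line depreciation schedule, the overhead factor for servers and interconnect, and every number in the example call. The report itself may use a different depreciation method and different inputs.

```python
# Every number here is a made-up placeholder, not a figure from the report.

def owned_hardware_cost(chips, chip_price, overhead_factor, lifetime_years,
                        training_days, power_kw_per_chip, price_per_kwh):
    """Amortized hardware cost plus energy for one training run."""
    hours = training_days * 24
    # Chips plus servers and interconnect, folded into one multiplier (assumed).
    capex = chips * chip_price * overhead_factor
    # Straight-line depreciation (a simplification): charge only the share of
    # the hardware's lifetime that the training run occupies.
    depreciation = capex * (training_days / (lifetime_years * 365))
    energy = chips * power_kw_per_chip * hours * price_per_kwh
    return depreciation + energy


def cloud_cost(chips, training_days, rate_per_chip_hour):
    """Flat rental rate per chip-hour; energy and networking are bundled in."""
    return chips * training_days * 24 * rate_per_chip_hour


owned = owned_hardware_cost(chips=10_000, chip_price=20_000, overhead_factor=1.5,
                            lifetime_years=4, training_days=90,
                            power_kw_per_chip=1.0, price_per_kwh=0.10)
cloud = cloud_cost(chips=10_000, training_days=90, rate_per_chip_hour=2.0)

print(f"Owned hardware estimate: ${owned:,.0f}")
print(f"Cloud rental estimate:   ${cloud:,.0f}")
```

With these placeholder inputs the cloud figure comes out higher, because the rental rate has to cover the provider's margin and the full hardware cost, while the owned-hardware figure only charges for the slice of the hardware's life the run actually uses.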

mox · 3 months ago

I don’t care how they estimate their cost in dollars. I think the cost to all of us in environmental impact would be more interesting.

@UnderpantsWeevil · edit-2 3 months ago

Unless they’re finding exciting new and efficient ways to generate electricity, I imagine it’s a linear comparison. Maybe some are worse than others. I know Grok’s datacenter in Memphis, Tennessee is relying exclusively on portable gas-powered electric generators that are wrecking havoc on the local environment.

downhomechunk · 3 months ago

Gas like natural gas? Or gas like gasoline? I’m sure it’s the former, but I take nothing for granted anymore.

@UnderpantsWeevil · 3 months ago

https://www.npr.org/2024/09/11/nx-s1-5088134/elon-musk-ai-xai-supercomputer-memphis-pollution

Methane gas engines

@[email protected] · 3 months ago

Methane gas isn’t a fossil fuel though, and I believe it’s actually better for the environment to burn it than simply release it, at least as far as global warming goes.

@UnderpantsWeevil · 3 months ago

Methane gas isn’t a fossil fuel though

It’s a primary byproduct of Y-Grade gas during fractionation. But it is also less energy dense than your pricier fuels, and lighter. If you’re not using good compression you might as well be venting the fuel as fast as you burn it.

@[email protected] · 3 months ago

Is it? I thought they were burning landfill or swamp gas.

@UnderpantsWeevil · 3 months ago

You can get it there, too, but when it’s already mixed with air you’re forced to do the math of how much energy is in the methane versus how much it costs to distill out of the nitrogen and oxygen.

mox · 3 months ago

I didn’t know that; thanks for sharing.

(BTW, I think you meant wreaking havoc.)

@UnderpantsWeevil · 3 months ago

All my misspellings are part of my charm.

@linearchaos · 3 months ago

Maybe this is the push we need to switch to nuclear. The tech is good; it just needs somebody with deeper pockets than coal/gas to lobby for it.

@UnderpantsWeevil · 3 months ago

Microsoft is trying to restart Three Mile Island. But that’s a very old facility. I don’t see too much interest in building new ones.

@linearchaos · 3 months ago

Kind of. Microsoft is offering to buy the electricity and put jobs and data centers nearby, the state is reactivating the site.

If more AI companies dedicate to buying vast amounts of electricity, there’s money and jobs in it

But if AI companies start creating concentrated demand, it won’t take people with deep pockets long to figure out how to stand up some small-scale, high-output plants.

@UnderpantsWeevil · 3 months ago

If more AI companies dedicate to buying vast amounts of electricity, there’s money and jobs in it

Google the history of the Vogtle 3 and 4 reactors in Georgia. I don’t think tech firms have 16 years to invest in new energy plants.

@[email protected] · 3 months ago

Honestly you can thank decades of anti-nuclear lobbying

@UnderpantsWeevil · 3 months ago

More the plunge in O&G prices during the 1980s. Coal, oil, and natural gas got incredibly cheap under Reagan after the US cut sweetheart deals with the Saudis. Nuclear has huge upfront development costs, while oil, gas, and coal are very cheap to start up and run incredibly high margins.

Lobbying and activism had very little impact, as evidenced by the campaigns against coal waste and gas flaring and strip mining that all fell flat.

@[email protected] · 3 months ago

I want to see what the long term economic cost was after they fired tens of thousands of tech workers hoping to replace us with AI. It feels like workers are always the ones who suffer the most under capitalism.

@linearchaos · 3 months ago

They’ll fire more than that when the AI bubble busts and they stop pushing so hard into that development as it stagnates.

@[email protected] · 3 months ago

It depends. If they fire them and AI can’t actually do the job, then it would suck.

If they are fired and the ai can do it, then it’s great, it’s like having that many new people.

@[email protected] · 3 months ago

Assume it is equivalent to burning $200 million of gasoline

@[email protected] · 3 months ago

Considering the hype and publicity GPT-4 produced, I don’t think this is actually a crazy amount of money to spend.

oce 🐆 · edit-2 3 months ago

Yeah, I’m surprised at how low that is. A software engineer in a developed country costs about 100k USD per year.
So 40M USD for training ChatGPT 4 is the cost of 400 engineers for one year.
They say cost of salaries could make up to 50% of the total, so the total cost is 800 engineers for one year.
That doesn’t seem extreme.
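The back-of-the-envelope math above can be checked in a few lines. This is just a sketch of the comment's own figures (40M USD training cost, 100k USD per engineer-year, salaries as half of the total); the variable and function names are mine.

```python
def engineer_years(cost_usd, cost_per_engineer_year=100_000):
    """Convert a dollar amount into engineer-years at a flat yearly cost."""
    return cost_usd / cost_per_engineer_year


training_cost = 40_000_000   # compute-only estimate quoted in the comment
salary_share = 0.5           # "up to 50% of the total"

# If salaries are half of the total, compute is the other half.
total_cost = training_cost / (1 - salary_share)

print(engineer_years(training_cost))  # 400.0
print(engineer_years(total_cost))     # 800.0
```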

my_hat_stinks · 3 months ago

100k USD per engineer assumes they’re exclusively hiring from US and Switzerland, that’s not a general “developed country” thing. US is an outlier.

@[email protected] · 3 months ago

US and Switzerland are way over 100k. For Netherlands and Germany 100k is a good approximation for the company costs for a senior SWE.

my_hat_stinks · edit-2 3 months ago

I did already back up the claim with a source, but okay:

US: Senior 128k USD, mid-level 94k USD
CH: Senior 118k CHF (~139k USD), mid-level 95k CHF (~112k USD)
DE: Senior 72k EUR (~80k USD), mid-level 58k EUR (~65k USD)
NL: Senior 69k EUR (~77k USD), mid-level 52k EUR (~58k USD)

Yes, US and Switzerland are outliers.

@[email protected] · 3 months ago

Yeah, 80k gross for the worker creates close to 100k costs for the employer.

oce 🐆 · 3 months ago

I’m talking about the cost of the engineer for the company, not the salary, which is less relevant here. In some EU countries, the salaries may be lower, but the taxes are higher to pay for the social system, so the cost for the company is similar.

@General_Effort · 3 months ago

Yes. Also, Europeans work fewer hours per year. There are big differences between EU countries, though. https://en.wikipedia.org/wiki/List_of_countries_by_average_annual_labor_hours

@jacksilver · 3 months ago

This is just the estimate to train the model, so it’s not accounting for the cost to develop the training system, collect the data, etc. This is pure processing cost, and the numbers are staggeringly large.

@[email protected] · 3 months ago

Comparatively speaking, a lot less hype than their earlier models produced. Hardcore techies care about incremental improvements, but the average user does not. If you try to describe to the average user what is “new” about GPT-4, other than “It fucks up less”, you’ve basically got nothing.

And it’s going to carry on like this. New models are going to get exponentially more expensive to train, while producing less and less consumer interest each time, because “Holy crap look at this brand new technology” will always be more exciting than “In our comparative testing version 7 is 9.6% more accurate than version 6.”

And for all the hype, the actual revenue just isn’t there. OpenAI are bleeding around $5-10bn (yes, with a b) per year. They’re currently trying to raise around $11bn in new funding just to keep the lights on. It costs far more to operate these models (even at the steeply discounted compute costs Microsoft are giving them) than anyone is actually willing to pay to use them. Corporate clients don’t find them reliable or adaptable enough to actually replace human employees, and regular consumers think they’re cool, but in a “nice to have” kind of way. They’re not essential enough a product to pay big money for, but they can only be run profitably by charging big money.

@[email protected] · 3 months ago

The latest release, ChatGPT 4o, costs $600/hr per instance to run, based on the discussion I could find about it.

If OpenAI is running 1k of those instances to service the demand (they’re certainly running more, since queries can take 30+ seconds), then that’s roughly $5bn/yr just keeping the lights on.
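Taking the two figures in this comment at face value ($600 per hour per instance, 1,000 instances), the yearly bill is a one-line multiplication. A quick sketch, where running around the clock all year is my assumption and the names are made up:

```python
HOURS_PER_YEAR = 24 * 365  # assumes instances run around the clock all year


def annual_cost(cost_per_instance_hour, instances, hours=HOURS_PER_YEAR):
    """Yearly operating cost for a fleet billed per instance-hour."""
    return cost_per_instance_hour * instances * hours


yearly = annual_cost(600, 1_000)
print(f"${yearly:,}")  # $5,256,000,000
```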

@[email protected] · 3 months ago

That’s a lot, but what’s their revenue?

@[email protected] · 3 months ago

3.4bn is their gross - we have no idea what their operating costs are since they refuse to share them.

Some estimates say they’re burning 8 billion a year.

@linearchaos · 3 months ago

How in the hell is Gemini both two and a half times more expensive and vastly inferior to GPT?

@ZILtoid1991 · 3 months ago

Some claim it’s because it was trained on too much data with too little intervention

@postmateDumbass · 3 months ago

Maybe we do not understand what its objective function actually wants?

Maybe it is impeding its users intentionally.

@PixeIOrange · 3 months ago

Google sucks

KillingTimeItself · 3 months ago

bro who the fuck is google paying to do cloud compute for them? Google cloud??

@ripcord · edit-2 3 months ago

I assume they’ve come up with some generic cost if someone was training each model using cloud compute.

Edit: below comments confirm this, from the source.

KillingTimeItself · 3 months ago

god i love accounting, it’s so much fun.

@ripcord · 3 months ago

But this isn’t accounting, this is just the way the study calculated stuff.

@postmateDumbass · 3 months ago

Let’s make our model sound cooler by paying high rates to ourselves!

@ripcord · edit-2 3 months ago

Man you and the other dude are trying way too hard to be outraged about something that doesn’t exist here.

This isn’t data that Google, etc. claimed. The study is attempting to represent what they believe the financial cost to train these models would have been.

@Wispy2891 · 3 months ago

It’s obvious that Google didn’t pay the crazy AWS prices to train Gemini, seeing how many servers they have in GCP.

Do they mean that they used creative accounting to pay themselves crazy GCP usage bills to deduct from taxes?

@FinishingDutch · 3 months ago

Geez, you’d think Gemini would be better than it is if they spent that much on it…

@kromem · 3 months ago

Base model =/= Corpo fine tune

@[email protected] · 3 months ago

and gemini is still hot ass

@[email protected] · 3 months ago

trueee

@pyre · 3 months ago

because this entire model of AI as an idea is garbage to begin with

@IndustryStandard · edit-2 3 months ago

Only 80 million dollars for gpt4? Cheaper than expected

@[email protected] · 3 months ago

The AI industry could stop right there, we won the jackpot already. They just need to stop while they’re ahead! It is very unlikely that we will have as much as 1/10 the leap we have already seen.

@hark · 3 months ago

Now imagine if they had to pay for the content they’re training the models off of.

@AbouBenAdhem · edit-2 3 months ago

How is Inflection-2 cheaper to train in the cloud than own hardware?

@General_Effort · 3 months ago

That probably indicates a problem with the estimates.

@[email protected] · 3 months ago

All that shit needs to be just shut down and not revisited again.

Todd Bonzalez · edit-2 9 days ago

deleted by creator

@[email protected] · 3 months ago

“It cost a lot, so it absolutely should be allowed!”

Is an even dumber excuse to keep it going.

@[email protected] · 3 months ago

Humanity: develops nuclear fusion

AI:

@IndustryStandard · 3 months ago

It’s like the south park “Now we can finally play the game” but for AI. First we get infinite energy and then we can train an AI to calculate how we can create infinite energy.

@DaddysLittleSlut · 3 months ago

We must consider the benefits of AI as such and how they can contribute to our lives. I can assure you that, while AI may seem like a game or a useless thing to others, it’s actually a useful tool that can help people understand complex concepts that most people have a hard time explaining, or won’t explain. Many more things too.

@[email protected] · 3 months ago

If we assume this is already as good as it’s going to get and we don’t throw another 7 trillion into that fire.

For 100 million, an open-source, open-weight release of GPT-4 into the public domain would have been a good deal, and preventing the enclosure of our intellectual commons would make the enterprise as a whole a worthwhile endeavor.

@_sideffect · 3 months ago

All a huge waste of money.

This isn’t ai.

It’s a “Smarter Search”

Pennomi · edit-2 3 months ago

AI is a broader term than you might realize. Historically even mundane algorithms like A* pathfinding were considered AI.

Turns out people like to constantly redefine artificial intelligence to “whatever a computer can’t quite do yet.”

@_sideffect · 3 months ago

No.

What I’m saying is what all these companies are presenting us is a smarter search.

It’s just a tighter grouping of (biased) data that can be searched and retrieved a bit quicker.

If they want to use the term ai, then hell, factory machines from the last century are ai too.

@smooth_tea · 3 months ago

It’s just a tighter grouping of (biased) data that can be searched and retrieved a bit quicker.

How is your intelligence different from being “biased data that can be accessed”?

The fact that something can reason about what it presents to you as information is a form of intelligence. And while this discussion is impossible without defining “reason”, I think we should at least agree that when a machine can explain to you what and why it did what it did, it is a form of reason.

Should we also not define what it means when a person answers a question through reasoning? It’s easy to overestimate the complexity of it because of our personal bias and our ability to fantasize about endless possibilities, but if you break our abilities down, they might be the result of nothing but a large dataset combined with a simple algorithm.

It’s easy to handwave the intelligence of an AI, not because it isn’t intelligent, but because it has no desires, and therefore doesn’t act unless acted upon. It is not easy to jibe that concept with the idea that something is alive, which is what we generally require before calling it intelligent.

@[email protected] · 3 months ago

It is AI though. AGI, which is a subcategory of AI and what many people seemingly imagine AI to mean, it’s not—but AI, yes.

SkaveRat · 3 months ago

I hope you complained all these years when games used “AI” for computer controlled enemies, because otherwise your post would be super awkward

@_sideffect · 3 months ago

Lmao, you have no idea what you’re saying.

Keep sucking up to these useless ai companies though, they love it!

SkaveRat · 3 months ago

sure, bud

@bitjunkie · 3 months ago

Something was needed, tradsearch has sucked dick at anything other than finding a wiki article for an extremely broad topic for over a decade. Just make electricity sustainably. 🤷‍♂️

@ZILtoid1991 · 3 months ago

Because it got enshittified, with SEO, ads, etc.

@[email protected] · 3 months ago

I also think that search algorithms work fine, as long as no one is actively trying to fill your results with trash.