Source report:

“DeepSeek Debates: Chinese Leadership On Cost, True Training Cost, Closed Model Margin Impacts” by SemiAnalysis

DeepSeek took the world by storm. For the last week, DeepSeek has been the only topic that anyone in the world wants to talk about. As it currently stands, DeepSeek's daily traffic is now much higher than that of Claude, Perplexity, and even Gemini.

But to close watchers of the space, this is not exactly “new” news. We have been talking about DeepSeek for months (each link is an example). The company is not new, but the obsessive hype is. SemiAnalysis has long maintained that DeepSeek is extremely talented and the broader public in the United States has not cared. When the world finally paid attention, it did so in an obsessive hype that doesn’t reflect reality.

We want to highlight that the narrative has flipped from last month: then, the claim was that scaling laws were broken (we dispelled this myth); now the claim is that algorithmic improvement is too fast, and this too is somehow bad for Nvidia and GPUs.

  • Alphane Moon (OP, mod) · 22 hours ago

    While the full report requires a subscription, they do have a section titled “DeepSeek subsidized inference margins”.

    This is from the intro to that section:

    MLA is a key innovation responsible for a significant reduction in the inference price for DeepSeek. The reason is that MLA reduces the amount of KV Cache required per query by about 93.3% versus standard attention. KV Cache is a memory mechanism in transformer models that stores data representing the context of the conversation, reducing unnecessary computation.

    As discussed in our scaling laws article, KV Cache grows as the context of a conversation grows, and creates considerable memory constraints. Drastically decreasing the amount of KV Cache required per query decreases the amount of hardware needed per query, which decreases the cost. However we think DeepSeek is providing inference at cost to gain market share, and not actually making any money. Google Gemini Flash 2 Thinking remains cheaper, and Google is unlikely to be offering that at cost. MLA specifically caught the eyes of many leading US labs. MLA was introduced in DeepSeek V2, released in May 2024. DeepSeek has also enjoyed more efficiencies for inference workloads with the H20, due to its higher memory bandwidth and capacity compared to the H100. They have also announced partnerships with Huawei, but very little has been done with Ascend compute so far.

    It seems that at least some LLM models from Google offer lower inference cost (while likely not being subsidized).
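
    As a rough sanity check on the KV Cache claim quoted above: MLA (Multi-head Latent Attention) caches one small compressed latent per layer per token instead of full per-head keys and values. The sketch below uses illustrative, roughly DeepSeek-V2-like configuration values that are my own assumptions, not numbers from the report, so take the exact percentage loosely; with these assumptions the reduction actually comes out larger than the quoted 93.3%, and the precise figure depends on the baseline model configuration.

    ```python
    # Back-of-the-envelope KV cache sizing: standard multi-head attention vs. MLA.
    # All config numbers are illustrative assumptions (roughly DeepSeek-V2-like),
    # not figures from the report; the exact reduction depends on the model.

    BYTES_PER_VALUE = 2      # FP16/BF16 cache entries
    N_LAYERS = 60            # transformer layers (assumed)
    N_HEADS = 128            # attention heads (assumed)
    HEAD_DIM = 128           # per-head key/value dimension (assumed)
    KV_LATENT_DIM = 512      # MLA compressed KV latent per layer per token (assumed)
    ROPE_KEY_DIM = 64        # decoupled RoPE key cached alongside the latent (assumed)


    def mha_kv_bytes_per_token() -> int:
        # Standard attention caches a full key and a full value vector
        # for every head in every layer.
        return 2 * N_LAYERS * N_HEADS * HEAD_DIM * BYTES_PER_VALUE


    def mla_kv_bytes_per_token() -> int:
        # MLA caches one low-rank latent (plus a small decoupled RoPE key)
        # per layer; keys and values are reconstructed from it at compute time.
        return N_LAYERS * (KV_LATENT_DIM + ROPE_KEY_DIM) * BYTES_PER_VALUE


    if __name__ == "__main__":
        mha = mha_kv_bytes_per_token()
        mla = mla_kv_bytes_per_token()
        context = 128_000  # tokens of conversation history
        print(f"MHA: {mha:,} B/token -> {mha * context / 2**30:,.1f} GiB at {context:,} tokens")
        print(f"MLA: {mla:,} B/token -> {mla * context / 2**30:,.1f} GiB at {context:,} tokens")
        print(f"Cache reduction: {1 - mla / mha:.1%}")
    ```

    The per-query memory saving is the mechanism the report is pointing at: a smaller KV Cache per conversation means more concurrent requests fit on the same hardware, which is what lowers the serving cost, whether or not the price charged is subsidized on top of that.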

    • @anyhow2503 · 21 hours ago

      However we think

      The times when I trusted what tomshardware thinks are long gone.

        • @[email protected] · 19 hours ago

          All of the writers are long on NVDA; I don’t trust any analysis that doesn’t start by disclosing that conflict of interest.

          Also, this is time of my life I’ll never get back. They literally use SemiAnalysis as the source for their references. Where are the outside references? There’s a reason self-citation is frowned upon in science. Massive sour grapes energy from NVDA holders.

          • Alphane Moon (OP, mod) · 19 hours ago (edited)

            I am not making any judgment call regarding SemiAnalysis or the validity of their report. I did say "It seems that at least some LLM models from Google offer lower inference cost (while likely not being subsidized)."