Artificial Intelligence thread

Eventine

Senior Member
Registered Member
This again? The bench that claims DS cost $4 for 50k tokens? Yeah we know western models are desperate to keep their scam up, but they could use more effort when faking their benches.

Or maybe they used their own models to fake the bench, which would make it even more ironic.

You should read the update; he made a mistake in his evaluation of the benchmark. DeepSeek is unlikely to be much better than 10%. It is a fact that Western frontier models are superior on difficult tasks from a raw capability perspective. The question is whether the cost of running them is worth the value delivered, since the vast majority of productivity gains don't come from the hardest problems, and productivity gains don't necessarily translate into higher $$$ in the software industry either.

There is an impending crisis in the agentic AI industry, but it is not because Chinese frontier models are challenging Western frontier models on raw capability. It is because Western hyperscalers and their investors are being absolutely taken for a ride by hardware makers, and the return on raw capability is rapidly diminishing. Expect to see a shift, soon, to more "efficient" and "cost-effective" models on the Western side as well; then we will see if China can truly squeeze them out of the game.
 

iewgnem

Captain
Registered Member
You should read the update; he made a mistake in his evaluation of the benchmark. DeepSeek is unlikely to be much better than 10%. It is a fact that Western frontier models are superior on difficult tasks from a raw capability perspective. The question is whether the cost of running them is worth the value delivered, since the vast majority of productivity gains don't come from the hardest problems, and productivity gains don't necessarily translate into higher $$$ in the software industry either.

There is an impending crisis in the agentic AI industry, but it is not because Chinese frontier models are challenging Western frontier models on raw capability. It is because Western hyperscalers and their investors are being absolutely taken for a ride by hardware makers, and the return on raw capability is rapidly diminishing. Expect to see a shift, soon, to more "efficient" and "cost-effective" models on the Western side as well; then we will see if China can truly squeeze them out of the game.
The update does not change the fact that the bench set the thinking level to null and faked the pricing metric, which discredits the entire exercise.

I won't dispute that DS isn't the most capable model; I use Kimi when that's needed.

But the idea that Western models are more capable on "difficult tasks" is an entirely made-up scenario. The nature of "difficult tasks" (which are mostly just one-shot tasks) is that, being difficult, their results are also difficult to check or fix manually, which means anything short of a 100% success rate becomes practically useless, and they're not at 100%.

Actually solving difficult tasks requires the ability to work step by step, iterate, test, use tools, and fix problems, all of which Western models are inferior at on a constant budget.

I know this directly because I do use them to solve difficult tasks involving large systems and I've been able to do so far better than if I were still capped by Claude usage.

The inability to actually solve difficult tasks outside of benchmarks, where you can immediately evaluate the outcome and all you care about is a % success rate, is why Western models are failing in enterprise use.
 

HighGround

Senior Member
Registered Member
There is an impending crisis in the agentic AI industry, but it is not because Chinese frontier models are challenging Western frontier models on raw capability. It is because Western hyperscalers and their investors are being absolutely taken for a ride by hardware makers, and the return on raw capability is rapidly diminishing. Expect to see a shift, soon, to more "efficient" and "cost-effective" models on the Western side as well; then we will see if China can truly squeeze them out of the game.
Nobody is forcing them to pay these exorbitant prices. Fact of the matter is, Western AI labs wanted more compute, as opposed to learning to do more with less. Which is fine, but that's what begets a bidding war. The underlying evil is that it's not even their money. Anthropic, OpenAI, Meta, and xAI all decided to pour gasoline on the pile of cash gifted to them by investors, all in the hopes of reaching the mythical AGI point that will, somehow, deliver them to the money printer.

At least Meta uses its own money, but the others all spend other people's money. Note that Google, Amazon, and Microsoft were all much more conservative and were actively pursuing cost reduction.

Anyway. Yeah people need to give it a rest. More compute and more tokens will produce better AI outputs. However, the Chinese models are actually viable as businesses. They will generate returns on CAPEX far faster than Anthropic or OpenAI. I have no idea how or why investors justified signing over money to a company that wants to spend a trillion dollars in CAPEX over the next few years. The only viable business strategy here is to simply sell your shares to some other sucker before the bubble collapses into itself.

AI uses all the same tactics that the crypto/NFT bubble did a few years ago. It's a reality-warping bubble over there.
 