You should read the update; he made a mistake in his evaluation of the benchmark. DeepSeek is unlikely to be much better than 10%. It is a fact that Western frontier models are superior on difficult tasks from a raw capability perspective; the question is whether the cost of running them is worth the value delivered, since the vast majority of productivity gains aren’t from the hardest problems, and productivity gains don’t necessarily translate to higher $$$ in the software industry either.
This again? The bench that claims DS cost $4 for 50k tokens? Yeah, we know Western models are desperate to keep their scam up, but they could put more effort into faking their benches.
Or maybe they used their own models to fake the bench, which would make it even more ironic.
There is an impending crisis in the agentic AI industry, but it is not because Chinese frontier models are challenging Western frontier models on raw capability. It is because Western hyperscalers and their investors are being absolutely taken for a ride by hardware makers, and the return on raw capability is rapidly diminishing. Expect to see a shift, soon, to more "efficient" and "cost-effective" models on the Western side as well; then we will see if China can truly squeeze them out of the game.