Artificial Intelligence thread

Michael90

Senior Member
Registered Member

Chinese models have now overtaken US ones on openrouter
I'm even surprised that US closed, paid models were so dominant compared to Chinese open-source, mostly free or cheap models. I don't get why so many people favored US models so much. After all, who wouldn't prefer something that's almost free over something you have to pay a lot for?
 

tokenanalyst

Lieutenant General
Registered Member
It's not that OpenRouter is free. Even a company running local models on its own AI servers is spending money on electricity and on expensive, depreciating hardware. The problem is that if the replacement costs more than the thing it replaces, there is no point in it. What companies seem to be doing is offloading tasks to less expensive models.

Most Chinese models are MoE and are built to run more efficiently, while US models are dense on purpose to keep an "IQ" edge over non-US MoE models. That edge comes with an increasing price tag that cannot be subsidized forever if these companies want to go public.
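
To make the efficiency point concrete, here is a back-of-the-envelope sketch in Python. The config numbers are made up for illustration and do not describe any real model, and the "2 FLOPs per active parameter per token" rule is only a common approximation for a forward pass. It shows why a MoE model can be much cheaper to serve than a dense model with the same total parameter count: only the shared weights plus the top-k routed experts run for each token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEConfig:
    """Hypothetical mixture-of-experts layout (sizes in billions of parameters)."""
    shared_params_b: float   # always-active weights (attention, embeddings, ...)
    expert_params_b: float   # weights in a single expert
    num_experts: int         # experts stored in the model
    top_k: int               # experts actually routed to per token

    @property
    def total_params_b(self) -> float:
        # Everything that has to be stored in memory.
        return self.shared_params_b + self.expert_params_b * self.num_experts

    @property
    def active_params_b(self) -> float:
        # Only the shared weights plus the top-k experts run for each token.
        return self.shared_params_b + self.expert_params_b * self.top_k


def forward_flops_per_token(active_params_b: float) -> float:
    """Rough forward-pass cost: about 2 FLOPs per active parameter per token."""
    return 2.0 * active_params_b * 1e9


# Illustrative numbers only (assumption, not a real model).
moe = MoEConfig(shared_params_b=10.0, expert_params_b=5.0, num_experts=64, top_k=4)

# A dense model of the same total size activates every parameter per token.
dense_params_b = moe.total_params_b
compute_ratio = (
    forward_flops_per_token(dense_params_b)
    / forward_flops_per_token(moe.active_params_b)
)

print(f"total params:  {moe.total_params_b:.0f}B")
print(f"active params: {moe.active_params_b:.0f}B")
print(f"dense/MoE compute per token: {compute_ratio:.1f}x")
```

This only covers compute per token. Memory still scales with total parameters, and it says nothing about quality, which is the other half of the dense-vs-MoE argument.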

What is interesting is that this seems to be an unexpected blowback of US export controls. I said back then that Chinese models were going to focus on efficiency and architecture rather than brute computational power. My guess is that if Chinese companies had had unrestricted access to Nvidia GPUs, they would have gone dense and their models would cost as much as US models.
 

tamsen_ikard

Captain
Registered Member
There is no shortage of chips in China. The mainstream thinking on this is actually wrong.

China can get any number of Nvidia chips it wants through third-party countries. Smuggling, reselling, cloud renting: all of it is available.

What Chinese companies lack is money. They already have far less money than US companies. Plus, China is much less AGI-pilled and much more cautious about spending on AI. Finally, Chinese VCs also lack the unlimited funding that US VCs have.

When Chinese companies talk about a chip shortage, they don't mean lacking chips due to US sanctions, but lacking chips due to a much lower budget.
 

Kalum Pupeter

Junior Member
Registered Member

Chinese models have now overtaken US ones on openrouter
?
[Link and two attached images (mKZOWIa.png, m0xIWz3.png) not visible without logging in]
 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
It really depends on what you are trying to evaluate.

For typical workloads, the Chinese open source models are as good as the closed source models.

For other work, there are a bunch of things they can't do as well.

I'm not sure why you need to follow AI nerds online to figure this out.

Keep in mind that Chinese open source models are cheaper because they are generally smaller and are mostly MoE models, which are not as good as dense models of the same size. It's a conscious approach, driven by the funding issues they have and by what Chinese customers are willing to pay.

Opus 4.6-4.8 themselves are also not that good once you run the nerf'd versions that don't use as many tokens (due to the current issues Anthropic has with compute). But these AI losers will never admit to that.
 

iewgnem

Captain
Registered Member
This is an OpenRouter metric. Only people who don't want to use the official API (and don't self-host) use OpenRouter, so you're already looking at a small subset of Western users who are fine with massively overpaying for open models just to have them hosted on a Western-based server.
 

iewgnem

Captain
Registered Member
This again? The bench that claims DS cost $4 for 50k tokens? Yeah we know western models are desperate to keep their scam up, but they could use more effort when faking their benches.

Or maybe they used their own models to fake the bench, which would make it even more ironic.


TL;DR

  • Cost inflated ~5×: The benchmark bills all input tokens at the full cache-miss rate ($0.435/M). In reality, 78% of tokens in agent runs are cache hits, which DeepSeek charges at $0.003625/M (99.2% discount). A representative trial reported at $4.36 drops to ~$0.89 with proper cache pricing on the verifiable portion. An additional $0.41 in the reported cost is unexplained (could be reasoning tokens, OpenRouter markup, or both), but the dominant error is the cache-pricing gap. The $4.22 leaderboard average is similarly inflated.
  • We solved all three tasks they failed: Same model (deepseek-v4-pro), same task definitions, same test verifiers. Three tasks, three passes. Combined cache-adjusted API cost: ~$0.86 total ($0.37 bandit + $0.17 termenv + ~$0.31 superjson estimated). For context, DeepSWE reports $4.22/task for this model.
  • OpenRouter privacy guardrail blocks DeepSeek by default: OpenRouter hides providers that may train on data. Without explicitly enabling DeepSeek in privacy settings, the API returns 404s. DeepSWE has no failsafe for this (related to issue [link not visible]). We reproduced the 404 loop.
  • No effort tuning for DeepSeek: deepseek-v4-pro ran at "default" effort (reasoning_effort: null). Every other model on the leaderboard got tuned effort levels (xhigh, max, high, medium). Meanwhile thinking mode was ON by default, burning reasoning tokens at output rates without any configuration.
  • We had zero verifier infrastructure failures: We ran tests directly on the host (no Docker). None of the issues documented in [link not visible] (browser timeouts, Go dependency failures) affected our runs.
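
The cache-pricing arithmetic in the first bullet can be sketched roughly like this in Python. The two rates and the 78% hit fraction are the figures quoted above; the function names are made up for illustration, and the sketch covers input tokens only (no output or reasoning tokens, no OpenRouter markup), so it will not reproduce the ~5× whole-trial figure exactly.

```python
# Rates quoted in the TL;DR above, in USD per million input tokens.
CACHE_MISS_PER_M = 0.435
CACHE_HIT_PER_M = 0.003625


def blended_input_rate(cache_hit_fraction: float) -> float:
    """Effective USD per million input tokens for a given cache-hit share."""
    if not 0.0 <= cache_hit_fraction <= 1.0:
        raise ValueError("cache_hit_fraction must be between 0 and 1")
    return (
        cache_hit_fraction * CACHE_HIT_PER_M
        + (1.0 - cache_hit_fraction) * CACHE_MISS_PER_M
    )


def input_cost_usd(input_tokens: int, cache_hit_fraction: float) -> float:
    """Input-token cost of a run when cache hits are billed at the hit rate."""
    return input_tokens * blended_input_rate(cache_hit_fraction) / 1_000_000


def input_inflation(cache_hit_fraction: float) -> float:
    """How much billing every input token as a cache miss overstates input cost."""
    return CACHE_MISS_PER_M / blended_input_rate(cache_hit_fraction)


hit_discount = 1.0 - CACHE_HIT_PER_M / CACHE_MISS_PER_M

print(f"cache-hit discount:          {hit_discount:.1%}")
print(f"blended rate at 78% hits:    ${blended_input_rate(0.78):.4f}/M")
print(f"input-cost inflation at 78%: {input_inflation(0.78):.2f}x")
```

The point is just that with most input tokens hitting the cache, the effective input rate is a small fraction of the cache-miss rate, so billing everything as a miss overstates the cost several times over.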
 