Artificial Intelligence thread

9dashline

Major
Registered Member
Why do you think they made the cost so high? I guess it’s because they are confident it’s powerful enough compared to anything China and the rest of the world has (excluding Claude). Plus it’s still cheaper than Claude’s frontier model, so I think they will be okay. At some point it won’t be sustainable to keep releasing all your frontier models as open source for free. There will come a time when Chinese models will also have to start focusing on monetization.
Zhipu just seems to be ahead of the curve in this respect. Though I agree they should have made it a bit less expensive to attract more users first and establish themselves like Claude did in enterprise environments.
All the US infra providers like fireworks.ai, deepinfra, novita, etc. are charging the same rates.

Z.ai priced it at a level where they can all make a profit with comfortable margins and thus have an incentive to host GLM-5.2.

If they priced it too low, it would be a race to the bottom, and Z.ai could not sustain all the demand.

Plus they want quality data to feed their flywheel, so it's a self-selection filtering mechanism as well.

With the USG capping Fable from above and Chinese open-weight models catching up from below, the US has basically lost the AI race to China.
 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
Alibaba has a good chance as well.

Qwen has been getting monthly releases, especially Qwen Max.
Qwen devs have stated they are now focusing on achieving SOTA rather than on open source as before.
Qwen Max 3.7 was a big upgrade from Qwen Max 3.6 in just one month and is surprisingly strong in reasoning tasks, but still a bit behind in agent/coding.

They should be releasing a model next week given their previous history. There's also a rumor of Qwen Max 4.0 releasing in the summer.
Qwen 3.7 is like GLM-5.1 in that it was the first Qwen model to look like it has some RSI potential, but the problem for the Alibaba team is whether they got good enough data, since they didn’t open things up and make it as cheap and accessible to developers as Kimi and GLM did.
 

meedicx

Junior Member
Registered Member
Qwen 3.7 is like GLM-5.1 in that it was the first Qwen model to look like it has some RSI potential, but the problem for the Alibaba team is whether they got good enough data, since they didn’t open things up and make it as cheap and accessible to developers as Kimi and GLM did.

I'm skeptical about the user data argument. People have been saying since ChatGPT-3.5 that the model with the most users will snowball by accumulating a data advantage, but this turned out to be false. There are major filtering, distribution and data quality challenges with using user-provided data for training. It's also unlikely that Z.AI acquired enough user data in a few months to make such a big leap.

It seems to me the breakthrough for GLM-5.2 is a change in how they do RL, especially the focus on long-horizon tasks. The technical blog mentions switching the RL algorithm from GRPO to critic-based PPO. DeepSeek R1 saw a major breakthrough using GRPO to enable reasoning, and GLM-5.2 may have achieved a similar-level breakthrough using critic-based PPO for long-horizon task training.
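
As a rough illustration of why that switch could matter for long-horizon tasks, here is a minimal sketch of how the two families typically compute advantages. It is not taken from the GLM-5.2 blog; the function names and toy numbers are made up for this example. GRPO scores a whole response against the other responses sampled for the same prompt, while critic-based PPO uses a learned value estimate to assign credit at each step (here via GAE, a common choice).

```python
from statistics import mean, pstdev


def grpo_advantages(group_rewards, eps=1e-8):
    """GRPO-style advantage: one scalar reward per sampled response,
    normalized against the group sampled for the same prompt.
    (Population std is used here; implementations differ on this detail.)"""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]


def gae_advantages(rewards, values, gamma=1.0, lam=0.95):
    """Critic-based (PPO-style) per-step advantages via GAE.
    `values` holds the critic's estimate for each state and must have
    len(rewards) + 1 entries; the last one is the bootstrap value
    (0.0 if the episode terminated)."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # TD error: how much better this step went than the critic expected.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Four responses to one prompt, two of which passed the check.
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))

# One three-step trajectory with a reward only at the end.
print(gae_advantages([0.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.0], lam=1.0))
```

In the GRPO case every token in a response shares the same advantage, which gets noisy when a trajectory spans hundreds of tool calls. A critic gives a per-step signal, at the cost of having to train a reliable value model.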

The Qwen3.7 Max blog also puts a strong emphasis on RL for long-horizon tasks and describes overcoming similar challenges in preventing reward hacking. So they are already on a similar track to GLM.

By being open weight and transparent about its RL process, GLM-5.2 will lift every other LLM developer, just like DeepSeek R1 did. I expect all the major Chinese LLMs to improve massively by the end of this year as they distill GLM-5.2 and adopt its RL techniques.
 