Artificial Intelligence thread

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member

Interesting news today from Alibaba across the board: Qwen 3.7, the full AI hardware stack and its Bailian platform. Note the language here:

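"On the Bailian inference platform, Alibaba has built a large-scale GPU resource cluster and achieves efficient resource utilization through pooled scheduling.
On this foundation, Bailian eliminates redundant computation through context caching and uses a throughput-elastic scheduling mechanism to handle traffic peaks and troughs and load fluctuations. For quality optimization, Bailian has introduced Agentic RL, a reinforcement learning mechanism based on Agent execution feedback, to drive continuous model iteration. In addition, Bailian has built-in security governance capabilities to ensure that autonomously running Agents never overstep their bounds."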
The announcement really emphasizes Alibaba's own GPU resource cluster and how efficiently it can be used, and it introduces Agentic RL, which can likely be used to improve various models through post-training. But there is no mention of DeepSeek among its selected partners, even though they included everyone else.
 

mossen

Senior Member
Registered Member
I've been harsh on Qwen's big models in the past because they were pretty bad, but 3.7 Max seems like a genuinely good release. Finally, their big models are approaching the much better performance of their smaller models.

Their blog post is up. [link not shown]

I pay a lot of attention to long-context performance and hallucination rates. Artificial Analysis tests for hallucination rates, and there are also stand-alone benchmarks like Halluhard. For long context, Qwen 3.7-Max is best-in-class on MRCR-v2 (128k).

[Attached image: 1.png]
 