Artificial Intelligence thread

dripblackcoffee

New Member
Registered Member
However, this is essential for XAI.

Cursor 2, trained on Kimi K2.5, offers significantly improved programming capabilities, which suggests they have ample private data and post-training capability. While not perfect for XAI, this is still way superior to XAI's current state, which offers little beyond computing resources.
not really, composer 3 will be original according to them, they've basically eaten xai and become elon's lab. for 2.5 they didn't hide the fact that they trained from another model, they just hid which model it's from. It would be interesting to bench their model against glm, since both companies primarily do code
 

dripblackcoffee

New Member
Registered Member
However, it's actually important for Microsoft, which is looking into DeepSeek. Plenty of companies are looking to cut AI cost either through cohesion or replacement, and locking out CXMT is unrealistic because of the memory crunch. Have to say, locking out NIO has to be the most random thing on the entity list. It's truly a company that didn't do anything to get on there. There's not even the slightest argument in favor of it, not one that doesn't apply to every other manufacturer, Chinese, American, French...
 

iewgnem

Captain
Registered Member
not really, composer 3 will be original according to them, they've basically eaten xai and become elon's lab. for 2.5 they didn't hide the fact that they trained from another model, they just hid which model it's from. It would be interesting to bench their model against glm, since both companies primarily do code
Yeah you're not gonna train a brand new model that's competitive with any Chinese lab if your entire expertise in model training is to fork Kimi 2.5 then try hiding it lol.

They're just going to do a better job hiding what they based Composer 3 on, which isn't even a bad idea since it's not like Chinese open models care.

I mean we're at a point where Chinese models are getting to Opus 4.8 while the US just effectively banned development of any model better than 4.8. Forking the latest Chinese models and hiding your tracks is an extremely efficient way to catch up to Anthropic.
 

tokenanalyst

Lieutenant General
Registered Member

Moore Threads has taken the lead in completing the rapid adaptation of Zhipu GLM-5.2.


Moore Threads has successfully completed a Day-0 rapid adaptation of Zhipu’s latest open-source flagship model, GLM-5.2, on its full-featured MTT S5000 AI training and inference GPU. Building on prior optimizations for long-context prefilling and P/D heterogeneous separation from the GLM-5.1 iteration, the technical team utilized the high-performance SGLang-MUSA inference engine and TileLang-MUSA operator programming language to swiftly execute model structure adaptation, key operator tuning, framework deployment, and verification. This achievement highlights the agility of domestic GPU infrastructure in supporting cutting-edge AI models while establishing a replicable engineering framework for complex hardware-software collaboration.

The MTT S5000 is specifically optimized to handle GLM-5.2’s demanding long-context capabilities, which include support for ultra-long 1M token contexts and stable processing of tasks spanning up to eight hours. Powered by native FP8 acceleration delivering up to 1,000 TFLOPS of dense computing power, the GPU features an 80GB memory capacity and 1.6TB/s bandwidth, both critical assets during the computationally intensive long-input prefill stage. By leveraging toolchains like MUSA C++, Triton-MUSA, and TileLang-MUSA, Moore Threads has significantly reduced time to first token (TTFT) and enhanced inference efficiency for applications such as AI coding, RAG systems, and long-document analysis.
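To put the quoted specs in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the three numbers from the paragraph above (1,000 TFLOPS FP8, 80GB, 1.6TB/s) and assumes decimal units and that the peak figures are actually reachable, which real workloads rarely manage. The variable names are made up for illustration and are not from any Moore Threads tooling.

```python
# Back-of-the-envelope roofline numbers from the quoted MTT S5000 specs.
# Assumption: decimal units (1 TB = 1e12 bytes) and peak figures taken at face value.
PEAK_FP8_FLOPS = 1000e12      # 1,000 TFLOPS dense FP8 (from the article)
MEM_BANDWIDTH_BPS = 1.6e12    # 1.6 TB/s memory bandwidth (from the article)
MEM_CAPACITY_BYTES = 80e9     # 80 GB memory capacity (from the article)

# Ridge point: FLOPs a kernel must perform per byte moved before it is
# limited by compute rather than by memory bandwidth.
ridge_flop_per_byte = PEAK_FP8_FLOPS / MEM_BANDWIDTH_BPS

# Time to stream the whole memory once at peak bandwidth.
full_sweep_seconds = MEM_CAPACITY_BYTES / MEM_BANDWIDTH_BPS

print(f"Ridge point: {ridge_flop_per_byte:.0f} FLOP/byte")
print(f"One full memory sweep: {full_sweep_seconds * 1000:.0f} ms")
```

The ridge point is just the ratio of the two peak figures, so it says nothing about how close GLM-5.2 prefill actually gets to either limit on this card.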

GLM-5.2 represents a major leap in open-source large models, excelling in long-horizon development scenarios and achieving top global rankings on the Code Arena platform for front-end and back-end coding tasks. To fully unlock these capabilities, Moore Threads implemented end-to-end optimizations that combine native operator customization with advanced scheduling techniques via SGLang-MUSA. These improvements boost inference throughput and lower response latency without compromising model accuracy, delivering robust performance for AI agent workflows, complex system engineering, and deep debugging applications.

Since the GLM-4.7 release, Moore Threads has maintained a track record of real-time adaptation to every iteration in Zhipu’s Smart Spectrum series. For GLM-5.2, this commitment extends beyond basic compatibility to comprehensive end-to-end support, including prefill optimization, multi-card scaling, KV cache transmission enhancements, and cluster-level TCO reduction. Looking ahead, Moore Threads will continue to leverage the expansive MUSA software ecosystem to rapidly integrate emerging model architectures, accelerating the deployment of high-performance, scalable domestic GPU infrastructure for next-generation AI applications.

 

9dashline

Major
Registered Member
While waiting for my Claude quota to reset for the week, I tested out GLM-5.2 inside z.ai's ZCODE harness. I would say it's better than Opus 4.5 and 4.6 in Claude Code for sure, probably catching up to Opus 4.8 and maybe on par with Opus 4.7

Currently the Claude subscription at $200/mo is still cheaper overall, since I'm getting about 600 million tokens out of it per month when maxing out the 20x plan quota. But it's almost certain that in the long run Anthropic will not be able to afford this sort of money-losing loss leader and will cut it down to start charging folks "real prices"...

The net effect of Anthropic starting to charge "real prices" is that people will go wherever is cheapest, and comparing API rates, the Z.ai API is of course far cheaper than what Anthropic offers
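For anyone who wants to redo the math with their own numbers, here is a minimal sketch of the comparison. The $200/mo and 600 million tokens come from the figures above. The API rate is a placeholder I made up, not an actual Z.ai or Anthropic price, and a single blended rate ignores the input/output/cache split that real API pricing has.

```python
# Figures quoted in the post above.
SUBSCRIPTION_USD_PER_MONTH = 200.0
TOKENS_PER_MONTH = 600_000_000


def usd_per_million_tokens(monthly_usd: float, tokens_per_month: float) -> float:
    """Effective price per million tokens for a flat monthly plan."""
    return monthly_usd / (tokens_per_month / 1_000_000)


def monthly_api_cost(tokens_per_month: float, blended_usd_per_million: float) -> float:
    """Monthly bill for the same volume at a single blended API rate."""
    return tokens_per_month / 1_000_000 * blended_usd_per_million


effective_rate = usd_per_million_tokens(SUBSCRIPTION_USD_PER_MONTH, TOKENS_PER_MONTH)
print(f"Effective subscription rate: ${effective_rate:.2f} per million tokens")

# Placeholder only: plug in a real blended rate from a provider's pricing page.
HYPOTHETICAL_API_USD_PER_MILLION = 1.00
api_cost = monthly_api_cost(TOKENS_PER_MONTH, HYPOTHETICAL_API_USD_PER_MILLION)
print(f"Same volume via API at the placeholder rate: ${api_cost:.2f}/mo")
```

Any API whose blended rate is above the effective subscription rate costs more than the plan at this volume, which is the bar a cheaper provider has to clear.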
 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
I will say that for me K2.7 Code still does some really annoying stuff. I'd give it (imo) pretty clear instructions and it'd just get it wrong for several tries. If it wasn't for Kimi's great Chinese language research, I would've picked GLM, since its coding is clearly better.
 