Artificial Intelligence thread

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
Yes, I realize Kling is not an LLM, which will be problematic for Kuaishou, as the new breed of multi-modal models will be capable of substantially better detail control & instruction following. Seed is more promising in this respect, and I hope ByteDance can scale up on the LLM side to the level of challenging Veo 3, which is the current favorite to "win" the closed AI video race.
For video and image creation, Kling does the job quite well. There are others that do the job of watching video and getting audio out and such. But if you are just looking at creating ads, short dramas and things like that, Kling serves the Chinese market quite well.

Remember, there are many Chinese models that are simply not being benchmarked by these 3rd party lists.
Multi-modal models are useful outside of edge devices.

I work in a content development industry. Multi-modal models are extremely disruptive for content generation. A model cross-trained between videos & text is not just useful for video recognition, it's also useful for generating videos. That's typically done on compute clusters.

The advantage over a model pipeline is that the video generator has a deeper & richer understanding of the association between words, objects, and motion, and can follow instructions much, much better than a typical video model that's only been trained on tags & descriptions. Thus, for instance, you can tell a multi-modal model to modify small details of a picture it generated, which is impossible for a tags-only image model like Stable Diffusion, where you'd have to do manual inpainting.

There are also synergies when doing cross-training - e.g. LLMs' visuo-spatial knowledge is improved by reasoning in a latent space that's been trained on images & videos. Even Claude, which has been hyper-optimized for logic & coding, has native vision capabilities because of this.

Anyway, I'm not saying the future is necessarily native multi-modal models, but that it is a weakness in the Chinese AI ecosystem. Yes, ByteDance has Seed and Alibaba has Qwen, but neither is built off of state-of-the-art LLMs. Consequently, when folks in my industry are looking at options for enterprise-quality content production, they're gravitating towards Google's and OpenAI's solutions, even those that formerly preferred Kling.
Reasoning, frankly, is overrated.

In most AI applications, you don't have the time to generate a bunch of reasoning tokens.

Think about actual multi-modal applications like ADAS, autonomous delivery robots, drones, humanoid robots and things like that. America is quite far behind in these areas. None of that can run on compute clusters.

Most people in the AI community in America have zero clue what it's like to put AI in physical objects that do stuff. I guarantee you very few of the talking heads who follow LLMs understand this stuff.
 

Michael90

Junior Member
Registered Member
Hence the need for more strong Chinese competitors in this space, as well as replacing Baidu with a more competitive search engine
I don't think there's any other viable competitor from China in search. Baidu was China's best chance, but they messed up.
 

tanino

New Member
Registered Member
Hello everyone. I read the (excellent) analysis here that clearly explained the advantage of Mandarin language semantics over languages without idioms (e.g., Latin languages) in AI. I would like to create some infographics and publish them here for the entire forum to use and enjoy. However, I would need a summary outline. Could you help me? Thank you all.
 

StraightEdge

Junior Member
Registered Member
Beijing's policy push is amplifying demand. The "High-Quality AI Compute Infrastructure Plan" sets a 2025 target of 300 EFLOPS, with at least 35% allocated to intelligent computing. Cities like Shanghai are also mandating that new AI centers use over 50% domestic chips.
China's AI infrastructure buildout is accelerating. Yicai reports compute server shipments surged 97.3% in 2024 and will grow another 52.9% in 2025. As of May 26, 2025, 123 AI center contracts had been awarded — 2.2 times more than the same period last year, with full-year tenders expected to hit 213, up from 53.
Domestic chips could exceed 40% market share by mid-year, up from about 30% in 2023, a surge that would have been unimaginable just two years ago.
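
For anyone who wants to sanity-check the growth figures quoted above, here is a rough Python sketch of the arithmetic. It only uses the numbers as quoted in the post (attributed there to Yicai) and does not verify them; the variable names and the `compound` helper are my own, made up for illustration. It also tries both readings of "2.2 times more", since that phrase can mean either 2.2x or an increase of 220%.

```python
# Figures as quoted in the post; none of them are independently verified here.
growth_2024 = 0.973   # server shipment growth in 2024
growth_2025 = 0.529   # projected server shipment growth in 2025


def compound(*rates):
    """Multiply successive growth rates into one cumulative level (base = 1.0)."""
    level = 1.0
    for rate in rates:
        level *= 1.0 + rate
    return level


# Projected 2025 shipment level relative to 2023.
shipments_2025_vs_2023 = compound(growth_2024, growth_2025)

contracts_ytd_2025 = 123       # contracts awarded as of May 26, 2025
tenders_2025_forecast = 213    # expected full-year tenders
tenders_prior_full_year = 53   # prior full-year tenders

tender_multiple = tenders_2025_forecast / tenders_prior_full_year

# Two readings of "2.2 times more than the same period last year".
prior_ytd_if_multiple = contracts_ytd_2025 / 2.2          # read as 2.2x
prior_ytd_if_increase = contracts_ytd_2025 / (1.0 + 2.2)  # read as +220%

print(f"2025 shipments vs 2023: {shipments_2025_vs_2023:.2f}x")
print(f"Full-year tenders vs prior year: {tender_multiple:.2f}x")
print(f"Implied prior-year count to date (2.2x reading): {prior_ytd_if_multiple:.1f}")
print(f"Implied prior-year count to date (+220% reading): {prior_ytd_if_increase:.1f}")
```

If the quoted percentages hold, shipments in 2025 would be roughly three times the 2023 level, and full-year tenders roughly four times the prior year. Note that the plain 2.2x reading implies more contracts in the prior year's January-to-May window than the 53 quoted for that whole year, so either the +220% reading is the intended one or the two figures are counting different things.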
 