Artificial Intelligence thread

Engineer

Major
A recent paper notes that LLMs follow a "Kepler-esque" approach: they can successfully predict the next position in a planet's orbit, but fail to find the underlying explanation, Newton's law of gravity. Instead, they resort to incorrect fitting rules that let them predict the planet's next orbital position, but fail to recover the force vector or generalize to other physics.

A neat idea, but the authors oversimplified the concept of planetary motion. An excerpt from the paper:
For centuries, astronomers and physicists have worked on predicting the orbits of planets around the sun. A groundbreaking model was offered by the astronomer Johannes Kepler in the 17th century. His model was based on geometric patterns: for example, that the orbit of each planet followed an ellipse with the sun at one of its foci. While the model could predict orbits with a near-perfect level of precision, it couldn’t explain why the planets obeyed these geometric orbits or be applied to new problems beyond predicting trajectories.

Later, Isaac Newton expanded on this model using new laws of motion, now known as Newtonian mechanics. These laws involved computing properties of the set of planets in motion, such as their relative velocities and masses. Using these properties, he could derive Kepler’s earlier laws for orbital trajectories, but also go beyond, understanding and formalizing other concepts like force and gravity. From Kepler to Newton, scientists were able to move beyond good predictive models of sequences to a deeper understanding of them. In this section, we test whether a transformer that can predict sequences of orbital trajectories is merely a good sequence model, or whether it has also made the transition to providing a world model.
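The Kepler-from-Newton step the excerpt describes is, for the idealized case of a circular orbit, a short textbook derivation (my sketch, not taken from the paper): equate gravitational attraction with the centripetal force, substitute the orbital period, and Kepler's third law falls out.

```latex
% Circular-orbit sketch: Newton's gravity => Kepler's third law
\frac{G M m}{r^{2}} = \frac{m v^{2}}{r}
\quad\text{with}\quad v = \frac{2\pi r}{T}
\;\Longrightarrow\;
T^{2} = \frac{4\pi^{2}}{G M}\, r^{3}
```

This is the "deeper understanding" in miniature: the same force law that reproduces the geometric orbit also predicts quantities (period vs. radius) that pure curve fitting never yields.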

Maybe the authors didn't have enough space to devote to history, but that very first sentence should have given everyone a clue: if the concept were so simple, humanity wouldn't have spent centuries working on it! Those centuries should really be millennia, because observation started back in ancient times.

The authors trained and used a 109M-parameter model that did nothing except predict the next planetary positions from a sequence of observations. The authors' argument would have been a little more convincing had the model output the planets' velocities as well. With only positions and no mass data given, the natural course of prediction is through N-th order polynomials, and that's exactly what the model did.

Even with full knowledge of Newton's law of gravity, N-th order polynomials are still used to describe the motion of planets in real-world applications, such as NASA's DE series ephemerides. Newton's law yields closed-form solutions only for two-body systems, and the solar system is not one; everything beyond that requires numerical or series approximations. So maybe it wasn't the model that was being stupid here.
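To make the polynomial-fitting point concrete, here is a minimal sketch (my own illustration, not the paper's actual setup): extrapolating a sampled circular orbit with NumPy's `polyfit` predicts the next position almost perfectly, yet the same fit degrades badly farther outside the observed window, precisely because it encodes no force law.

```python
import numpy as np

# Illustrative sketch: predicting the next point of a noise-free orbit
# purely by polynomial extrapolation, with no notion of force or mass.
# A unit circular orbit stands in for the training sequence of positions.

def next_position(times, xs, ys, degree=5):
    """Fit degree-N polynomials to past x(t), y(t) and extrapolate one step."""
    px = np.polyfit(times, xs, degree)
    py = np.polyfit(times, ys, degree)
    t_next = times[-1] + (times[-1] - times[-2])
    return np.polyval(px, t_next), np.polyval(py, t_next)

# Synthetic "observations": two radians of arc sampled at 100 steps
t = np.linspace(0.0, 2.0, 100)
x, y = np.cos(t), np.sin(t)

# One step past the observed window: essentially perfect
x_pred, y_pred = next_position(t, x, y)
t_next = t[-1] + (t[-1] - t[-2])
err = np.hypot(x_pred - np.cos(t_next), y_pred - np.sin(t_next))
print(f"one-step prediction error: {err:.2e}")

# Two full time units past the window: the fit falls apart,
# because a polynomial is not a model of the underlying dynamics
t_far = t[-1] + 2.0
far_err = np.hypot(np.polyval(np.polyfit(t, x, 5), t_far) - np.cos(t_far),
                   np.polyval(np.polyfit(t, y, 5), t_far) - np.sin(t_far))
print(f"far extrapolation error:  {far_err:.2e}")
```

Locally the fit is an excellent predictor, which is all a next-token objective rewards; the failure only shows up when you ask it to generalize beyond the training window.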
 

Wrought

Senior Member
Registered Member
Paper on reduced stress in doctors using AI assistants for medical paperwork.

This multicenter quality improvement study found that use of an ambient AI scribe platform was associated with a significant reduction in burnout, cognitive task load, and time spent documenting, as well as the perception that it could improve patient access to care and increase attention on patient concerns in an ambulatory environment. These findings suggest that AI may help reduce administrative burdens for clinicians and allow more time for meaningful work and professional well-being.

Physicians, who are in short supply and high demand, spend more than half their workday documenting in the EHR, and only a quarter of their time is spent face to face with patients. The proportion of time spent documenting continues to escalate, especially for primary care professionals, and is associated with burnout, reduction in work effort, and turnover.


The National Academy of Medicine convened a meeting in December 2024 on the potential for AI to improve health worker well-being (eg, reduce burnout). To date, there are scant, mostly single-center data assessing whether this technology could reduce administrative burden, liberate time for patients, and reduce professional burnout.
 

Eventine

Senior Member
Registered Member
Some thoughts on the release of Sora 2 that's taken the world by storm.

With this release, I think it's fair to say that OpenAI has taken the crown on closed-source video generation. Rather than post carefully curated OpenAI demo videos, I'd point instead to a couple of community-made videos that, I think, illustrate a capability even Google's Veo 3 never reached.

The key here is that the videos are actually funny. If you look up other examples of Sora 2's generations, you'll notice it has a real sense of scene composition and dramatic timing, which it combines with snappy generated dialogue, sound effects, and music to create an overall feel that is more and more difficult to distinguish from human creators.

Of course, it still has plenty of flaws, but if you compare it to the "robotic" videos from Veo 3, Kling, and Wan, it is plain that there's something different about Sora. Yes, Veo 3 was groundbreaking for its ability to generate audio and video at the same time, but Sora 2 can actually direct.

If I had to guess, this is an intrinsically multimodal model built on a fine-tuned variant of GPT-5 trained for creative work, which combines the language and writing capabilities of LLMs with a video model trained on however many works OpenAI was able to steal from the internet, copyrighted content included (the fact that the model understands celebrities, anime characters, etc. indicates it was trained on copyrighted material).

The irony of the West contributing to its own social-media wasteland with more AI-generated viral garbage aside, I reckon this places Western big AI labs a full generation ahead of Chinese labs on video generation (where before, I would argue, Veo 3 was about 0.5 generations ahead). It would seem that, for the time being, the hyperscalers have been proven right about something: the most dominant video models come from hyperscalers with tremendous computational resources (OpenAI, Google; it is telling that Anthropic is not even in the race for video generation).

Somehow, OpenAI was able to overtake labs like Google, ByteDance, and Kuaishou that, in theory, have much better access to video data - but then again, Alibaba proved the same thing was possible.

All in all, I'd say that the current state of affairs has the West about a generation ahead of China in video (Sora 2) and audio (Suno) generation, and less than a generation ahead in LLMs (but still ahead). Image generation is a closer race, with Nano Banana slightly better than Qwen 2509 but drastically held back by censorship - honestly, not a significant gap.
 

Randomuser

Captain
Registered Member

Anthropic's 'anti-China' stance triggers exit of star AI researcher

Chinese researcher Yao Shunyu joins Google DeepMind after Anthropic labels China as an 'adversarial nation'


A star Chinese artificial intelligence researcher has left Anthropic to join a rival company, citing the American AI start-up’s “anti-China statements” as a key reason for his departure.

Yao Shunyu, according to a post on his website on Monday, left Anthropic after less than a year to join Google DeepMind, partly because of his “strong” opposition to the start-up’s characterisation of China as an “adversarial nation” and its broader rhetoric.

Anthropic labelled China as such last month when it began barring subsidiaries of Chinese companies anywhere in the world from accessing its services.

“Although, to be clear, I believe most of the people at Anthropic disagree with such a [characterisation], I don’t think there is a way for me to stay,” Yao wrote.


I always wondered why Anthropic and Claude bothered me. Turns out they don't like us.


 

Nevermore

Junior Member
Registered Member
Some thoughts on the release of Sora 2 that's taken the world by storm.

With this release, I think it's fair to say that Open AI has taken the crown on video generation (closed source). [...]
Is it because Chinese companies still lack sufficient computing power? It remains to be seen whether Huawei's increased production of high-end chips can accelerate China's AI catch-up efforts.
 

tphuang

General
Staff member
Super Moderator
VIP Professional
Registered Member
Is it because Chinese companies still lack sufficient computing power? It remains to be seen whether Huawei's increased production of high-end chips can accelerate China's AI catch-up efforts.
No. How many times do I need to explain this to people? China does not lack computing power.

Individual firms may lack computing power because they are not as lavishly funded as the firms riding the AI bubble in America.

As for the quality of models, I wouldn't judge things based on just one member's comment. But it is entirely possible that Chinese models are behind in one area and not another.

It's pretty clear so far that when it comes to video generation, people are just using it for mindless slop. I'm not sure how that's even useful.

I've been pretty clear for a while now that Qwen is going to win this, because they have the best set of products and they are all open source.
 

AI Scholar

New Member
Registered Member
Having seen Sora 2, I'd say it follows a paradigm shift similar to the one from non-reasoning to reasoning models. This new paradigm will likely be widely understood and adopted soon.

In raw video generation capabilities, China continues to dominate key benchmarks. I believe China will also quickly take the lead in this new paradigm now that OpenAI has demonstrated it. For context, the current image-to-video leaderboard has positions #1 through #9 all held by Chinese models. [Screenshot: image-to-video leaderboard]
We’re probably witnessing a repeat of last year’s dynamic. The U.S. surges ahead with a new generation of models, only for China to close the gap faster than expected with its own new generation. This only became evident to the West with DeepSeek R1, when China finally caught up, but the balance in AI research has been shifting in China’s favor for years, obvious to anyone tracking the trends.

Now China is in a much better position, with V3 leading in non-reasoning models, R1 still the leader in creative writing, Kling 2.0 the best video model (now with sound generation), and Kolors 2.0 and Seedream 3.0 on par with the U.S. in image generation; Minimax is highly competitive in audio generation and Mureka in music generation. All of those AIs are available globally. As recently as early last year, China had no internationally available AI services at all other than DeepSeek. China is also decisively surpassing the U.S. in robotics and self-driving; the balance is shifting permanently. Unless AGI resets the race, China's trajectory points to undisputed AI leadership across all domains within years. There might be moments when the U.S. pulls ahead again, but those will grow shorter and shorter, until AI becomes like EVs.

As I've said before, China's growth trajectory is mind-blowing. The gaps are shrinking incredibly fast, and China's overall position in AI grows stronger every month. We've gone from having barely any Chinese AI services that were competitive internationally to now leading in most key areas, with strong Chinese options available in every single field.

For example, Seedream 4.0 is still the best image model. It leads in text-to-image (1210 Elo vs. 1169 for Imagen 4) and is only 4 points behind in image editing. With models like Qwen leading in open source, China is clearly ahead in image generation.

In audio, Minimax's Hailuo has the highest Elo in text-to-speech. I haven't seen Mureka (a Chinese international music generator) on leaderboards yet, but I believe it's competitive with other music generators. There's also ByteDance's own music generator in Doubao, which is available internationally and performs well, though it doesn't generate full songs in English.

So overall, China is leading in most key areas, like video, image, speech, open-source, non-reasoning LLMs, and cost-effective LLMs, and is quickly catching up in others, such as music, frontier reasoning LLMs, and the cinematic video of Sora 2.

The perception that Chinese AI is behind is largely a result of superior marketing by U.S. labs, while the reality is increasingly moving in the opposite direction.
 

manqiangrexue

Brigadier
Anthropic's 'anti-China' stance triggers exit of star AI researcher

[...]

I always wondered why Anthropic and Claude bothered me. Turns out they don't like us.
But he joined Google?? If he left for patriotic reasons, he might as well join a Chinese company.
 