Artificial Intelligence thread

Topazchen · May 22, 2025

"A Chinese quantitative trading fund has submitted a research paper to one of the world’s top artificial intelligence (AI) conferences, detailing a new training technique that it said could outperform mainstream methods employed by leading AI research labs, in a move that mirrors the path taken by DeepSeek.

Shanghai Goku Technologies, established in 2015, submitted the paper to the Conference on Neural Information Processing Systems – an annual gathering of top scientists in machine learning and AI that is often referred to as the “AI Olympics”.

In its paper, Goku laid out the limits of popular AI training methods – including supervised fine-tuning (SFT) and reinforcement learning (RL) – and proposed a so-called step-wise adaptive hybrid training framework called SASR, which it said was inspired by the way humans develop reasoning capabilities.

SFT and RL are key techniques used by companies such as Microsoft-backed OpenAI and DeepSeek to train their AI models. DeepSeek previously highlighted the significance of SFT and RL in enhancing the performance of its V3 model, which made waves in the global technology community upon release in December.

“Experimental results demonstrate that SASR outperforms SFT, RL and static hybrid training methods,” the Goku team wrote in its paper, which was co-authored with researchers from Shanghai Jiao Tong University and Goku’s newly established AI subsidiary, Shanghai AllMind Artificial Intelligence Technology.

Goku, which operates under the slogan “logic and truth are the only principles we obey”, did not immediately respond to a request for comment on Thursday.

The company’s breakthrough in AI model training underscores China’s progress in this field, while showing the limits of Washington’s policies to curb the country’s AI advances through hardware restrictions.

Nvidia CEO Jensen Huang recently said that US curbs intended to contain China’s technological ascent were ineffective, noting that “China has 50 per cent of the world’s AI developers”.

China a ‘key market’, says Nvidia CEO Huang during Beijing visit as US bans AI chips

In particular, DeepSeek – a prominent Chinese AI start-up that originated as a spin-off from the hedge fund High-Flyer – has generated significant attention by demonstrating China’s potential to achieve AI leadership through advances in algorithms and more efficient integration between hardware and software.

AllMind was registered on Monday, the same day Goku published its research, according to records from Chinese business registry service Qcc.com.

Goku founder Wang Xiao, listed as AllMind’s legal representative, said the start-up was created to explore the technological boundaries of AI models, according to a report from China Securities Journal.

Goku’s establishment of a dedicated facility for advanced research mirrors the approach taken by High-Flyer, which founded DeepSeek as an independent subsidiary in 2023.

As of the end of last year, Goku managed over 15 billion yuan (US$2.1 billion) in domestic and global assets, using AI-driven strategies, according to its official website."

Eventine · May 22, 2025

Honestly, you have to release a model for your research to be taken seriously these days. Meta was pushing out tons of research, but with the failure of Llama 4 they’ve basically been pushed out of the race by Google & OpenAI. DeepSeek was doing lots of great research, but only began to be taken seriously post V3/R1.

More Chinese companies need to enter the global AI race. China can’t afford another Google moment, where Chinese platforms (looking at you, Baidu) survive only in China because of the closed market and not because they are globally competitive, while the West dominates everywhere else. Platform advantages are compounding; Google’s successes in AI are a consequence of their search engine dominance.

Eventine · May 22, 2025

Anthropic just dropped their latest series of models, Claude 4 (Opus + Sonnet)

The main improvements are in a category called "agentic coding", where it greatly improves on the state of the art (probably the one area Anthropic has historically been dominant in). Everything else is at or below the state of the art from Google and OpenAI.

Just need Grok 3.5 now to complete the newest generation of Western models, and now we wait for the Chinese response (since Chinese models have now effectively fallen 1 generation behind).

tphuang · May 26, 2025

Alibaba is bringing Youth Cloud to rural schools across China, providing them with AI computing resources.

https://twitter.com/i/web/status/1927000183922937913

tphuang · May 27, 2025

https://twitter.com/i/web/status/1927314352354324789

Alibaba and SAP are cooperating to put Qwen models in SAP's AI core for deep search and similar features in SAP's China-market product, with plans to then expand to ASEAN, MENA and Africa.

tphuang · May 27, 2025

https://twitter.com/i/web/status/1927314352354324789

Shandong has put its provincial computation platform online, with 20 EFLOPS of compute. Several provinces have already done so.

solarz · May 27, 2025

Pretty disappointed in DeepSeek right now. I tried asking it some sensitive questions about US/Israel relations, and it could only regurgitate mainstream talking points. When I tried guiding the conversation toward some sensitive points, it crapped out with the "server busy" message. I tried asking the same thing in Chinese, and it crapped out right away.

In contrast, ChatGPT was able to hold a decent conversation about it.

MortyandRick · May 27, 2025

solarz said:
Pretty disappointed in DeepSeek right now. I tried asking it some sensitive questions about US/Israel relations, and it could only regurgitate mainstream talking points. When I tried guiding the conversation toward some sensitive points, it crapped out with the "server busy" message. I tried asking the same thing in Chinese, and it crapped out right away.

In contrast, ChatGPT was able to hold a decent conversation about it.

You should try Qwen. I find it much better than DeepSeek and ChatGPT.

Michael90 · May 27, 2025

solarz said:
Pretty disappointed in DeepSeek right now. I tried asking it some sensitive questions about US/Israel relations, and it could only regurgitate mainstream talking points. When I tried guiding the conversation toward some sensitive points, it crapped out with the "server busy" message. I tried asking the same thing in Chinese, and it crapped out right away.

In contrast, ChatGPT was able to hold a decent conversation about it.

Lol, don't take it too seriously. Imo DeepSeek has done its job well, i.e. positioning Chinese AI at the forefront of the world and making it more visible and competitive. To be honest, I don't know if DeepSeek will be able to keep up in the long term with competitors from huge corporations like Alibaba, Google, Microsoft, Tencent, OpenAI, etc., who have more capital and a long-term focus on this field, since most of their future business will rely heavily on AI. Hence they have every reason to keep investing heavily in it. Compare that to DeepSeek, who seem to be doing it more as a side business, as it wasn't even their main focus in the beginning. It remains to be seen whether they will get much more involved in this field going forward.

solarz · May 27, 2025

MortyandRick said:
You should try Qwen. I find it much better than DeepSeek and ChatGPT.

I tried it. Unfortunately it seems to be trained on the same liberal data, and it censors discussion in a way that makes China look guilty and in fact prevents questioning of Western hypocrisy.
