Artificial Intelligence thread

Topazchen

Junior Member
Registered Member

"A Chinese quantitative trading fund has submitted a research paper to one of the world’s top artificial intelligence (AI) conferences, detailing a new training technique that it said could outperform mainstream methods employed by leading AI research labs, in a move that mirrors the path taken by DeepSeek.
Shanghai Goku Technologies, established in 2015, submitted the paper to the Conference on Neural Information Processing Systems – an annual gathering of top scientists in machine learning and AI that is often referred to as the “AI Olympics”.

In its paper, Goku laid out the limits of popular AI training methods – including supervised fine-tuning (SFT) and reinforcement learning (RL) – and proposed a so-called step-wise adaptive hybrid training framework called SASR, which it said was inspired by the way humans develop reasoning capabilities.

SFT and RL are key techniques used by companies such as Microsoft-backed OpenAI and DeepSeek to train their AI models. DeepSeek previously highlighted the significance of SFT and RL in enhancing the performance of its V3 model, which made waves in the global technology community upon release in December.


“Experimental results demonstrate that SASR outperforms SFT, RL and static hybrid training methods,” the Goku team wrote in its paper, which was co-authored with researchers from Shanghai Jiao Tong University and Goku’s newly established AI subsidiary, Shanghai AllMind Artificial Intelligence Technology.

Goku, which operates under the slogan “logic and truth are the only principles we obey”, did not immediately respond to a request for comment on Thursday.

The company’s breakthrough in AI model training underscores China’s progress in this field, while showing the limits of Washington’s policies to curb the country’s AI advances through hardware restrictions.

Nvidia CEO Jensen Huang recently said that US curbs intended to contain China’s technological ascent were ineffective, noting that “China has 50 per cent of the world’s AI developers”.

In particular, DeepSeek – a prominent Chinese AI start-up that originated as a spin-off from the hedge fund High-Flyer – has generated significant attention by demonstrating China’s potential to achieve AI leadership through advances in algorithms and more efficient integration between hardware and software.
AllMind was registered on Monday, the same day Goku published its research, according to records from Chinese business registry service Qcc.com.

Goku founder Wang Xiao, listed as AllMind’s legal representative, said the start-up was created to explore the technological boundaries of AI models, according to a report from China Securities Journal.

Goku’s establishment of a dedicated facility for advanced research mirrors the approach taken by High-Flyer, which founded DeepSeek as an independent subsidiary in 2023.

As of the end of last year, Goku managed over 15 billion yuan (US$2.1 billion) in domestic and global assets, using AI-driven strategies, according to its official website."
 

Eventine

Junior Member
Registered Member
Honestly, you have to release a model for your research to be taken seriously these days. Meta was pushing out tons of research, but with the failure of Llama 4 they’ve basically been pushed out of the race by Google & OpenAI. DeepSeek was doing lots of great research, but only began to be taken seriously post V3/R1.

More Chinese companies need to enter the global AI race. China can’t afford another Google moment, where Chinese platforms (looking at you, Baidu) only survive in China because of the closed market and not because they are globally competitive, while the West dominates everywhere else. Platform advantages are compounding; Google’s successes in AI are a consequence of their search engine dominance.
 

Eventine

Junior Member
Registered Member
Anthropic just dropped their latest series of models, Claude 4 (Opus + Sonnet).

The main improvements are in a category called "agentic coding", where it greatly improves on the state of the art (probably the one area Anthropic has historically been dominant in). Everything else is <= the state of the art from Google and OpenAI.


Just need Grok 3.5 now to complete the newest generation of Western models, and then we wait for the Chinese response (since Chinese models have now effectively fallen one generation behind).
 

solarz

Brigadier
Pretty disappointed in DeepSeek right now. I tried asking it some sensitive questions about US/Israel relations, and it could only regurgitate mainstream talking points. When I tried guiding the conversation toward some sensitive points, it crapped out with the "server busy" message. I tried asking the same thing in Chinese, and it crapped out right away.

In contrast, ChatGPT was able to hold a decent conversation about it.
 

MortyandRick

Senior Member
Registered Member
You should try Qwen. I find it much better than DeepSeek and ChatGPT.
 

Michael90

Junior Member
Registered Member
Lol, don't take it too seriously. Imo DeepSeek has done its job well, i.e. positioning Chinese AI at the forefront of the world and making it more visible and competitive. To be honest, I don't know if DeepSeek will be able to keep up in the long term with competitors from huge corporations like Alibaba, Google, Microsoft, Tencent, OpenAI, etc., who have more capital and a long-term focus on this field, since most of their future business will rely heavily on AI. Hence they have every reason to keep investing heavily in it going forward. DeepSeek, by comparison, seems to be doing it more as a side business, as AI wasn't even their main focus in the beginning. It remains to be seen whether they will get much more involved and focused in this field in the future.
 