Artificial Intelligence thread

lych470

Junior Member
Registered Member
It's not creating the model to gain a competitive advantage that's interesting.
It's releasing the model so everyone, including other hedge funds, can share in your advantage, which nullifies it. That's what's interesting.

The PLA's military procurement philosophy is '探索一代、预研一代、研制一代、生产一代' ('explore a generation, pre-research a generation, develop a generation, produce a generation'). I imagine there is a similar ethos in Chinese AI companies. I would be very surprised if they don't have something better up their sleeves.
 

luminary

Senior Member
Registered Member
DeepSeek-R1 model is now available as an NVIDIA NIM microservice preview
So for tokens per second:
Cerebras ??: 1600
Nvidia H200: 3800

I don't think Steve Hsu knows what he's talking about. Cerebras, Groq, SambaNova, etc. are gimmicks.

Cerebras probably claimed the 57x number off of this.
Indeed, on that page the 57x number funnily enough comes up again: the WSE-3 is 57x larger than an H100 die...
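For what it's worth, that ratio checks out as a die-area comparison rather than a throughput one. A minimal sanity check, assuming the commonly cited vendor figures (WSE-3 wafer ≈ 46,225 mm², H100 die ≈ 814 mm²; neither number comes from this thread):

```python
# Rough check of the "57x" claim as a die-area ratio.
# Assumed figures: WSE-3 ~46,225 mm^2 (Cerebras), H100 die ~814 mm^2 (NVIDIA).
wse3_area_mm2 = 46_225
h100_area_mm2 = 814

ratio = wse3_area_mm2 / h100_area_mm2
print(f"WSE-3 is ~{ratio:.1f}x the area of an H100 die")  # ~56.8x, i.e. the ~57x figure
```

So the 57x is silicon area, not tokens per second.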
 

GulfLander

Colonel
Registered Member
"UC Berkeley researchers have developed a small-scale language model reproduction of DeepSeek R1-Zero, an AI language model developed in China, for about $30.[...]
The language model TinyZero is a project led by campus graduate researcher Jiayi Pan and three other researchers, advised by campus professor Alane Suhr and University of Illinois at Urbana-Champaign assistant professor Hao Peng.[...] TinyZero is a small-scale reproduction, with the $30 price going toward server costs to run the experiments. TinyZero is “only useful for very restricted types of tasks” such as countdown and multiplication tasks[...]"
