Artificial Intelligence thread

Eventine

Junior Member
Registered Member
Just did a Sesame demo.... that is one heck of a nice voice model for interaction... impressed
The authors of the technical report are: Johan Schalkwyk, Ankit Kumar, Dan Lyth, Sefik Emre Eskimez, Zack Hodari, Cinjon Resnick, Ramon Sanabria, and Raven Jiang.

I only spy one Chinese last name in the list, which is surprising, although most of these names don't appear American either; the depth of the American talent pool due to immigration should not be underestimated.
 

Hyper

Junior Member
Registered Member
My post on this, and someone's response to it:


The key is that DeepSeek inference is really efficient, and this can be scaled up.

I don't know how much they need to scale up data centers, but clearly China's current data centers are enough to meet all this increased demand.
What DeepSeek has achieved is so impressive. It couldn't have been achieved by BAT, ByteDance, Closed AI, Google, etc. DeepSeek's quant background means it has extremely talented and skilled systems engineers. Building everything from scratch is very impressive. No one else could have achieved it.
 

tankphobia

Senior Member
Registered Member
Honestly, this current AI arms race is quite terrifying. No safety mechanisms are being put in place to ensure it will not be misused, and there won't be, given the potential of an AGI. With huge strides being made in robotics and AI, what will billions of people do for work once everything can be automated?

Instead of liberating humans for creative work, AI seems quite capable of simply replacing artists, poets, and writers.
 

Legume7

New Member
Registered Member
Coping and seething from an OpenAI research scientist:
Every American lab has shown its hand post-R1, and none managed to convince. At this point, the entire $10T+ house of cards is going to start coming down once R2 is released in March/April (likely April).

OpenAI is screwed the most. Their largest investor is Microsoft, whose largest source of revenue is Azure (cloud). Azure is already serving DeepSeek V3/R1 and will have to implement DeepSeek's latest optimizations as a matter of basic market economics. But doing so would increase DeepSeek's market share and damage the perception of Microsoft as an AI leader, which is what their ~$3T valuation is based on. GPT-4.5 is a flop, and OpenAI was caught lying on the model card multiple times. First, they compared GPT-4.5 to o1-mini but referred to it as the full o1. They also claimed a 10x increase in training efficiency, but removed the claim once people pointed out it was incompatible with their token costs. o3 was so computationally expensive ($3,000 per query at the highest thinking time) that they cannot release it as a standalone model, instead having to fold it into GPT-5. They were also caught cheating on the FrontierMath benchmark last month.

Anthropic is in a similar situation. Claude 3.7 is impressive, but too expensive for its performance. I assume they are also bleeding billions per year. Their biggest investor is Amazon, and the point about Azure and OpenAI applies equally to AWS and Anthropic. Also, Anthropic's CEO Dario Amodei claimed that DeepSeek smuggled GPUs and was lying about its numbers. He's now caught with his pants down given today's release. He also has a history of sketchy behavior. He admitted that he discovered scaling laws while working at Baidu, but when he moved to OpenAI and published a paper on scaling laws, he didn't cite Baidu's work. The mythology that the founders of OpenAI and Anthropic (who are ex-OpenAI) discovered scaling laws is one of the reasons they are still perceived to be leading in AI.

Meta Generative AI will also throw in the towel soon. Llama 4 was delayed due to R1, and if they don't get it out before R2, it's game over. Even if they do, it's unlikely to outperform R1. Their ads team is already using DeepSeek instead of Llama.

xAI and Google will stay around longer because of their comparatively larger financial resources. They are willing to bleed money to serve their models at low cost. Gemini isn't impressive, and Grok barely outperformed DeepSeek on benchmarks it specifically trained for. Also, while all American labs are bleeding cash, I suspect xAI is bleeding the most relative to its market share. xAI has the largest cluster and training costs, and they have nowhere near the most users. I also have inside information that xAI is paying many people thousands per week to solve math/coding problems so their solutions can be used to train Grok. This isn't factored into the publicly revealed costs, and it also suggests their algorithmic improvements are quite poor. I don't see how they can scale this up either.

Nvidia is also screwed. Jevons paradox doesn't apply when existing GPUs can meet all inference needs. They will still have a market for training, but even then, the only major purchasers of Blackwell are the above five companies, the exact five companies that are bleeding billions while still struggling against DeepSeek.

For Chinese companies, as mentioned earlier in this thread, four companies are alive in the race for AGI: DeepSeek, Moonshot (makers of Kimi), ByteDance, and Alibaba. These are the only labs with reasoning models. All of them have a positive outlook, and it's worth noting that Alibaba is the largest investor in Moonshot. On the other hand, the head of Qwen left for ByteDance, which indicates that ByteDance has more resources and/or desirability. There will also be a Qwen release next week.


Of course, other Chinese companies have very strong models for specialized tasks, but I'm only referring to AGI here, since China already dominates those categories.
 

iewgnem

Senior Member
Registered Member
DeepSeek's profit margin would look very different if they had shorted NVIDIA before releasing R1 lol
 

Hyper

Junior Member
Registered Member
I don't think either BAT or ByteDance stands a chance against DeepSeek. They will lose. We probably have a winner. Also, paying people to solve math questions is not bad; Springer is notorious for incomplete proofs and examples left as an exercise to the reader.
 

Eventine

Junior Member
Registered Member
I don't know why people are calling the outcome so prematurely. The global race is just beginning, and we should hope for more competitors from China, not fewer, as there are still plenty of hungry, ambitious, and capable startups emerging from the West. The Sesame demo from earlier is a great example: no Chinese company has achieved that yet.

It's like EVs. China has, by far, the most dynamic and competitive EV industry. Just look at how many Chinese contenders there are in EVs. If it were just BYD by itself, it wouldn't be a healthy scene, and it would be easy for the West to target. It's only when you reach critical domestic saturation that you start the process toward external domination.

DeepSeek won the opening battle, but this field moves fast enough that if DeepSeek V4 is a bust, the lead could quickly shift to another player.
 

Hyper

Junior Member
Registered Member
All such analogies are incomplete. DeepSeek can retain talent far better than the others. It is backed by a profitable quant fund, High-Flyer, and DeepSeek is the only profitable LLM lab in the world.
 

Legume7

New Member
Registered Member
I never said that DeepSeek would win the race to "AGI." I simply said that today's reveal of DeepSeek's financials exposes that US big tech can never hope to justify their combined valuation of over $10T, which is based on the belief that they have no competitors for "AGI." Right now, the five labs are bankrupting themselves to a combined stalemate against DeepSeek, while DeepSeek is already profitable. Furthermore, the rate of performance increase for their best models has stalled, which calls their past claims into question. Before, they could justify their billion-dollar losses, but now investors will start asking questions. A good analogy is what is happening to Tesla in the auto industry.
 