Artificial Intelligence thread

mossen

Senior Member
Registered Member


There have been many false dawns but is V4 finally on the horizon? 1.6T is pretty decent, but you really need 10T to compete with Mythos-tier models. I guess that's for next year or the year after that.

Kimi has been going from strength to strength so it's time to finally see if DeepSeek can take the crown of "best AI startup in China" back.
 

HighGround

Senior Member
Registered Member
I'm very skeptical of the Mythos hype. I think there's a very good reason they're not releasing it, and it has little to do with their concern over "cyber-security".
 

9dashline

Captain
Registered Member
It's 10 trillion parameters in size, twice as expensive as Opus... not unlike back when OpenAI was scaling test-time compute on the o1/o3 models, spending $1000 per task (for work that cost a human $5) and saying it scored high on AGI benchmarks.
 

HighGround

Senior Member
Registered Member
I don't believe it's just cost and compute resources required. Though yes, I think that's the #1 problem for Anthropic. The compute resources and costs required to deploy this are undoubtedly enormous.

I think they have equally large problems with just deploying it and keeping it stable. There's probably a significant learning curve to just run this model.

I also suspect that the behaviors and outputs get, maybe not worse, but "weirder". I don't know if others have occasionally had the experience where the code, output or answer you get from Claude (or GPT Pro) is just weird: overexplained, overcomplicated, or somewhat strange. I think this kind of behavior could get worse as you scale up. Plus any other unexpected or strange behaviors that come from a model that's an order of magnitude larger than what we're used to seeing.

----------------------------------------------------------------------------

Anyway, I'm not an AI person at all, just a user. But my suspicion is that this "Mythos cybersecurity" nonsense is just a clever marketing ruse. If you just look at Opus 4.7 vs 4.6, it's a nice but pretty "meh" bump in performance. What would the conversation be if "Mythos" didn't exist or wasn't made public?

We'd probably be hearing about how Chinese models are starting to catch up again with Kimi 2.6 nipping at the heels of Opus... but thanks to the Mythos hype we're still living in a world where the consensus is that US AI giants are meaningfully ahead.
 

SanWenYu

Major
Registered Member
I'm very skeptical of the Mythos hype. I think there's a very good reason they're not releasing it, and it has little to do with their concern over "cyber-security".

The Mozilla team thinks it is real and as capable as "elite (human) security researchers":
Elite security researchers find bugs that fuzzers can’t largely by reasoning through the source code. This is effective, but time-consuming and bottlenecked on scarce human expertise. Computers were completely incapable of doing this a few months ago, and now they excel at it. We have many years of experience picking apart the work of the world’s best security researchers, and Mythos Preview is every bit as capable. So far we’ve found no category or complexity of vulnerability that humans can find that this model can’t.

Mozilla didn't disclose how many false positives they had, though. Hopefully they will have follow-up posts in the future.
 

tokenanalyst

Lieutenant General
Registered Member
I think there is a psychosis spreading with this technology, mostly propped up by the hype of US AI CEOs. LLMs are mostly databases with probabilistic outputs. They don't have nuance or consciousness of the things they are outputting, and that is dangerous because these databases output content faster than humans with nuance can review it.
 

9dashline

Captain
Registered Member
China needs to steal the weights to Mythos, but Anthropic is guarding them more securely than nuclear codes, way more than they guarded the Claude Code source code...

The original US plan was to get to Mythos level while China was hopelessly behind (think Llama 2), then ride the global AI API tax to save the petrodollar hegemony.

Instead, Kimi 2.6 is Opus 4.5 tier and Qwen 3.6 is Sonnet level... and the Iran war went sideways, and with China banning rare earth exports, half of US data centers got delayed.
 

Eventine

Senior Member
Registered Member
OpenAI appears to be gearing up for a big release.

On the leaderboards, GPT Image 2 appears to be a full +250 points above its closest competitors in Text to Image, which is a generational difference. This is probably because the model actually does multiple passes to generate an image, kind of like a thinking mode but for images. Naturally this balloons the cost of running the model and returning results in a reasonable amount of time, so it can really only be practical server-side with hyperscaled inference compute.

There are also rumors of a GPT 5.5 that should be competitive with or better than Opus 4.7. I don't think it'd be Mythos level, though.

Either way, hyperscaling appears not to have reached its ceiling just yet, especially in the context of incorporating "thinking mode" into image and video models. I wouldn't be surprised if Western labs are working on just that behind the scenes in an effort to leapfrog the current dominance of Chinese video models.

That said, it remains to be seen whether any of this is actually cost-efficient or sustainable from a market perspective. Will people really pay $500 to generate a high-quality, 5-minute video?
 