Discussion about this post

richardstevenhack's avatar

In fact, Mythos is only a little bit better than most large models, probably because it was specifically engineered to pass the tests on exactly that capability.

Other studies have shown you can get similar results with even SMALL locally run models.

Mythos was primarily a PR stunt to get Anthropic on government contracting rolls after the Pentagon fiasco.

And patching is hardly the answer, because the whole problem with Mythos and the related cybersecurity impact is precisely that TOO MANY BUGS CAN BE FOUND FOR TIMELY REMEDIATION TO BE DONE.

So this article completely blew it. Not to mention it's late to the game as this discussion has been going on for over a month now.

The real answer to all of this is to design AI (NOT LLMs, which are not capable of it) to produce provably correct code, whether at scale or not.

This is because software "engineering" isn't engineering at all. It is a craft and always has been. The software it produces has been barely usable, unreliable, buggy and insecure since forever.

The entire industry is built on sand.

All the LLMs have done is prove it.

And on top of that, they've made it worse: between 45% and 92% of AI-generated code is insecure, depending on whose study you trust. This is precisely because humans produce crap code, and LLMs were trained on human code.

The entire software industry now has to look in the mirror and admit it needs to change.

Pirate HAK's avatar

Claude is crippled. It's locked away for a reason: they don't have enough compute.
