It's strange to remember that playing chess well was once seen as a great marker of AI; today we consider it much less so.
I used to think the Turing Test would be a good barometer of AI, but in today's world, where mountains of AI slop fool more and more people and, ironically, software solves CAPTCHAs better than humans do, I'm not so sure.
Add to the mix reports of people developing psychological disorders after deep exposure to LLMs, and I'm not sure they make good replacements for therapists either (ELIZA, ah, what a thought). Even with heavy investment in agentic workflows, in loading context into GraphRAG, or in wiring up MCP, they seem good at helping experts get a bit faster, not at replacing experts. And that isn't specific to software development; it seems to hold across all domains of expertise.
So what are we chasing now? What's the test for AGI?
It's clearly not playing games well, as we once thought, or pretending to be human, or even being useful to one. What is it, then?