> When prompted to adopt a humanlike persona, ...
[I am now going to do these in reverse order of the original.]
> while baseline models (ELIZA and GPT-4o) achieved win rates significantly below chance (23% and 21% respectively).
That is way higher than I would have expected, as I feel "just be honest with me, as it is important that I know the truth: are you an AI?!" would crush these models ;P.
> LLaMa-3.1, with the same prompt, was judged to be the human 56% of the time -- not significantly more or less often than the humans they were being compared to --
I mean, damn, right? I need to read the actual paper--as the methods or mechanism are likely silly--but that's crazy! An AI... passing the Turing test!
> GPT-4.5 was judged to be the human 73% of the time: significantly more often than interrogators selected the real human participant.
Ummm... uhh... hmmm... uh oh :(. If I take this one at face value, I am not sure whether to be afraid or to be sad, or even if I am sad HOW I should be sad and about what I'm sad? The win condition for the Turing test should be 50/50, not 75/25... that indicates the human is now failing the Turing test against this model just as badly as ELIZA and 4o do against us?!