I did some experimenting with this a little while back and was disappointed in how poorly LLMs played games.
I made some AI tools (https://github.com/DougHaber/lair) and added a tmux tool so that LLMs could interact with terminals. First, I tried Nethack. As expected, the model isn't good at understanding text "screenshots", and it failed miserably.
https://x.com/LeshyLabs/status/1895842345376944454
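For anyone curious, the tmux tool boils down to reading the pane contents and typing commands back in. Here's a rough sketch of that loop (not the actual lair implementation; the session name and ask_llm are placeholders, assuming plain subprocess calls to tmux):

    # Minimal sketch: read the terminal with capture-pane, ask the model
    # for a command, and type it back in with send-keys.
    import subprocess

    SESSION = "game"  # hypothetical tmux session running the game

    def read_screen():
        # -p prints the pane contents to stdout
        return subprocess.run(
            ["tmux", "capture-pane", "-p", "-t", SESSION],
            capture_output=True, text=True, check=True,
        ).stdout

    def send_command(command):
        # type the model's move into the game, followed by Enter
        subprocess.run(
            ["tmux", "send-keys", "-t", SESSION, command, "Enter"],
            check=True,
        )

    while True:
        screen = read_screen()
        move = ask_llm(screen)  # placeholder for whatever chat API you use
        send_command(move)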
After that I tried a bunch of the "bsdgames" text games.
Here is a video of it playing a few minutes of Colossal Cave Adventure:
https://www.youtube.com/watch?v=7BMxkWUON70
With this, it could play, but not very well; it got confused a lot. I was using gpt-4o-mini. Smaller models I could run at home did much worse. It would be interesting to try one of the bigger state-of-the-art models to see how much that helps.
To give it something easier, I also had it hunt the Wumpus:
https://x.com/LeshyLabs/status/1896443294005317701
I didn't spend much time trying to improve this, so there may be low-hanging fruit just in providing better instructions and tuning what gets sent to the LLM. For these experiments, I was hoping I could simply hand it a terminal with a game running and have it play decently. We'll probably get there, but so far it's not that simple.