I'm into VR and mixed reality, and I think this is headed toward making the Holodeck real in an immersive way. That's the concept of the Matrix too, and it's essentially what they're demoing here, just in 2D.
I'm guessing the main thing holding this back in terms of fidelity, consistency, and generalization is compute. But the new techniques here have dramatically lowered compute costs and improved generalization.
Maybe something like Cerebras's giant SRAM wafer-scale chips will deliver the next 10x in scale that smooths this out and pushes it closer to Star Trek. Or maybe some new paradigm like memristors.
But I'm looking forward to being able, within a few years, to put on some fairly comfortable mixed-reality glasses and simply ask for whatever or whoever I want to appear in my home (for example), on a whim.
Or train it on a lot of how-to videos, cooking for example, and it materializes someone showing you exactly what you need to do, right in your kitchen.
Here's another crazy idea: train on recordings of people interacting with productivity applications rather than games. Then, for a small business, you skip having the AI generate source code and just describe how the application should work. The data and program state live in a giant context window, and the application's functionality changes the instant you make a request.
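To make that concrete, here's a toy sketch of my own of what "the app is just a context window" could look like today with an LLM; `complete()` is a placeholder for whichever model API you'd actually call, and the invoicing spec is made up for illustration.

```python
import json

def complete(prompt: str) -> str:
    """Placeholder for an LLM completion call; swap in your provider's API."""
    raise NotImplementedError

# The "source code" of the app is just this natural-language spec.
APP_SPEC = """You are an invoicing app for a small business.
State is a JSON object with a list of invoices (id, customer, amount, paid).
Given the current state and a user request, reply with JSON only:
{"state": <updated state>, "display": "<what to show the user>"}"""

def handle_request(state: dict, request: str) -> tuple[dict, str]:
    # Spec, state, and request all live in one prompt; the model both
    # "executes" the app and returns the next program state.
    prompt = f"{APP_SPEC}\n\nCurrent state:\n{json.dumps(state)}\n\nUser request:\n{request}"
    reply = json.loads(complete(prompt))
    return reply["state"], reply["display"]

# "Reprogramming" the app is just another request, e.g.:
# state, shown = handle_request(state, "From now on, also track due dates on invoices.")
```

The point of the sketch is that there's no compiled application at all: the behavior is whatever the spec says right now, so changing the app is as cheap as editing a sentence.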