> model collapse happens when the training data no longer matches real-world data
This isn't a significant issue IMO: human-created content isn't "real-world" per se; it's a human-created world, an interpretation and representation of the real. The real world is the raw data perceived by sensors, human or machine. And while model-generated content doesn't match human-created content well, in the vast majority of cases it's still humans curating, modifying and publishing generated content based on how useful it is (there are of course spammers, etc., but that's a general issue). Humans do the same with content created by other humans.
So over time, generated content will become a sort of norm, adopted by and inevitably molding humans, just as human-created content does. Instead of model collapse, the two sources of content will converge over time, particularly as the ability to generate content directly from real-world data is developed and integrated into multi-modal models.
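The mechanics of this can be sketched with a toy recurrence. Suppose each self-training generation retains only a fraction R of the output distribution's variance (the usual collapse mechanism), while human curation mixes a fraction F of real-world-anchored content back in each generation. R, F, and the variance values here are illustrative assumptions, not measurements:

```python
# Toy variance recurrence for iterative retraining on generated content.
# Without curation, variance decays geometrically toward zero (collapse).
# With curation, mixing in a fraction F of real-world-anchored content
# each generation gives a stable nonzero fixed point instead.
R = 0.95       # variance retained per self-training generation (assumed)
F = 0.20       # fraction of curated/real-world content per generation (assumed)
REAL_VAR = 1.0 # variance of the real-world reference distribution

def iterate(generations: int, curated: bool) -> float:
    v = REAL_VAR
    for _ in range(generations):
        v *= R                            # self-training loses variance
        if curated:
            v = (1 - F) * v + F * REAL_VAR  # curation re-anchors to the real
    return v

print(f"no curation:   var after 100 gens = {iterate(100, False):.4f}")
print(f"with curation: var after 100 gens = {iterate(100, True):.4f}")
```

In this sketch the uncurated variance decays as R^t toward zero, while the curated run converges to the fixed point F / (1 - (1-F)R), i.e. the two content sources settle into a shared, stable distribution rather than collapsing.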