Could this perform better if the agent were given Minecraft's internal game state instead of raw pixels?
It seems rather tenuous to keep hammering on 'training via pixels' when a game's 2D/3D output is ultimately just a rendering of underlying state, an optical trick at best.
I understand Sergey Brin et al. had a grandiose goal for DeepMind with their Atari games challenge, but why not try alternate methods, say, building or tweaking games to be RL-friendly (like MuJoCo, but for games)?
I don't see the pixel-based approach being as applicable to the practical real world as having the software expose its internal state directly to the agent, rather than rendering that same state into a significantly larger pixel buffer the agent then has to decode.
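To make that concrete, here's a minimal sketch of what an 'RL-friendly' game interface could look like, written against the Gymnasium API. The env and all its state fields (player_pos, health, inventory_logs) are hypothetical, purely for illustration; the point is just that the observation is a handful of floats and ints rather than a 64x64x3 image buffer:

    import numpy as np
    import gymnasium as gym
    from gymnasium import spaces

    class StateBasedMiningEnv(gym.Env):
        """Toy Minecraft-like env whose observations are internal game
        state, not pixels (hypothetical, for illustration only)."""

        def __init__(self):
            # Structured state: far smaller than a pixel observation like
            # spaces.Box(0, 255, shape=(64, 64, 3), dtype=np.uint8)
            self.observation_space = spaces.Dict({
                "player_pos": spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32),
                "health": spaces.Box(0.0, 20.0, shape=(1,), dtype=np.float32),
                "inventory_logs": spaces.Discrete(64),
            })
            self.action_space = spaces.Discrete(4)  # move N/S/E/W, say

        def reset(self, *, seed=None, options=None):
            super().reset(seed=seed)
            self._pos = np.zeros(3, dtype=np.float32)
            self._logs = 0
            return self._obs(), {}

        def step(self, action):
            # Dummy dynamics: wander around, occasionally "collect" a log.
            dirs = np.array([[0, 0, 1], [0, 0, -1], [1, 0, 0], [-1, 0, 0]],
                            dtype=np.float32)
            self._pos += dirs[action]
            if self.np_random.random() < 0.1 and self._logs < 63:
                self._logs += 1
            reward = float(self._logs)
            return self._obs(), reward, False, False, {}

        def _obs(self):
            return {
                "player_pos": self._pos.copy(),
                "health": np.array([20.0], dtype=np.float32),
                "inventory_logs": self._logs,
            }

The agent skips the whole render-then-reconstruct round trip; the game already knows where everything is, so why force the policy to re-infer it from pixels?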
That said, I understand Dreamer-like work is a great research area, and one that will certainly garner lots of citations.