Not sure where the comment I meant to reply to ended up, but I'll just add this here.
> I suspect the path to general intelligence is not that, but we'll see.
I think there are three things a 'true' general intelligence has that are missing from the basic LLMs we have now.
1. knowing what you know. <basic-LLMs are here>
2. knowing what you don't know but can figure out via tools/exploration. <this is tool use/function calling; see the sketch after this list>
3. knowing what can't be known. <this is knowing that the halting problem exists and being able to recognize it in novel situations>
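To make (2) concrete, here's roughly what the tool-use interface looks like from the model's side today. This is just a sketch in Python; the schema shape is loosely modeled on common LLM function-calling APIs, and all the names are made up rather than any particular vendor's API:

```python
# What the model actually sees for point (2): a name, a description, and a
# parameter schema. The body of the tool is a black box to it.
# (Illustrative only; shaped like typical LLM function-calling schemas.)
long_divide_tool = {
    "name": "long_divide",
    "description": "Divide two integers, returning quotient and remainder.",
    "parameters": {
        "type": "object",
        "properties": {
            "dividend": {"type": "integer"},
            "divisor": {"type": "integer"},
        },
        "required": ["dividend", "divisor"],
    },
}

def call_tool(name: str, args: dict) -> dict:
    # The runtime executes the call; the model only ever sees the result.
    if name == "long_divide":
        q, r = divmod(args["dividend"], args["divisor"])
        return {"quotient": q, "remainder": r}
    raise ValueError(f"unknown tool: {name}")
```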
(1) From an LLM's perspective, once trained on a corpus of text, it knows 'everything'. It knows about the concept of not knowing something (insofar as an LLM knows anything, from having seen text about it), but it doesn't actually have a growable map of knowledge that it knows has uncharted edges.
This is where (2) comes in, and it's what tool use/function calling tries to solve at the moment, but the way function calling works now doesn't give the LLM knowledge the right way. I know that I don't know what 3,943,034 / 234,893 is. But I know I have a 'function call' for it: the algorithm for doing long division on paper.

And I think there's another subtle point here: my knowledge in (1) includes the training data generated from running the intermediate steps of the long-division algorithm. That's the knowledge that later generalizes to being able to use a calculator (and it's also why we don't just give kids calculators in elementary school). It's also why a kid who knows how to do long division on paper doesn't separately need to learn when/how to use a calculator, beyond the very basics. Using a calculator to do that math feels like one step, but it still has all of the initial mechanical steps of setting the problem up on paper: you have to type in each digit individually, etc.
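To put numbers on that example (my own illustration, not anything an LLM actually runs): the paper algorithm is a trace of small intermediate steps, while the 'calculator' call collapses all of them into one opaque result.

```python
def long_division_steps(dividend: int, divisor: int):
    """Digit-by-digit long division, recording the intermediate steps
    you'd write out on paper."""
    steps, remainder, quotient_digits = [], 0, []
    for digit in str(dividend):
        remainder = remainder * 10 + int(digit)  # bring down the next digit
        q = remainder // divisor                 # how many times does it go in?
        steps.append(f"bring down {digit}: {remainder} // {divisor} = {q}, "
                     f"subtract {q * divisor}")
        quotient_digits.append(str(q))
        remainder -= q * divisor
    return int("".join(quotient_digits)), remainder, steps

# The paper version is a trace of steps; the calculator version is one step.
q, r, trace = long_division_steps(3_943_034, 234_893)
assert (q, r) == divmod(3_943_034, 234_893)  # quotient 16, remainder 184,746
```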
(3) I'm less sure of this point now that I've written out points (1) and (2), but that's kind of exactly the thing I'm trying to get at: it's being able to recognize when you need more practice at (1), or more 'energy/capital' for doing (2).
Consider a burger restaurant. If you properly populated the context of a ChatGPT-scale model with the data for a burger restaurant from 1950, and gave it the kind of 'function calling' we're plugging into LLMs now, it could manage it. It could keep track of inventory, keep tabs on the employee-subprocesses, and know when to hire, fire, or get new suppliers, all via function calling. But it would never try to become McDonald's, because it would have no model of the internals of those function calls, and no ability to investigate or modify their behaviour.
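For concreteness, a hypothetical sketch of that tool surface (every name here is invented for illustration). The model can call these and run the place, but the bodies are opaque to it; it can't open them up and redesign how supply, hiring, or the kitchen work, which is what becoming McDonald's would take.

```python
# Hypothetical 1950 burger-restaurant tools. The model sees only the
# signatures and descriptions; the implementations are a black box to it.
inventory = {"patties": 200, "buns": 180}

def check_inventory(item: str) -> int:
    """How many of `item` are on hand."""
    return inventory.get(item, 0)

def place_supplier_order(item: str, quantity: int) -> str:
    """Order more stock from the (opaque, unmodifiable) supplier process."""
    inventory[item] = inventory.get(item, 0) + quantity
    return f"ordered {quantity} x {item}"

def hire_employee(role: str) -> str:
    """Kick off the (equally opaque) hiring subprocess."""
    return f"hired a new {role}"
```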