The right way to monitor "system calls" on Windows is with Detours, a flexible framework from Microsoft for hooking DLL calls. Hooking at the DLL level is the correct approach because Windows doesn't have a stable or documented syscall API, so everything goes via the DLLs, except for a few things you won't find in build systems, like game anticheats. Detours also lets you change the behavior of the calls.
The fun thing is that this doesn't even cover all of the problems involved. Studying Gradle is a good way to flesh out an understanding of why build systems are hard because it's the only one I'm aware of that tries to solve every problem simultaneously, which is probably why so many people use it even though it doesn't inspire much love.
So, build systems often try to provide:
• Package management and library dependency resolution. This is a huge pile of problems that people nonetheless really want solved, and Gradle has put a lot of work into it.
• Portability. Most build systems don't even bother and just pretend everything is UNIX. Others can theoretically work on Windows but don't work well in practice. Maven/Gradle builds, on the other hand, are almost always portable.
• User interface. A lot of projects (ab)use their build system as a general scripting system for misc tasks like driving deployments. This opens up complex UI issues. For example, it's common to have some tasks in a task graph that are meant to be invoked by the user and others that are purely 'internal'. Gradle tackles this, somewhat. Also: do you have IDE integration for writing your build scripts? Is your build language statically typed? Etc.
• Composition of build logic. JVM build systems make it easy to share build logic via plugins, and people do it a lot. That introduces new problems, because now your build system itself has dependencies that have to be versioned, downloaded and integrated, possibly with conflict resolution and handling of versions that aren't stable (i.e. SNAPSHOT versions that don't refer to a stable binary).
• Task skipping. Most build systems assume that if target X is rebuilt and target Y depends on X, then Y must also be rebuilt. But in a system that supports dynamic linking this is usually only true if the interface of X changed. So some build systems, like Gradle, can optimize builds by skipping downstream recompiles if the ABI of a module didn't change.
• Unit test execution and acceleration, e.g. can your build system do sharded test execution?
• Incremental configuration evaluation. Gradle has become really complex because it's trying to implement a kind of fine-grained dataflow system on top of imperative languages. It's doing that because some companies have build graphs so massive that it's very painful to re-compute the entire graph any time the build system loads or is modified, so they try to make everything as lazy as possible.
Lots of difficult design decisions lurk there.
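To make the task skipping point concrete, here is a rough sketch of the idea in Python. It is not how Gradle actually implements it. The `Module` type, the two hash functions and `modules_to_rebuild` are all made up for illustration, and it assumes a module's ABI can be read straight off a list of declared signatures (a real build system extracts it from compiled outputs).

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical module description, invented for this sketch."""
    name: str
    public_api: tuple   # exported signatures, i.e. the "ABI"
    body: str           # implementation details
    deps: tuple = ()    # names of modules this one depends on


def _digest(*parts: str) -> str:
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so ("ab", "c") != ("a", "bc")
    return h.hexdigest()


def abi_hash(m: Module) -> str:
    # Only the exported signatures count towards the ABI.
    return _digest(*sorted(m.public_api))


def full_hash(m: Module) -> str:
    # Any change to the module's own inputs changes this hash.
    return _digest(abi_hash(m), m.body, *m.deps)


def modules_to_rebuild(old: dict, new: dict) -> set:
    """Rebuild a module if its own inputs changed, or if the ABI
    (not merely the body) of one of its direct dependencies changed."""
    rebuild = set()
    for name, mod in new.items():
        prev = old.get(name)
        if prev is None or full_hash(prev) != full_hash(mod):
            rebuild.add(name)
            continue
        for dep in mod.deps:
            if dep not in old or dep not in new:
                rebuild.add(name)
                break
            if abi_hash(old[dep]) != abi_hash(new[dep]):
                rebuild.add(name)
                break
    return rebuild


core = Module("core", ("parse(s: str) -> int",), "v1")
app = Module("app", ("main() -> None",), "calls parse", ("core",))
old = {"core": core, "app": app}

# Implementation-only change in core: app is skipped.
body_only = {"core": Module("core", core.public_api, "v2"), "app": app}
print(sorted(modules_to_rebuild(old, body_only)))   # ['core']

# Signature change in core: app must be recompiled too.
new_api = core.public_api + ("parse_hex(s: str) -> int",)
api_change = {"core": Module("core", new_api, "v1"), "app": app}
print(sorted(modules_to_rebuild(old, api_change)))  # ['app', 'core']
```

A naive build system would rebuild `app` in both cases. The sketch also glosses over the case where recompiling a module against a changed dependency alters that module's own ABI, which a real implementation has to propagate further downstream.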