IMHO there are some serious problems here: these checks won't apply to many situations, what they flag is not really "waste" in the way claimed, and acting on them will probably result in greater spend.
> Memory waste: request - actual usage [0]
Memory "requests" are hints to the kube-scheduler for placement, not a target for expected usage.
> # Memory over-provisioning: limit > 2x request [1]
Memory limits are for enforcement, typically deciding when the OOM killer gets invoked.
Neither placement hints nor OOM-kill limits should have anything to do with normal operating parameters.
> The memory request is mainly used during (Kubernetes) Pod scheduling. On a node that uses cgroups v2, the container runtime might use the memory request as a hint to set memory.min and memory.low [2]
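
To make the request/limit split concrete, here is a minimal sketch of the two separate decisions. It is a toy model, not the kube-scheduler's or kubelet's actual code: the `Container` class, the function names, and every number are made up for illustration.

```python
from dataclasses import dataclass
from typing import List

MIB = 1024 * 1024

@dataclass
class Container:
    # Hypothetical container record; all values are in bytes.
    name: str
    request: int  # placement hint
    limit: int    # enforced ceiling
    usage: int    # current usage

def fits_on_node(allocatable: int, placed: List[Container], new: Container) -> bool:
    """Placement compares summed requests to node capacity; usage and limits play no part."""
    return sum(c.request for c in placed) + new.request <= allocatable

def exceeds_limit(c: Container) -> bool:
    """Enforcement compares usage to the limit; the request plays no part."""
    return c.usage > c.limit

node_allocatable = 4096 * MIB
placed = [
    Container("api", 1024 * MIB, 3072 * MIB, 300 * MIB),
    Container("worker", 2048 * MIB, 4096 * MIB, 500 * MIB),
]
candidate = Container("cache", 1536 * MIB, 2048 * MIB, 0)

# Requests total 4608 MiB against 4096 MiB allocatable, so the pod does not fit,
# even though only 800 MiB is actually in use on the node.
print(fits_on_node(node_allocatable, placed, candidate))  # False
print([exceeds_limit(c) for c in placed])                 # [False, False]
```

Neither function looks at the gap between request and usage, which is the point: that gap is not something either mechanism treats as an operating target.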
By choosing to label the delta between request and actual usage as "waste" you will absolutely suffer from Goodhart's law: you will teach your dev team not just to request memory, but to allocate it and never free it, so that they fit inside this invalid metric's assumptions.
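
A rough worked example of how that plays out, using the "request - actual usage" formula from the quoted audit line. The `audit_waste` helper and all figures are hypothetical, chosen only to show the arithmetic.

```python
MIB = 1024 * 1024

def audit_waste(request: int, usage: int) -> int:
    """The quoted metric: request minus actual usage, in bytes."""
    return request - usage

request = 1024 * MIB
honest_usage = 300 * MIB
ballast = 724 * MIB  # hypothetical memory held only to satisfy the metric

# Honest workload: flagged as 724 MiB of "waste".
print(audit_waste(request, honest_usage) // MIB)            # 724

# Same workload hoarding memory: the metric reads 0,
# while real usage on the node rose from 300 MiB to 1024 MiB.
print(audit_waste(request, honest_usage + ballast) // MIB)  # 0
```

The second case scores perfectly on the audit while consuming more than three times the physical memory.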
It is going to work against the more reasonable goals: having developers set their limits as low as possible without negative effects, protecting the node and pod from memory leaks, and still gaining the advantages of over-provisioning, which is where the big gains are to be made.
[0] https://github.com/WozzHQ/wozz/blob/main/scripts/wozz-audit....
[1] https://github.com/WozzHQ/wozz/blob/main/scripts/wozz-audit....
[2] https://kubernetes.io/docs/concepts/configuration/manage-res...