The hosting market today is highly competitive, offering everything from basic support to fully managed services. Whether you are hosting a website on a shared host, renting VMs or containers, or even leasing dedicated hardware, competitive pricing means almost every provider overprovisions unless they explicitly state otherwise, and those that don't are typically more expensive. A dedicated VM service costs considerably more than a regular VM service precisely because most other offerings rely on overprovisioning.
Overprovisioning is nothing new: it simply means allocating more resources than are physically available. Many hypervisors will not let a single VM be assigned more resources than the host physically has, but they will allow the total across multiple VMs to exceed it, for example provisioning 5GB of RAM to each of 2 VMs when the host only has 8GB of RAM.
It is important to understand how to provision resources, which resources are sensitive to overprovisioning, and which are sensitive to overuse. The 4 most common resources in provisioning are CPU, RAM, storage and devices (such as networking and NICs, be they virtual or physical). Sharing versus physical allocation also has its advantages and pitfalls, for example with SR-IOV for network cards.
Overprovisioning CPU is generally considered a bad idea, however I will argue the opposite. You can overprovision CPU as much as you want as long as the total average CPU usage stays within a defined percentage, chosen based on how responsive you want the system to be. From my testing, a 30% average is the maximum before a system becomes noticeably less responsive and takes longer to process requests. This also depends entirely on what you are running, but in general, as long as the CPU isn't fully pegged and nothing is fighting for CPU time, you can assign as many vCPUs or as much CPU time as you like to VMs, applications and the like. In the past, schedulers assigned static priorities; today CPU time and credits are used instead, where an application earns or spends credits based on its usage, and those credits decide which application gets priority when it needs more CPU, which can lead to a better overall experience in a shared environment. Provision as much CPU as needed in a shared environment, monitor the average, and keep it within a set threshold. If the threshold is breached, one option is to move the heaviest CPU application to another host whose average is below the threshold.
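The threshold rule above can be sketched as a few lines of Python. The 30% figure comes from the testing described in the text; the function names and sample windows are illustrative, not a real hypervisor API.

```python
# Sketch: flag a host whose average CPU usage breaches the responsiveness
# threshold, regardless of how many vCPUs have been provisioned.

def average_cpu(samples):
    """Average CPU utilisation (0-100) over a window of samples."""
    return sum(samples) / len(samples)

def needs_migration(samples, threshold=30.0):
    """True when the host's average breaches the responsiveness threshold."""
    return average_cpu(samples) > threshold

# A host overprovisioned 4:1 on vCPUs can still be fine if the average is low.
quiet_host = [12, 18, 25, 20, 15]     # average 18% -> responsive
busy_host = [35, 42, 28, 38, 40]      # average 36.6% -> move a heavy VM away
print(needs_migration(quiet_host))    # False
print(needs_migration(busy_host))     # True
```

In practice the samples would come from host monitoring; the point is that the decision is driven by measured average usage, not by the vCPU count.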
RAM is the 1 component you must not overprovision. Most modern OSes follow the good practice of using all available RAM: whatever is free gets used as cache. The host, however, sees that cache as actual RAM in use by the VM, so it contributes to total host RAM usage. This is why, whether you run VMs or containers, you should only assign enough RAM for the workload, since the host already does its own caching. Many modern virtualisation deployments use virtio for drives, which lets the host act as a disk cache using host RAM, helping performance, while virtio itself eliminates most of the storage protocol overhead and storage handling, especially when combined with LVM. If too much RAM is provisioned, host RAM fills up and host swap gets used. In some cases this may be acceptable: a newer EPYC with all of its PCIe lanes filled with the latest DRAM-cached SLC/MLC SSDs can have more aggregate bandwidth than RAM, at the cost of the drives wearing out faster. Good practice, however, is to avoid swap entirely to prevent the significant performance degradation of swapping to slower storage. I am aware that ballooning exists, but from the many times I have tried it, it not only requires the guest OS to be aware of it, it effectively runs as an active application inside the guest that is supposed to release memory to applications when they need it and reclaim RAM that is only being used for cache. While sharing RAM between VMs this way may sound like a good idea, I find the implementation buggy and prone to crashing the guest VM, so I do not consider it a valid factor when provisioning RAM.
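The rule can be reduced to simple accounting: since guests will eventually touch all of their RAM (cache counts as "in use" from the host's point of view), the sum of guest allocations plus host overhead must fit in physical RAM. A minimal sketch, with illustrative numbers and names rather than any real hypervisor's API:

```python
# Sketch: verify RAM is never overprovisioned. Unlike CPU, the full
# allocation will genuinely be consumed, so the totals must add up.

def ram_fits(vm_allocations_gb, host_ram_gb, host_overhead_gb=1.0):
    """True when provisioning cannot push the host into swap."""
    return sum(vm_allocations_gb) + host_overhead_gb <= host_ram_gb

print(ram_fits([5, 5], host_ram_gb=8))   # False: the 2x5GB-on-8GB case
print(ram_fits([3, 3], host_ram_gb=8))   # True: 6GB + 1GB overhead fits
```

The 1GB overhead figure is an assumption; the real value depends on the hypervisor and host services.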
Storage can be overprovisioned. In the past, virtualisation typically used a single large file to represent each VM's drive, but with LVM the VM gets a raw block device handled directly by the storage driver, which is said to be the fastest approach as there is no filesystem layer to deal with. You can allocate more space than you physically have, but if the VMs then write more data than there is space, which can happen with LVM, you will lose or corrupt data, the most severe consequence. You cannot over-store data; you can only allocate more space than actually exists, which is a common practice.
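The distinction between allocating and writing can be sketched as thin-pool accounting, modelled loosely on an LVM thin pool (the names and the 80% warning threshold are illustrative assumptions). Allocating more virtual space than the pool holds is routine; actually writing past the pool's physical size is what corrupts data, so written data is what must be watched.

```python
# Sketch: thin-provisioning accounting. Virtual allocations can exceed the
# pool; physical writes must not.

def pool_fill_percent(written_gb, pool_gb):
    """How full the physical pool is, regardless of virtual allocations."""
    return 100.0 * written_gb / pool_gb

def pool_at_risk(written_gb, pool_gb, warn_percent=80.0):
    """True when written data approaches the pool's physical limit."""
    return pool_fill_percent(written_gb, pool_gb) >= warn_percent

# 160GB of virtual disks on a 100GB pool is a normal 1.6x overcommit;
# the danger only appears when real writes approach 100GB.
print(pool_at_risk(written_gb=40, pool_gb=100))   # False: plenty of headroom
print(pool_at_risk(written_gb=85, pool_gb=100))   # True: add space or migrate
```

On a real deployment this would be driven by the pool's reported fill level rather than hand-tracked numbers, with alerting well before 100%.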
Shared devices can't be overprovisioned; they can only be assigned in a fixed manner (such as with SR-IOV) or shared. Sharing puts the burden on the host, while SR-IOV puts the burden on the PCIe device itself, which can reduce CPU usage on the host and allows high-performance, direct application-to-hardware interaction. While directly provisioning devices can be beneficial, the provisioning is limited: only 1 VM can be assigned 1 device, although SR-IOV makes a single card appear as many physical network cards. Direct assignment also allows for different or specific setups, such as custom networking and traffic shaping. If the device is shared (like a network card), even with SR-IOV it can be overused: once the card is fully utilised, whether on the bus or on the interface, all traffic using the device experiences performance degradation as it waits its turn.
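The overuse case can be sketched the same way: SR-IOV virtual functions share one physical link, so it is the aggregate traffic that saturates it. The 10Gb/s link speed and the function name are illustrative assumptions.

```python
# Sketch: VFs look like independent cards to their VMs, but they all share
# one physical pipe, so saturation is a property of the sum.

def link_saturated(vf_throughputs_gbps, link_gbps=10.0):
    """True when combined VF traffic has hit the physical link's capacity."""
    return sum(vf_throughputs_gbps) >= link_gbps

# Four VFs can each legitimately be "given" the full card, but together
# they still only have one 10Gb/s link.
print(link_saturated([2.0, 1.5, 3.0, 2.0]))   # False: 8.5 of 10 Gb/s used
print(link_saturated([4.0, 3.5, 3.0, 2.0]))   # True: everyone now waits
```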
In summary: an overprovisioned CPU has no effect, while an overused CPU slows down all other applications. Overprovisioned RAM slows down everything running, because overprovisioning RAM means overusing it. Overprovisioned storage has no effect on anything, but overused storage causes data loss or corruption (the worst consequence). You cannot overprovision devices like network cards, as they have a limited number of PCIe addresses, so their provisioning is fixed; but you can overuse a shared, fixed-provision device, which only makes it a bottleneck (such as slow network performance) rather than breaking the host, application or VM, the cost being a much higher iowait.