“It’s better to have infinite scalability and not need it, than to need infinite scalability and not have it.” (Andrew Clay Shafer)
When server load increases and it is time to add hardware resources to your infrastructure, you have two options: scale vertically or scale horizontally.
Vertical scalability is the more intuitive approach: you add resources to a single server (more RAM, a faster CPU…). It is simple, but hardware cost rises steeply as you move toward higher-end machines.
This solution also has limits. You can probably multiply your server’s capacity by 2, maybe even by 10, but not by 100.
In addition, vertical scaling usually requires downtime while the new resources are being added.
Even so, it remains the most pragmatic solution in most cases, particularly for applications that have been in production for many years.
Horizontal scalability means adding new servers that perform the same type of task.
This lets you rely on standard servers built from commodity hardware, but the software implications quickly become significant.
The application must be stateless, which means that its data is stored in another service (your database, a distributed cache…). This is the ideal, and you should design your application with this goal in mind.
Because of this, if the hardware behind one instance fails, the other instances can absorb the load with minimal service interruption for the user.
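To make the stateless idea concrete, here is a minimal sketch in Python. It assumes a hypothetical `SharedStore` class standing in for the external service (a real deployment would use a database or a distributed cache), and a hypothetical `AppInstance` class standing in for one application server. None of these names come from a real library.

```python
class SharedStore:
    """Hypothetical stand-in for an external service (database, distributed cache)."""

    def __init__(self):
        self._data = {}

    def increment(self, key):
        # Update the counter in the shared store and return its new value.
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


class AppInstance:
    """Hypothetical stateless application server: it keeps no user data itself."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # all state lives in the external store

    def handle_visit(self, user_id):
        count = self.store.increment(f"visits:{user_id}")
        return f"{self.name}: user {user_id} visit #{count}"


store = SharedStore()
instance_a = AppInstance("instance-a", store)
instance_b = AppInstance("instance-b", store)

first = instance_a.handle_visit("alice")
# Suppose instance-a fails here: instance-b serves the next request
# and still sees the same data, because nothing was kept on instance-a.
second = instance_b.handle_visit("alice")

print(first)
print(second)
```

Because neither instance holds the visit counter in its own memory, any instance can serve any request, which is what allows the remaining servers to take over after a failure.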
To summarize, vertical scalability relies on a single machine that offers plenty of room for expansion, one that can take a large amount of memory, many processors, several motherboards and many hard drives.
Horizontal scalability, on the other hand, relies on adding machines to cope with increased demand for a service. The most common method is to distribute the load across a cluster of servers.
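Distributing load across a cluster can be sketched with a simple round-robin policy, one of several possible strategies. The example below is an illustration only: the `RoundRobinBalancer` class and the server names `web-1` to `web-3` are hypothetical, and real load balancers also perform health checks, weighting and connection handling.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer: hands out servers in turn, skipping unhealthy ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        # Stop routing requests to a failed server.
        self.healthy.discard(server)

    def next_server(self):
        # Try each server at most once per call before giving up.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy server available")


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])

# Four requests are spread evenly over the three servers.
assignments = [balancer.next_server() for _ in range(4)]

# One server fails; the remaining two absorb the traffic.
balancer.mark_down("web-2")
after_failure = [balancer.next_server() for _ in range(3)]

print(assignments)
print(after_failure)
```

Adding capacity in this model amounts to appending another server to the list, which is the core appeal of horizontal scaling.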