When the number of requests to an application increases, it can overload a server, which degrades system performance.
A single server has limited throughput and resources.
Consider an online marketplace like Amazon. During Black Friday or the Christmas season, it experiences an unusual surge in traffic. A single server can be overloaded within seconds, so the system needs to scale to handle the increased demand.
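Scaling can be done in two ways: vertically, by adding resources such as CPU and memory to an existing server, or horizontally, by adding more servers. Scaling horizontally requires a load balancer to spread traffic across those servers.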
A load balancer is a device or software component that distributes application traffic across a number of servers. By spreading the traffic, it reduces the burden on any single server and improves the overall performance of the system.
A load balancer sits between clients and servers. It routes clients’ requests across the servers so that no single server is overworked, since an overworked server could make the application unavailable and unreliable.
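To make the idea of distributing requests concrete, here is a minimal sketch of one common strategy, round robin, in which requests are handed to servers in a fixed rotating order. The class name `RoundRobinBalancer`, the `route` method, and the server names are hypothetical and chosen only for illustration; a real load balancer would also forward the request over the network and track server health, which this sketch leaves out.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Illustrative only: picks backend servers in a fixed rotating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list endlessly: a, b, c, a, b, c, ...
        self._rotation = cycle(list(servers))

    def route(self, request):
        # Round robin ignores the request's content and simply returns
        # the next server in the rotation.
        return next(self._rotation)


# Hypothetical server names; nine simulated requests.
balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.route(f"request-{i}") for i in range(9)]

# Count how many requests each server received.
load = Counter(assignments)
print(load)
```

With three servers and nine requests, each server ends up with three requests, so no single server carries the whole load. Round robin is only one possible policy; others weigh servers by capacity or by their current number of connections.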