System Reliability

System reliability is the probability that a system will perform correctly over a specified period of time. A system is reliable when it meets its defined performance specifications and requires no repair during that period.

Hardware degrades over time, which reduces a system's reliability. Software reliability, on the other hand, is harder to measure: responses to client requests could slow down yet still be accurate.

A reliable system should continue working even when the software or hardware components fail. Any failing component should be replaced immediately with a healthy one to ensure the completion of a requested task.

For instance, one of the primary requirements of a large online store like Amazon is that a transaction should never be canceled because the node running it failed.

For example, if a user adds an item to a shopping cart and proceeds to payment, the system is expected not to lose the cart even if the server handling the transaction fails. A reliable system should be fault tolerant, i.e. it should detect failures and migrate the transaction to a redundant server for completion. A resilient system should also have no single point of failure.
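The detect-and-migrate behavior described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the nodes are plain callables, and the names `NodeFailure`, `run_with_failover`, `failing_node`, and `healthy_node` are hypothetical, not part of any real framework.

```python
class NodeFailure(Exception):
    """Raised by a (hypothetical) node that cannot complete a task."""


def run_with_failover(task, nodes):
    """Try each redundant node in order until one completes the task."""
    for node in nodes:
        try:
            return node(task)
        except NodeFailure:
            # Failure detected: fall through and migrate the task to the next node.
            continue
    raise RuntimeError(f"all {len(nodes)} nodes failed for task {task!r}")


def failing_node(task):
    # Stand-in for a server that has crashed.
    raise NodeFailure("node down")


def healthy_node(task):
    # Stand-in for a redundant server that is still running.
    return f"completed: {task}"


result = run_with_failover("checkout cart 42", [failing_node, healthy_node])
print(result)  # completed: checkout cart 42
```

The task only fails when every node in the list fails, which is why adding redundant nodes removes the single point of failure in this sketch. A production system would also need to make the task safe to retry, which is not shown here.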

A common way to measure reliability is Mean Time Between Failures (MTBF). MTBF is the average operating time between one system breakdown and the next; the higher the MTBF, the more reliable the system.

MTBF is calculated by taking the total time a system is running (uptime) and dividing it by the number of failures. For instance, if over a 100-hour period a system breaks down twice, once for 3 hours and once for 4 hours, its downtime totals 7 hours and the MTBF can be calculated as follows:

MTBF = (100 hours - 7 hours) / 2 breakdowns = 93 hours / 2 breakdowns = 46.5 hours
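The same calculation can be written as a small Python function. This is a minimal sketch: the function name `mtbf` and its parameters are chosen here for illustration, and it assumes each breakdown is recorded as a downtime in hours.

```python
def mtbf(total_hours, downtimes_hours):
    """Mean time between failures: uptime divided by the number of failures."""
    if not downtimes_hours:
        raise ValueError("MTBF is undefined when no failures were observed")
    uptime = total_hours - sum(downtimes_hours)
    return uptime / len(downtimes_hours)


# The example above: a 100-hour period with breakdowns of 3 and 4 hours.
print(mtbf(100, [3, 4]))  # 46.5
```

Raising an error when there are no failures avoids a division by zero; with no observed breakdowns there is simply not enough data to estimate MTBF.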
