Scaling

#system design #programming #coding interview

Scaling a Web Service

Terminology

  • Client: web browsers or mobile applications.
  • Server: machines that host back-end code, not necessarily where data live.
  • Database: data storage that might be on a different machine than the server.

Types of Scaling

  • Vertical Scaling refers to adding more CPUs or RAM to your server.
    • It is hard since you cannot add unlimited CPUs to your server due to hardware or operating system limits.
    • If the server goes down, the service goes down.
  • Horizontal Scaling means adding more servers to your pool, just like adding pods into the Kubernetes cluster. ^e95948

Areas Where Scaling is Possible

Data

Database Choice

Relational database is good for most cases, but some use cases require non-relational databases:

  • Low latency
  • Unstructured data
  • Only operations are serialization or deserialization
  • Massive amount of data
Database Replication
  • Better performance and availability by having distributed nodes, allowing parallel processing.
  • Reliability since data is replicated.
Database Scaling

Scaling out aka [[]] by sharding. However, there are some issues that make sharding difficult:

  • Re-shading: due to failure or shard exhaustion, it might be required to re-shard data;
  • Join operations: join cross shards might be very difficult thus queries need to be de-normalized to query each shard separately;

Serving

Load-balancer
  • Splitting traffic
  • Scaling out [[]] or in by controlling server resources
Cache

Caching provides faster response for the requests for the same content. There are some considerations when to use cache:

  • Frequent reads but infrequent writes
  • Expiration policy: not too short and not too short
  • Consistency
  • Distributed cache tier to prevent single point failure
  • Eviction: Least Recent Used (LRU), Least Frequently Used (LFU) or First In First Out (FIFO)
CDN

Content Delivery Network, a special case of [[]] tier, which is usually geologically distributed. They are designed for either static or dynamic files. Considerations:

  • Cost
  • Expiration policy
  • Fallback when failure
  • Eviction or invalidation
Stateless Web Service
  • Stateful: remembering client data from one request to next
  • Stateless: no such information is required in the request, but is stored in a share storage accessible from all servers
Data Center

Data center are often used to facilitate geolocation-based service. It can host servers, databases, caches and [[]] can be used to route user requests to different data centers.

  • Redirection
  • Synchronization
  • Testing and deployment
Message Queue

Message queue can be used to handle asynchronous communication between users and servers.

Logging, Monitoring, Telemetry, and Automation

Good practices good continuous integration, deployment and debugging.