Week 5, Day 1 (Monday, February 3)

Discussion of answers to Friday’s in-class exercise

Walking through the table we used on Friday.

Read latencies observed in a BigTable service benchmark

Source: Table 2, p. 78 of The Tail at Scale, Copyright ACM 2013.
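When reading a latency table like this one, it helps to recall how a tail percentile is computed and why it can differ so much from the median. The following is a minimal sketch using the nearest-rank definition. The function name `percentile` and the sample values are made up for illustration and are not taken from the paper's table.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Rank is 1-based, so subtract 1 to index into the sorted list.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Synthetic latencies in milliseconds (NOT measured data):
# 98 fast requests, one slow request, one very slow request.
latencies_ms = [10] * 98 + [50, 900]

print("median:", percentile(latencies_ms, 50))
print("99th percentile:", percentile(latencies_ms, 99))
print("maximum:", percentile(latencies_ms, 100))
```

With these synthetic values the median says nothing about the two slow requests; only the high percentiles reveal them, which is why the table reports tail percentiles rather than averages.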

Design options for distributed systems (from There Is No Getting Around It: You Are Building a Distributed System)

The categories of analysis:

The standard design questions for any system also apply (versioning, upgrades, …).

Automating failover (from There Is No Getting Around It: You Are Building a Distributed System)

Many systems have a “leader” instance that assigns work to the other instances.

What happens when the “leader” fails? Do you bring up a new leader automatically or have the operations staff do it?
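To make the automated option concrete, here is a minimal sketch of one common approach: a monitor tracks heartbeats and promotes a replacement when the leader's heartbeat is too old. This is an illustration under stated assumptions, not the method described in the article. The names `Instance`, `is_alive`, and `choose_leader`, the timeout value, and the "lowest name wins" promotion rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    last_heartbeat: float  # time of last heartbeat, in seconds on the monitor's clock


def is_alive(instance, now, timeout):
    """Treat an instance as alive if it sent a heartbeat within `timeout` seconds."""
    return now - instance.last_heartbeat <= timeout


def choose_leader(current_leader, instances, now, timeout):
    """Return the leader's name after one monitoring pass.

    Keeps the current leader if it still looks alive; otherwise promotes
    the live instance with the lowest name (an arbitrary, assumed rule).
    Returns None if no instance looks alive.
    """
    by_name = {inst.name: inst for inst in instances}
    leader = by_name.get(current_leader)
    if leader is not None and is_alive(leader, now, timeout):
        return current_leader
    candidates = sorted(inst.name for inst in instances if is_alive(inst, now, timeout))
    return candidates[0] if candidates else None


# Hypothetical cluster: "a" last reported at t=0, "b" at t=9, "c" at t=10.
cluster = [Instance("a", 0.0), Instance("b", 9.0), Instance("c", 10.0)]

# At t=10 with a 3-second timeout, leader "a" looks dead, so "b" is promoted.
print(choose_leader("a", cluster, now=10.0, timeout=3.0))
```

Note the weakness of this sketch: a missed heartbeat cannot distinguish a crashed leader from one that is merely slow or cut off by a network partition. Promoting automatically in that case can leave two instances both acting as leader, which is one reason some teams leave the decision to operations staff.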

Guide to readings for next class

Carried over from Friday.

Read There Is No Getting Around It: You Are Building a Distributed System, from Platform Components (p. 68) up to and including Platform Usage Collection (p. 69).

Two key points from these sections:

  1. Many components of these systems are neither glamorous nor “complicated” but are nonetheless necessary for the system.
  2. How must these components be designed so that they scale?