Week 5, Day 1 (Monday, February 3)

Discussion of answers to Friday’s in-class exercise

Walking through the table we used on Friday.

Read Latencies observed in a BigTable service benchmark

Source: Table 2, p. 78 of The Tail at Scale, Copyright ACM 2013.
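
As a reminder of how to read a latency table of this kind, here is a minimal sketch of computing percentiles from raw samples. It assumes the table reports latencies at several percentiles and uses the nearest-rank definition; the function name and the sample values are made up for illustration and are not taken from the paper.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up latencies in milliseconds (illustration only): nine fast
# requests and one slow outlier.
example_latencies_ms = [10, 12, 11, 13, 12, 11, 10, 14, 12, 200]

median_ms = percentile(example_latencies_ms, 50)
p99_ms = percentile(example_latencies_ms, 99)
```

With these made-up samples the median stays near the typical request time while the 99th percentile is set entirely by the single outlier, which is why the higher percentiles are the ones to watch when studying tail latency.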

The categories of analysis:

The standard design questions for any system also apply (versioning, upgrades, …).

Many systems have a “leader” instance that assigns work to the other instances.

In our design for Assignment 2, there can be only one server.py, which assigns tasks to the worker.py instances.

What happens when the “leader” fails? Do you bring up a new leader automatically or have the operations staff do it?
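
One common way to notice that a leader has failed is a heartbeat timeout. The sketch below is only an illustration of that idea, not the Assignment 2 design: the class LeaderMonitor, the function choose_new_leader, and the "lowest id wins" promotion rule are all hypothetical names and choices introduced here.

```python
import time


class LeaderMonitor:
    """Tracks heartbeats from the leader and reports when it looks failed.

    Hypothetical helper for illustration; not part of Assignment 2.
    """

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so the logic can be tested
        self.last_heartbeat = clock()

    def record_heartbeat(self):
        """Call this whenever a heartbeat message arrives from the leader."""
        self.last_heartbeat = self.clock()

    def leader_suspected_failed(self):
        """True if no heartbeat has arrived within the timeout."""
        return self.clock() - self.last_heartbeat > self.timeout_s


def choose_new_leader(candidate_ids):
    """Assumed policy: promote the candidate with the lowest id.

    Returns None when there are no candidates, in which case the
    operations staff would have to intervene.
    """
    return min(candidate_ids) if candidate_ids else None
```

A timeout can only suspect a failure: a slow network looks the same as a dead leader. That uncertainty is one reason some teams prefer to have the operations staff confirm before a new leader is brought up.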

Carried over from Friday.

Read There Is No Getting Around It: You Are Building a Distributed System, from Platform Components (p. 68) up to and including Platform Usage Collection (p. 69).

Two key points from these sections:

Many components of these systems are neither glamorous nor “complicated”, but they are necessary for the system.
How do these components have to be designed to make them scalable?