Logging (Week 12, Monday, March 31)

Logging defined

Basic idea

From Python 3 logging tutorial:

    logging.debug('This message should go to the log file')
    logging.info('So should this')
    logging.warning('And this, too')

    2010-12-12 11:41:42,612 DEBUG:root:This message should go to the log file
    2010-12-12 11:41:43,015 INFO:root:So should this
    2010-12-12 11:42:35,756 WARNING:root:And this, too

  1. When do you record it?
  2. What information do you record?
  3. Do you have levels (DEBUG, INFO, WARNING)?
  4. When do you turn levels on and off?
  5. How do you analyze the logs?
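Questions 3 and 4 can be made concrete with a minimal sketch of turning levels on and off at run time. The logger name `levels_demo` and the in-memory stream are assumptions made so the example is self-contained; a real program would more likely log to a file.

```python
import io
import logging

# Hypothetical setup: an in-memory stream stands in for the log file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s:%(name)s:%(message)s")
)

logger = logging.getLogger("levels_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False  # keep the demo output away from the root logger

logger.setLevel(logging.WARNING)  # DEBUG and INFO are switched off
logger.debug("not recorded")
logger.info("not recorded either")
logger.warning("recorded")

logger.setLevel(logging.DEBUG)  # switch everything on, e.g. while investigating
logger.debug("now recorded")

lines = stream.getvalue().splitlines()
print("\n".join(lines))
```

Only the warning and the final debug message reach the stream, because the level check happens when each call is made.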

When do you use it?

From the Python 3 logging tutorial:

When to use logging
Task you want to perform | The best tool
Display output for ordinary use of a command-line program | print()
Report events from normal operation | logging.info() or logging.debug()
Issue a warning regarding an event | logging.warning() if there is nothing the application can do
Report an error from a specific event | Raise an exception
Report suppression of an error in a long-running process | logging.error(), logging.exception(), or logging.critical(), as appropriate
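As a rough illustration of several rows of the table in one place, the sketch below raises an exception for a specific error, logs the suppressed error inside a long-running loop, and uses `print()` only for ordinary output. The function names, the logger name `resize_demo`, and the `"100x100"` size format are all hypothetical.

```python
import logging

logger = logging.getLogger("resize_demo")  # hypothetical logger name


def parse_size(text):
    """Report an error from a specific event by raising an exception."""
    # Unpacking fails with ValueError if there is no "x" separator,
    # and int() fails with ValueError if a part is not a number.
    width, height = text.split("x")
    return int(width), int(height)


def handle_requests(requests):
    """Long-running loop: log the suppressed error and keep going."""
    results = []
    for text in requests:
        try:
            results.append(parse_size(text))
        except ValueError:
            # logging.exception() records the message plus the traceback.
            logger.exception("Skipping malformed size %r", text)
    # Event from normal operation.
    logger.info("Handled %d of %d requests", len(results), len(requests))
    return results


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Ordinary command-line output goes through print().
    print(handle_requests(["100x100", "bogus", "640x480"]))
```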

Importance of logging

From 20 Obstacles to Scalability, p. 58:

Number 7: Insufficient monitoring and metrics

Number 10: Insufficient logging

From On Designing and Deploying Internet-scale Services, pp. 231–232:

Log everything all the time

From Log Everything All the Time.

For highly-available applications

    2014-02-12 11:41:42,612 root:QX3567187:Resize from S. Lee of 'my-vacation-july-24-444.jpg' started
    2014-02-12 11:41:43,015 root:QX3567187:Resize saved in S3 entry 'lee-mvj24-3617846.jpg'
    2014-02-12 11:41:43,212 root:QX3567187:Resize sent to instance EC2-Q347HN for 100 by 100 resize
    2014-02-12 11:42:35,756 root:QX3567187:Resize completed by EC2-Q347HN
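One possible way to stamp every entry with a request ID, as in the entries above, is the standard library's `logging.LoggerAdapter`. This is only a sketch: the logger name `request_demo`, the helper `resize_log`, and the field name `request_id` are assumptions, and the system that produced the sample entries may have worked quite differently.

```python
import io
import logging

# In-memory stream so the sketch is self-contained (an assumption).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s:%(request_id)s:%(message)s")
)

base = logging.getLogger("request_demo")  # hypothetical logger name
base.addHandler(handler)
base.setLevel(logging.INFO)
base.propagate = False


def resize_log(request_id):
    """Return a logger that attaches request_id to every record it emits."""
    return logging.LoggerAdapter(base, {"request_id": request_id})


log = resize_log("QX3567187")
log.info("Resize started")
log.info("Resize completed")

entries = stream.getvalue().splitlines()
print("\n".join(entries))
```

Because the ID travels with the adapter, code that handles the request never has to remember to format it into each message.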

Keeping it efficient

Set up a fast queue between the high-priority worker process and the low-priority logging process
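A minimal sketch of the queue idea, using `QueueHandler` and `QueueListener` from Python's `logging.handlers`. Note the substitution: the listener here runs in a background thread of the same process rather than in a separate low-priority process, so this only approximates the setup described above. The logger name `queue_demo` and the in-memory "slow" handler are assumptions.

```python
import io
import logging
import logging.handlers
import queue

log_queue = queue.Queue()  # in-memory queue between worker and logging side

# Stand-in for a slow destination such as a file or network handler.
stream = io.StringIO()
slow_handler = logging.StreamHandler(stream)
slow_handler.setFormatter(
    logging.Formatter("%(levelname)s:%(name)s:%(message)s")
)

# The listener thread takes records off the queue and does the slow work.
listener = logging.handlers.QueueListener(log_queue, slow_handler)
listener.start()

# The worker only pays the cost of putting a record on the queue.
worker_log = logging.getLogger("queue_demo")  # hypothetical logger name
worker_log.addHandler(logging.handlers.QueueHandler(log_queue))
worker_log.setLevel(logging.INFO)
worker_log.propagate = False

for i in range(3):
    worker_log.info("resize %d done", i)

listener.stop()  # processes records still on the queue, then joins the thread
written = stream.getvalue().splitlines()
print("\n".join(written))
```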

Any object should be easily dumped to the log
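In Python, one inexpensive way to make an object easy to dump is to give it a useful `repr`, which a dataclass generates automatically. The `ResizeJob` class and its fields are hypothetical, loosely modeled on the resize entries earlier in these notes.

```python
from dataclasses import dataclass
import logging


@dataclass
class ResizeJob:
    """Hypothetical job record; the dataclass supplies __repr__ for free."""
    request_id: str
    source: str
    width: int
    height: int


job = ResizeJob("QX3567187", "my-vacation-july-24-444.jpg", 100, 100)

# %r asks the logging module to format the object with repr().
logging.getLogger("dump_demo").warning("Starting %r", job)

dumped = repr(job)
print(dumped)
```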

Analyzing the logs

Products such as loggly integrate logs from multiple sources and analyze them.
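To give a feel for what such analysis involves at the smallest scale, the sketch below groups the resize entries shown earlier by request ID and computes the elapsed time per request. It does not use Loggly or any product API; the function name `elapsed_by_request` is hypothetical, and the parsing assumes every line follows the `timestamp logger:request_id:message` layout of those entries.

```python
from collections import defaultdict
from datetime import datetime

LOG = """\
2014-02-12 11:41:42,612 root:QX3567187:Resize from S. Lee of 'my-vacation-july-24-444.jpg' started
2014-02-12 11:41:43,015 root:QX3567187:Resize saved in S3 entry 'lee-mvj24-3617846.jpg'
2014-02-12 11:41:43,212 root:QX3567187:Resize sent to instance EC2-Q347HN for 100 by 100 resize
2014-02-12 11:42:35,756 root:QX3567187:Resize completed by EC2-Q347HN
"""


def elapsed_by_request(text):
    """Map each request ID to seconds between its first and last entry."""
    times = defaultdict(list)
    for line in text.splitlines():
        # The timestamp occupies the first 23 characters, then one space.
        stamp, rest = line[:23], line[24:]
        _logger, request_id, _message = rest.split(":", 2)
        times[request_id].append(
            datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        )
    return {
        rid: (max(stamps) - min(stamps)).total_seconds()
        for rid, stamps in times.items()
    }


durations = elapsed_by_request(LOG)
print(durations)
```

This is why a consistent request ID in every entry matters: without it, entries from different sources cannot be joined.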

Guide to reading for next class

Read the following two short sections from F1: A Distributed Database that Scales:

  1. Section 1: Introduction (pp. 1068–1069, not including “2. Basic Architecture”).

  2. Section 10: Latency and Throughput (p. 1078, not including “11. Related Work”).

Key points: Most of the paper is concerned with database topics that are outside the scope of this course. However, the two sections I selected respond to two themes of the course: