Reconciling High Server Utilization and Sub-millisecond Quality-of-Service


Paper: http://csl.stanford.edu/~christos/publications/2014.mutilate.eurosys.pdf

In this paper, we analyze the challenges of maintaining high QoS for low-latency workloads when sharing servers with other workloads.

The additional workloads can interfere with resources such as processing cores, cache space, memory, or I/O bandwidth.

The goal of this work is to investigate if workload colocation and good quality-of-service for latency-critical services are fundamentally incompatible in modern systems, or if instead we can reconcile the two.

Colocation typically degrades QoS for three reasons:

queuing delay: increases in queuing delay due to interference on shared resources
scheduling delay: long scheduling delays when timesharing processor cores
load imbalance: poor tail latency due to thread load imbalance

Using memcached as an example, the paper analyzes in detail how each of these three factors affects the QoS of a latency-sensitive service under colocation.

1. Queuing delay

What: Queuing delay occurs due to coincident or rapid request arrivals. Interference from co-located workloads impacts queuing delay by increasing service time, thus decreasing service rate. Even if the co-located workload runs on separate processor cores, its footprint on shared caches, memory channels, and I/O channels slows down the service rate for the latency-critical workload.

How: The paper proposes that load be provisioned to services in an interference-aware manner that takes into account the reduction in throughput that a service might experience when deployed on servers with co-located workloads.
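To make the effect of a longer service time concrete, here is a minimal sketch that models a single server thread as an M/M/1 queue and computes how much load can be provisioned for a given slowdown. The M/M/1 model, the function names, and all numbers are illustrative assumptions, not taken from the paper, and the model covers mean latency only, not tail latency.

```python
def mean_latency(arrival_rate: float, service_time: float) -> float:
    """Mean time in system (queuing + service) for an M/M/1 queue."""
    service_rate = 1.0 / service_time
    if arrival_rate >= service_rate:
        return float("inf")  # overloaded: the queue grows without bound
    return 1.0 / (service_rate - arrival_rate)


def max_load(service_time: float, slowdown: float, latency_target: float) -> float:
    """Largest arrival rate that keeps mean latency within the target,
    given that interference inflates service time by `slowdown`."""
    effective_rate = 1.0 / (service_time * slowdown)
    # Solve 1 / (mu - lambda) <= target for lambda.
    return max(0.0, effective_rate - 1.0 / latency_target)


if __name__ == "__main__":
    base = 10e-6     # hypothetical 10 us service time per request
    target = 100e-6  # hypothetical 100 us mean-latency target
    for slowdown in (1.0, 1.2, 1.5):
        qps = max_load(base, slowdown, target)
        print(f"slowdown {slowdown:.1f}x -> provision at most {qps:,.0f} req/s")
```

In this toy model, a server whose requests slow down under interference has to be given proportionally less load to hold the same latency target, which is the intuition behind interference-aware provisioning.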

2. Scheduling delay

What: Scheduling delay has two main components:

scheduler wait time
context switch latency

The biggest problem with CFS, the Linux kernel's default scheduler, is: CFS’s wakeup placement algorithm allows sporadic tasks to induce long wait time on latency-sensitive tasks like memcached.

How: Fortunately, there are several strategies one can employ to mitigate this wait time for latency-sensitive services, including

adjusting task share values in CFS,
utilizing Linux’s POSIX real-time scheduling disciplines instead of CFS,
using a general-purpose scheduler with support for latency-sensitive tasks, like BVT, or
using CPU bandwidth limits to enforce fairness.
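As an illustration of the second strategy, the sketch below tries to move the calling process from CFS to the POSIX `SCHED_FIFO` real-time policy using Python's `os` module. It assumes Linux; the helper name `make_realtime` and the priority value are hypothetical, and the call usually fails without sufficient privileges, so the helper reports failure instead of raising.

```python
import os


def make_realtime(pid: int = 0, priority: int = 1) -> bool:
    """Try to switch `pid` (0 = the caller) to SCHED_FIFO.

    Returns True on success, False if the platform lacks the call or the
    process is not allowed to change its scheduling policy.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # scheduling API not available on this platform
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        return False  # typically a permission error for unprivileged users
    return True


if __name__ == "__main__":
    if make_realtime():
        print("now running under SCHED_FIFO")
    else:
        print("could not switch to SCHED_FIFO; still on the default policy")
```

A real-time task can starve ordinary tasks on the same core, so this is best treated as an experiment on a test machine rather than a default setting.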

3. Load imbalance

What: A latency-sensitive service’s vulnerability to load imbalance can be easily ascertained by purposefully putting it in a situation where threads are unbalanced.

How: One solution to this problem is particularly straightforward and effective: threads can be pinned explicitly to distinct cores, so that Linux can never migrate them on top of each other.
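Pinning can be sketched in a few lines. The example below starts one worker thread per available core and has each thread restrict itself to a single core with `os.sched_setaffinity`. It assumes Linux, where a pid of 0 applies the mask to the calling thread; the function names are hypothetical, and a real server such as memcached would do the equivalent in its own language.

```python
import os
import threading


def pinned_worker(cpu: int, results: dict) -> None:
    # With pid 0, Linux applies the mask to the calling thread only.
    os.sched_setaffinity(0, {cpu})
    results[cpu] = os.sched_getaffinity(0)
    # ... the request-serving loop would go here ...


def run_pinned_workers() -> dict:
    """Start one worker per available core, each pinned to its own core."""
    cpus = sorted(os.sched_getaffinity(0))
    results: dict = {}
    threads = [
        threading.Thread(target=pinned_worker, args=(cpu, results))
        for cpu in cpus
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for cpu, mask in sorted(run_pinned_workers().items()):
        print(f"worker for core {cpu} is restricted to {sorted(mask)}")
```

Because each worker has a one-core mask and no two workers share a core, the kernel has no freedom to stack two of them on the same core.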

