p-median on a Real Line

Given a set of n points S on the real line and a positive integer k, design an O(p n^2 ) time algorithm to find a subset P of S with size k, such that the total distance from each point in S to the nearest point in P is minimized.

Solution:

We refer the i-th point in S as S[i] and use d(x, y) to denote the distance between two points x and y. The main idea is dynamic programming. First, sort the points in S in increasing order. Let F[q][j] be the optimal objective value for points S[j..n] choosing at most q points, G[q][j] be the optimal objective value for points S[j..n] choosing at most q points, provided that S[j] is chosen. Hence, we get a recurrence relation G[q][j] = \min_{j < k \leq n+1} \sum_{t=j}^{k-1} d(S[t], S[j]) + F[q-1][k] and F[q][j] = \min_{j \leq k \leq n} \sum_{t=j}^k d(S[t], S[k]) + G[q][k]. F[p][1] is the answer. By using this recurrence relation, we get an O(pn^2) time algorithm.

In order to speed up, let w[j][k] = \sum_{t=j}^k d(S[t], S[j]) and u[j][k] = \sum_{t=j}^k d(S[t], S[k]). The values in w and u satisfy the quadrangle inequality, that is w[i][k] + w[j][l] ≤ w[i][l] + w[j][k] for all i ≤ j ≤ k ≤ l. Thus, we can speed-up the dynamic programming and get an O(pn) time algorithm[1].

If the points is on a tree, there is an O(pn^2) time algorithm[2].

Note: p-median on plane is NP-hard. It also can be shown that even we can use points outside S, we cannot get a better solution.

解法

我們用S[i]來表示S內的第i個點,用d(xy)來表示xy之間的距離。 解法是基於動態規劃。首先把S遞增排序。令F[q][j]表示子問題在S[j..n]的點中選最多q個點的最佳解之值。令G[q][j]表示子問題在S[j..n]的點中選最多q個點的最佳解之值且S[j]是被選的。 因此,我們得到一遞迴關係式G[q][j] = \min_{j < k \leq n+1} \sum_{t=j}^{k-1} d(S[t], S[j]) + F[q-1][k] and F[q][j] = \min_{j \leq k \leq n} \sum_{t=j}^k d(S[t], S[k]) + G[q][k]。F[p][1]即為所求。利用此遞迴關係式,總時間複雜度為O(pn^2)。

有一個技巧可以加速,令w[j][k] = \sum_{t=j}^k d(S[t], S[j]),再令u[j][k] = \sum_{t=j}^k d(S[t], S[k])。wu的值滿足四邊形不等式w[i][k] + w[j][l] ≤ w[i][l] + w[j][k]對於所有的i ≤ j ≤ k ≤ l。因此,我們可以加速動態規劃,得到一個 O(pn) 的演算法。

Reference:

[1] R. Hassin, A. Tamir, “Improved complexity bounds for location problems on the real line,” Operations Research Letters Volume 10, Issue 7, October 1991, Pages 395–402.

[2] Arie Tamir, “An O(pn2) algorithm for the p-median and related problems on tree graphs,” Operations Research Letters Volume 19, Issue 2, August 1996, Pages 59–64

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s