Given a set of *n* points *S* on the real line and a positive integer *k*, design an O() time algorithm to find a subset *P* of *S* with size *k*, such that the total distance from each point in *S* to the nearest point in *P* is minimized.

**Solution**:

We refer the *i*-th point in *S* as *S*[*i*] and use *d*(*x*, *y*) to denote the distance between two points *x* and *y*. The main idea is dynamic programming. First, sort the points in *S* in increasing order. Let *F*[*q*][*j*] be the optimal objective value for points *S*[*j*..*n*] choosing at most *q* points, *G*[*q*][*j*] be the optimal objective value for points *S*[*j*..*n*] choosing at most *q* points, provided that *S*[*j*] is chosen. Hence, we get a recurrence relation *G*[*q*][*j*] = *d*(*S*[*t*], *S*[*j*]) + *F*[*q*-1][*k*] and *F*[*q*][*j*] = *d*(*S*[*t*], *S*[*k*]) + *G*[*q*][*k*]. *F*[*p*][*1*] is the answer. By using this recurrence relation, we get an O() time algorithm.

In order to speed up, let *w*[*j*][*k*] = *d*(*S*[*t*], *S*[*j*]) and *u*[*j*][*k*] = *d*(*S*[*t*], *S*[*k*]). The values in *w* and *u* satisfy the quadrangle inequality, that is *w*[*i*][*k*] + *w*[*j*][*l*] ≤ *w*[*i*][*l*] + *w*[*j*][*k*] for all *i* ≤ *j* ≤ *k* ≤ *l*. Thus, we can speed-up the dynamic programming and get an O(*pn*) time algorithm[1].

If the points is on a tree, there is an O() time algorithm[2].

Note: p-median on plane is NP-hard. It also can be shown that even we can use points outside *S*, we cannot get a better solution.

**解法**：

我們用*S*[*i*]來表示*S*內的第*i*個點，用*d*(*x*, *y*)來表示*x*與*y*之間的距離。 解法是基於動態規劃。首先把*S*遞增排序。令*F*[*q*][*j*]表示子問題在*S*[*j*..*n*]的點中選最多*q*個點的最佳解之值。令*G*[*q*][*j*]表示子問題在*S*[*j*..*n*]的點中選最多*q*個點的最佳解之值且*S*[*j*]是被選的。 因此，我們得到一遞迴關係式*G*[*q*][*j*] = *d*(*S*[*t*], *S*[*j*]) + *F*[*q*-1][*k*] and *F*[*q*][*j*] = *d*(*S*[*t*], *S*[*k*]) + *G*[*q*][*k*]。*F*[*p*][*1*]即為所求。利用此遞迴關係式，總時間複雜度為O()。

有一個技巧可以加速，令*w*[*j*][*k*] = *d*(*S*[*t*], *S*[*j*])，再令*u*[*j*][*k*] = *d*(*S*[*t*], *S*[*k*])。*w*和*u*的值滿足四邊形不等式*w*[*i*][*k*] + *w*[*j*][*l*] ≤ *w*[*i*][*l*] + *w*[*j*][*k*]對於所有的*i* ≤ *j* ≤ *k* ≤ *l*。因此，我們可以加速動態規劃，得到一個 O(*pn*) 的演算法。

Reference:

[1] R. Hassin, A. Tamir, “Improved complexity bounds for location problems on the real line,” Operations Research Letters Volume 10, Issue 7, October 1991, Pages 395–402.

[2] Arie Tamir, “An O(pn2) algorithm for the p-median and related problems on tree graphs,” Operations Research Letters Volume 19, Issue 2, August 1996, Pages 59–64