Closest Pair of Points | O(nlogn) Implementation
We are given an array of n points in the plane, and the problem is to find out the closest pair of points in the array. This problem arises in a number of applications. For example, in air-traffic control, you may want to monitor planes that come too close together, since this may indicate a possible collision. Recall the following formula for distance between two points p and q.
We have discussed a divide and conquer solution for this problem. The time complexity of the implementation provided in the previous post is O(n (Logn)^2). In this post, we discuss an implementation with time complexity as O(nLogn).
Following is a recap of the algorithm discussed in the previous post.
1) We sort all points according to x coordinates.
2) Divide all points in two halves.
3) Recursively find the smallest distances in both subarrays.
4) Take the minimum of two smallest distances. Let the minimum be d.
5) Create an array strip[] that stores all points which are at most d distance away from the middle line dividing the two sets.
6) Find the smallest distance in strip[].
7) Return the minimum of d and the smallest distance calculated in above step 6.
The great thing about the above approach is, if the array strip[] is sorted according to y coordinate, then we can find the smallest distance in strip[] in O(n) time. In the implementation discussed in previous post, strip[] was explicitly sorted in every recursive call that made the time complexity O(n (Logn)^2), assuming that the sorting step takes O(nLogn) time.
In this post, we discuss an implementation where the time complexity is O(nLogn). The idea is to presort all points according to y coordinates. Let the sorted array be Py[]. When we make recursive calls, we need to divide points of Py[] also according to the vertical line. We can do that by simply processing every point and comparing its x coordinate with x coordinate of middle line.
本题是前一题的二分法的改进,前一个算法的效率是O(nlgnlgn)
前一遍二分法:http://blog.csdn.net/kenden23/article/details/20737523
为什么这个二分法是O(nlgnlgn)呢?除了使用公式计算之外,也可以这么考虑最坏情况,当距离中点的小于d距离,集中了所有的点,那么每次递归都需要对所有的点进行一次排序,效率是O(nlgn),那么每次二分都需要这么的效率,就要乘以做二分法的lgn效率,那么总效率就是O(nlgnlgn)。
呵呵,不是很正规的分析,不过我觉得这么分析更“聪明”,而且不会有错的。
那么为什么很多书上都说这个算法是O(nlgn)呢?因为大多书本都没有分析计算中间点最小距离的时候会出现最坏情况。
而Geeks这个网站考虑的非常仔细,这也是我也很推崇这个网站的原因之一。
本算法提高效率到O(nlgn),就是省去了中间需要重复排序的步骤。如果熟悉了前面的二分法算法,那么就很容易写出这个程序了。不明白的,先搞懂前面的算法,在来改进成这个算法。