# LeetCode: Median of Two Sorted Arrays

By | 2012 年 10 月 4 日

Updated Java version at 2018-07-14

## Question

Median of Two Sorted Arrays
There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

## Analyze

This is a hard question, when I met this question again after 4 years, it still takes much time to solve. But this time, with a more clear thought on it.

This question needs find 1 or 2 number(s) in the middle of 2 sorted array and has strict time limit on it. At the first thought, 2-way divide will be the solution. But how to apply it, is a big challenge.

First transform this question to another more general question will help a lot: find the k-th number in two sorted array. Then for this Leetcode question, k is (nums1.size + nums2.size) / 2 for odd total. For even total, need find k-th and (k+1)-th number.

Then there are too many options on this question: divide arrays by their own middle number (too complex go from here, have 4 parts, conditions to determine which part to drop is very hard with edge cases); Or divide k to k/2 and k-k/2, more simple, less edge cases, but need consider how to handle the situation when one array is shorter than k/2; for even total number, find n/2 and n/2 + 1 separately or at a time is another consideration.

At last, I worked out the solution by divide k and find k and k+1 at once (for even total numbers). For this solution, some facts are important:
Let’s say those 2 arrays are A and B. A has n numbers and B has m numbers.
So:
A = a0 , a1 , … , an-1
B = b0 , b1 , … , bm-1
And for simplicity, let’s only consider total number is odd, so `k = (n + m)/2` .

Then divide k as equally as we can to `ka = k/2` and `kb = k - k/2`, it’s apparent that `kb >= ka` (you can get this conclusion by check k from 0). Then if we use ka and kb to segment those two arrays we got:
a0 , a1 , … , aka-1 | aka , … , an-1
b0 , b1 , … , bkb-1 | bkb , … , bm-1

Then we can either drop a0 , a1 , … , aka-1 or b0 , b1 , … , bkb-1 from our search. Try my best to explain this idea as clear as possible in the following analyze:

From here, compare the aka-1 and bkb-1.
If aka-1 <= bkb-1 , then we can make sure the beginning of the array A is included in the first k elements of the whole sorted array.

Why? Since bkb-1 is equal or bigger than aka-1, then the tail of Array B and the tail of Array A are behind aka-1 for sure. The only possible items may be placed before aka-1 is the items in the front of Array B: b0 , b1 , … , bkb-2. But even in that case, there are only ka + kb – 1 < k numbers. There is not enough number can push aka-1 out of the first k numbers of the whole sorted array.

Then, with this conclusion that a0 , a1 , … , aka-1 are included in the first k elements of the whole sorted array, and we are looking for the k and maybe k+1 elements. So they are unrelated to our task, thus we can exclude them from our search. Then we can further solve the problem by finding (k-ka)-th element in array aka , … , an-1 and b0 , … , bm-1

And if in other case that aka-1 > bkb-1, we can exclude the beginning of B b0 , b1 , … , bkb-1 from our search. And we can find the (k-kb)-th element in the arrays a0 , … , an-1 and bkb , … , bm-1

In this manner we can find the target in O(log(m + n)) time limit.

And for the case that total numbers are even, when we found k-th element, just do another compare of the heads of array, then we can get the (k+1)-th number. It’s simple, but will confuse a lot if we take this into consideration of the whole process at the beginning.

And the edge cases are simple, won’t explain them here.

## Codes

### Java

```class Solution {
private int[] nums1;
private int[] nums2;
private int offset1;
private int offset2;
private int k;
private boolean plusOne;

public double findMedianSortedArrays(int[] nums1, int[] nums2) {
int totalNum = nums1.length + nums2.length;
if (totalNum == 0) {
System.out.println("Both arrays are empty");
return 0;
}
this.nums1 = nums1;
this.nums2 = nums2;
this.offset1 = 0;
this.offset2 = 0;
this.k = (totalNum + 1) / 2;
this.plusOne = totalNum % 2 == 0;

return findK();
}

private double pickSmallest() {
if (offset1 == nums1.length) {
return nums2[offset2++];
} else if (offset2 == nums2.length) {
return nums1[offset1++];
}
return nums1[offset1] <= nums2[offset2] ? nums1[offset1++] : nums2[offset2++];
}

private double findK() {
while (true) {
// System.out.println(String.format("offsets:%d/%d; k:%d", offset1, offset2, k));
int len1 = nums1.length - offset1;
int len2 = nums2.length - offset2;

if (len1 == 0) {
double result = nums2[offset2 + k-1];
if (plusOne) {
result = (result + nums2[offset2 + k]) / 2.0;
}
return result;
} else if (len2 == 0) {
double result = nums1[offset1 + k-1];
if (plusOne) {
result = (result + nums1[offset1 + k]) / 2.0;
}
return result;
} else if (k == 1) {
double result = pickSmallest();
if (plusOne) {
result = 0.5 * (result + pickSmallest());
}
return result;
}

int div1 = k / 2;
if (div1 > len1) {
div1 = len1;
}
int div2 = k - div1;
if (div2 > len2) {
div2 = len2;
div1 = k - div2;
}
// System.out.println(String.format("divs:%d/%d", div1, div2));

if (nums1[offset1 + div1 - 1] <= nums2[offset2 + div2 - 1]) {
offset1 += div1;
k -= div1;
} else {
offset2 += div2;
k -= div2;
}
}

}
}
```

### C++ version

```class Solution {
public:
inline int findFirstAndPop(vector<int>& nums1, vector<int>& nums2, int& s1, int& s2) {
if(nums2.size() <= s2 || nums1.size() > s1 && nums1[s1] <= nums2[s2]) {
return nums1[s1++];
} else {
return nums2[s2++];
}
}
double findK(int k, int kn, vector<int>& nums1, vector<int>& nums2, int s1 = 0, int s2 = 0) {
// find the short array
vector<int> *pShort, *pLong;
int sShort, sLong;
if (nums1.size() - s1 <= nums2.size() - s2) {
pShort = &nums1;
pLong = &nums2;
sShort = s1;
sLong = s2;
} else {
pShort = &nums2;
pLong = &nums1;
sShort = s2;
sLong = s1;
}
vector<int> &arrShort = *pShort;
vector<int> &arrLong = *pLong;
// edge case 1, empty short array
if (arrShort.size() == sShort) {
return (arrLong[k + sLong] + arrLong[kn + sLong]) / 2.0;
}
// edge case 2
if (k == 0) {
double sum = findFirstAndPop(arrShort, arrLong, sShort, sLong);
if (k == kn) {
return sum;
}
sum += findFirstAndPop(arrShort, arrLong, sShort, sLong);
return sum / 2.0;
}

// divide k
int ks = k / 2;
int kl = k - ks;
if (ks > arrShort.size() - sShort) {
ks = arrShort.size() - sShort;
kl = k - ks;
} else if (ks == 0) {
ks = 1;
}

// compare pivots
if (arrShort[sShort + ks - 1] <= arrLong[sLong + kl - 1]) {
// remove head of short array
return findK(k - ks, kn - ks, arrShort, arrLong, sShort + ks, sLong);
} else {
// remove the head of long array
return findK(k - kl, kn - kl, arrShort, arrLong, sShort, sLong + kl);
}
}
double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
int total = nums1.size() + nums2.size();
if (total == 0) return 0;
int k = total / 2;
if (total % 2) {
return findK(k, k, nums1, nums2);
} else {
return findK(k - 1, k, nums1, nums2);
}
}
};
```

### Ruby version

```# @param {Integer[]} nums1
# @param {Integer[]} nums2
# @return {Float}
def remove_first(nums1, nums2)
(nums2.empty? || nums1.size > 0 && nums1.first <= nums2.first) ? nums1.shift : nums2.shift
end

def find_kth(k, knext, nums1, nums2)
# got short and long arr
short_arr, long_arr = (nums1.size <= nums2.size ? [nums1, nums2] : [nums2, nums1])

# edge case 1, has an empty array
if short_arr.empty?
return (long_arr[k] + long_arr[knext]) / 2.0
end

# edge case 2, find the first 1 or 2 number(s)
if k == 0
sum = remove_first(nums1, nums2)
return sum if knext == k
sum += remove_first(nums1, nums2)
return sum / 2.0
end

# edge case 3, add this to void kl becomes 0.
if k == 1
remove_first(nums1, nums2)
return find_kth(k-1, knext-1, nums1, nums2)
end

# devide k by 2
kl = k/2
ks = k - kl
# kl will always <= ks, try apply ks to short arr, apply large length on short arr has most possiblity to empty one arr.
if ks > short_arr.size
ks = short_arr.size
kl = k - ks
end
# decide which arr to cut:
if short_arr[ks-1] <= long_arr[kl - 1]
# cut short
find_kth(k-ks, knext-ks, short_arr[ks..-1], long_arr)
else
# cut long
find_kth(k-kl, knext-kl, short_arr, long_arr[kl..-1])
end
end

def find_median_sorted_arrays(nums1, nums2)
total_size = nums1.size + nums2.size
return 0 if total_size == 0
k = total_size / 2
if total_size % 2 == 0
find_kth(k - 1, k, nums1, nums2)
else
find_kth(k, k, nums1, nums2)
end
end
```

## Old version at 2012

先说临界情况

1. A为空或者B为空
直接在非空数组中找第k大的数即可。O(1)
2. 找最小的数，k==0的情况，也简单，比较两个数组最开头的元素，谁小就是谁
然后就是比较复杂的情况，假设寻找目标target是下标为k的数。
那么意味着在排好的数组中，在目标数之前，一共有k个比目标更小的数。
将k分成两份，一份在A的前端，一份在B的前端。这里其实将k怎么分配是一个可以讨论的问题，但是平分k可以得到平均最快的效果。
设k = ka + kb，（k是偶数简单，k是奇数的话，剩下那一个随便放在两个数组中哪一个中都可以）

这里可以列出来我们想要的效果：
k=1 —-> ka = 1, kb = 1
k=2 —-> ka = 1, kb = 1
k=3 —-> ka = 1, kb = 1. [+1，表示还有一个元素，可以随意分配在ka或者kb中，只要不越界]
k=4 —-> ka = 2, kb = 2
k=5 —-> ka = 2, kb = 2. [+1]
已经可以看出来规律了，这个造成了下面代码中比较复杂的部分，这些细节消耗的时间不少啊。

然后就是主要逻辑

1. 如果A[ka-1] >= B[kb-1]
说明在B前端的kb个数中，不可能出现我们要寻找的目标。
为什么呢？
假如A一共有m个数，B一共有n个数。
那么target(下标是k)后面有且只有n + m – 1 – k个数；
但是B[kb-1]在B中已经排在n – kb个数之前，加上A中的m – ka + 1(A[ka-a])，也就是说在排序后的整体数组中排在B[kb-1]之后的数，至少有n – kb + m – ka + 1 = n + m – k + 1个。
由于n + m – k + 1 > n + m – k – 1，所以B前端kb个数都不可能是target。
所以此时可以将问题转化为，在A[0,…,m-1]和B[kb,…,n-1]中，寻找下标为k – kb的数。
2. 否则，A[ka-1] < B[kb-1] 同上，可以剔除A前端的ka个数。

```class Solution {
public:
double midianValueOnArray(int Arr[], int si, bool isOdd) {
if (isOdd)
return Arr[si];
else {
return (Arr[si] + Arr[si + 1]) / 2.0;
}
}
double midianValueOnArrays(int A[], int m, int ms, int B[], int n, int ns, bool isOdd){
int sorted[2];
for(int i = 0 ; i < 2; ++i) {
if(ms < m) {
if (ns < n) {
sorted[i] = (A[ms] <= B[ns] ? A[ms++] : B[ns++]);
} else {
sorted[i] = A[ms++];
}
} else {
sorted[i] = B[ns++];
}
if (isOdd) {
return sorted[0];
}
}
return 0.5 * (sorted[0] + sorted[1]);
}
double findMedianSortedArrays(int A[], int m, int B[], int n) {
if(m + n <= 0) return 0;
//find the number at this index from ms and ns
int target = (m + n - 1) / 2;
//if isOdd return val[target] is ok
// else return (val[target] + val[target + 1]) / 2
bool isOdd = (m + n) % 2;

int ms = 0;//current start index in A
int ns = 0;//current start index in B
while (true) {
//if A or B is empty just anotherArray[target] is the target
if(m - ms <= 0) {//A is empty
return midianValueOnArray(B, ns + target, isOdd);
} else if(n - ns <= 0) {//B is empty
return midianValueOnArray(A, ms + target, isOdd);
}
//A and B are not empty
if(target <= 0) {
return midianValueOnArrays(A,m,ms,B,n,ns,isOdd);
}
//divide the numbers smaller than val[target]
int divideCount = target > 1 ? (target - 2) / 2 : 0;
int plus = target > 2 ? target % 2 : 0;
int mm = ms + divideCount;
if(mm >= m) mm = m - 1;
int nm = ns + divideCount;
if(nm >= n) nm = n - 1;
if(n - nm >= m - mm)
nm += plus;
else
mm += plus;
if(A[mm] >= B[nm]) {
//in this case, the numbers at the beginning of B is impossiblely contain val[target]
target -= (nm - ns + 1);
ns = nm + 1;
} else {
target -= (mm - ms + 1);
ms = mm + 1;
}
}
}
};
```

## 3 thoughts on “LeetCode: Median of Two Sorted Arrays”

1. traveller

方法很不错～感觉比leetcode上面从两头减的方法要好点，特别是在有个数组很小的时候。

ps，标题打错了，medium….