分治由如下三个模块组成
分治的时间复杂度分析
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ a \cdot T(n/b) + f(n), & n > 1 \end{cases} \end{align*} $$
递推式的求解方法:代入法、递推树、主方法
def merge_sort(a, l, r): if l < r: m = int((l+r)/2) # 取中间点分开 merge_sort(a, l, m) merge_sort(a, m+1, r) merge(a, l, m, r)
def merge_sort(a, l, r): if l < r: m = int((l+r)/2) # 取中间点分开 merge_sort(a, l, m) merge_sort(a, m+1, r) merge(a, l, m, r)
合并:取两个子数组的最小元素做比较,并将小者取出
def merge(a, l, m, r): # 子数组的长度 n1, n2 = m - l + 1, r - m # 创建临时数组 L, R = [0] * n1, [0] * n2 # 拷贝数据到临时数组L for i in range(0, n1): L[i] = a[l + i] # 拷贝数据到临时数组R for j in range(0, n2): R[j] = a[m + 1 + j] i, j, k = 0, 0, l # 归并L和R到a[l..r] while i < n1 and j < n2: if L[i] <= R[j]: a[k] = L[i] i += 1 else: a[k] = R[j] j += 1 k += 1 # 拷贝L的剩余元素 while i < n1: a[k] = L[i] i += 1 k += 1 # 拷贝R的剩余元素 while j < n2: a[k] = R[j] j += 1 k += 1
递推式
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T (n/2) + f(n), & n > 1 \end{cases} \end{align*} $$
$f(n)$:将两个长度为$n/2$的有序数组合并的时间复杂度
因此$f(n) = \Theta(n)$
$$ \begin{align*} \quad 2 \cdot T \left( \frac{n}{2} \right) + \frac{n}{2} \le T(n) \le 2 \cdot T \left( \frac{n}{2} \right) + n \end{align*} $$
设$n = 2^k$,即$k = \lg n$,根据递推式有
$$ \begin{align*} \quad & \spadesuit \qquad 2 \cdot T (2^{k-1}) + 2^{k-1} \le T(2^k) \le 2 \cdot T (2^{k-1}) + 2^k \\ & \heartsuit \qquad 2 \cdot T (2^{k-2}) + 2^{k-2} \le T(2^{k-1}) \le 2 \cdot T (2^{k-2}) + 2^{k-1} \\ & \qquad \qquad \vdots \\ & \clubsuit \qquad 2 \cdot T (2^0) + 2^0 \le T(2^1) \le 2 \cdot T (2^0) + 2^1 \end{align*} $$
令$\spadesuit + \heartsuit \cdot 2 + \cdots + \clubsuit \cdot 2^{k-1}$可得
$$ \begin{align*} \quad n \cdot T(1) + \frac{n}{2} \lg n \le T(n) \le n \cdot T(1) + n \lg n \Longrightarrow T(n) = \Theta(n \lg n) \end{align*} $$
若$n$不是$2$的幂次?
设大于$n$的最小的$2$的幂次为$m$,即$n < m < 2n$
向数组末尾添加$m - n$个$\infty$,再对其进行排序
由于$n \lg n < m \lg m < 2n \lg (2n) = 2n \lg n + 2n < 4n \lg n$
因此$m \lg m = \Theta (n \lg n)$
从而$T(m) = \Theta(m \lg m) = \Theta(\Theta (n \lg n)) = \Theta (n \lg n)$
def quick_sort(a, l, r): if l < r: m = partition(a, l, r) quick_sort(a, l, m-1) quick_sort(a, m+1, r)
def quick_sort(a, l, r): if l < r: m = partition(a, l, r) quick_sort(a, l, m-1) quick_sort(a, m+1, r)
def partition(a, l, r): # 最右元素作为主元 pivot = a[r] # 小于主元的元素的存放位置 初始为最左 i = l # l -> r-1 遍历其他元素 for j in range(l, r): if a[j] <= pivot: # 小于主元的元素放到主元左边 a[i], a[j] = a[j], a[i] i += 1 # 存放位置右移一位 # 所有小于主元的元素已位于主元左边 # 当前的i就是主元应该放的位置 # 当前的a[i]大于主元 a[i], a[r] = a[r], a[i] return i
递推式
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ T(k) + T(n-1-k) + f(n), & n > 1 \end{cases} \end{align*} $$
$$ \begin{align*} \quad T(n) = T (n-1) + \Theta(n) \Longrightarrow T(n) = \Theta(n^2) \end{align*} $$
如何改进?
某股票 17 天内的价格,哪天买进、哪天卖出,收益最大?
| 天 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 价格 | 100 | 113 | 110 | 85 | 105 | 102 | 86 | 63 | 81 | 101 | 94 | 106 | 101 | 79 | 94 | 90 | 97 |
| 变化 | | 13 | -3 | -25 | 20 | -3 | -16 | -23 | 18 | 20 | -7 | 12 | -5 | -22 | 15 | -4 | 7 |
收益 = 后一天的价格 - 前一天的价格,即“变化”数组中的值
第$i$天买进、第$j$天卖出,$\text{总收益} = \text{变化}[i+1] + \cdots + \text{变化}[j]$
问题转化为寻找“变化”数组的和最大的连续子数组
暴力求解:二重 for 循环遍历买进、卖出的日期,$T(n) = \Omega(n^2)$
设当前要寻找子数组$A[low, \ldots, high]$的最大子数组
分:$A[low, \ldots, high] = A[low, \ldots, mid] + A[mid+1, \ldots, high]$
治:递归求$A[low, \ldots, mid]$和$A[mid+1, \ldots, high]$的最大子数组
合:处理第 3 种情况,与前 2 种情况的最大子数组比较取最大
def find_max_subarray(low, high): # 返回最大子数组的起始索引、结束索引、和 if low == high: return low, high, A[low] mid = int((low + high) / 2) # 递归求解左半数组 l_low, l_high, l_sum = find_max_subarray(low, mid) # 递归求解右半数组 r_low, r_high, r_sum = find_max_subarray(mid + 1, high) # 处理跨越中点的情况 c_low, c_high, c_sum = find_max_cross_subarray(low, mid, high) # 比较三种情况: if l_sum > r_sum and l_sum > c_sum: return l_low, l_high, l_sum elif l_sum < c_sum and r_sum < c_sum: return c_low, c_high, c_sum else: return r_low, r_high, r_sum
数组分为$A[i, \ldots, mid]$、$A[mid+1, \ldots, j]$两部分
def find_max_cross_subarray(low, mid, high): l_sum = r_sum = -float("inf") sum = 0 # 找左边的最大子数组 for i in range(mid, low - 1, -1): sum += A[i] if sum > l_sum: l_sum, l_index = sum, i sum = 0 # 找右边的最大子数组 for i in range(mid + 1, high + 1): sum += A[i] if sum > r_sum: r_sum, r_index = sum, i return l_index, r_index, l_sum + r_sum
def find_max_cross_subarray(low, mid, high): l_sum = r_sum = -float("inf") sum = 0 # 找左边的最大子数组 for i in range(mid, low - 1, -1): sum += A[i] if sum > l_sum: l_sum, l_index = sum, i sum = 0 # 找右边的最大子数组 for i in range(mid + 1, high + 1): sum += A[i] if sum > r_sum: r_sum, r_index = sum, i return l_index, r_index, l_sum + r_sum
处理跨越中点的情况只需一重 for 循环,$f(n) = \Theta(n)$
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T (n/2) + \Theta(n), & n > 1 \end{cases} = \Theta(n \lg n) \end{align*} $$
输入:长度为$n$的数组$A$
输出:$A$中逆序对数目
推荐系统中的协同过滤:甲在电影网站上列出了自己最喜爱电影 Top 10,网站如何根据其他用户的电影喜爱列表向甲推荐好友?
构造数组$A[i]$:甲最喜欢的第$i$部电影在乙的列表中的位置
暴力求解:二重 for 循环遍历$(i,j)$,$T(n) = \Omega(n^2)$
设当前要统计子数组$A[low, \ldots, high]$的逆序对数目
分:$A[low, \ldots, high] = A[low, \ldots, mid] + A[mid+1, \ldots, high]$
治:递归求$A[low, \ldots, mid]$和$A[mid+1, \ldots, high]$的逆序对数目
合:跨越中点的逆序对$low \le i \le mid < j \le high$且$A[i] > A[j]$,与两个递归问题的返回值相加求和
问题:跨越中点的逆序对可能有$\Theta(n^2)$个,若一个个数,复杂度至少为$\Omega(n^2)$,如何加速?
合并已经排好序的子数组可以自然地发现逆序对
def count_cross_inv(a, l, m, r): n1, n2 = m - l + 1, r - m # 子数组的长度 L, R = [0] * n1, [0] * n2 # 创建临时数组 for i in range(0, n1): # 拷贝数据到临时数组L L[i] = a[l + i] for j in range(0, n2): # 拷贝数据到临时数组R R[j] = a[m + 1 + j] count = 0 i, j, k = 0, 0, l while i < n1 and j < n2: # 归并L和R到a[l..r] if L[i] <= R[j]: a[k] = L[i] i += 1 else: a[k] = R[j] j += 1 count += (n1 - i) k += 1 while i < n1: # 拷贝L的剩余元素 a[k] = L[i] i += 1 k += 1 while j < n2: # 拷贝R的剩余元素 a[k] = R[j] j += 1 k += 1 return count def sort_and_count(a, l, r): if l < r: m = int((l + r) / 2) # 取中间点分开 left_inv = sort_and_count(a, l, m) right_inv = sort_and_count(a, m + 1, r) cross_inv = count_cross_inv(a, l, m, r) return left_inv + right_inv + cross_inv else: return 0
处理跨越中点的情况只需一重循环,$f(n) = \Theta(n)$
$$ \begin{align*} \quad T(n) & = \begin{cases} 1, & n = 1 \\ 2 \cdot T (n/2) + \Theta(n), & n > 1 \end{cases} \\ & = \Theta(n \lg n) \end{align*} $$
输入:$\Rbb^2$上的$n$个点$S = \{ (x_1, y_1), \ldots, (x_n, y_n) \}$
输出:最近点对的距离$d = \min_{i,j} \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
暴力求解:二重 for 循环遍历所有点对,$T(n) = \Omega(n^2)$
输入:$\Rbb^2$上的$n$个点$S = \{ (x_1, y_1), \ldots, (x_n, y_n) \}$
输出:最近点对的距离$d = \min_{i,j} \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
分:设$x_1, \ldots, x_n$的中位数为$m$,在$m$处作垂线将$S$均分
治:递归求左半子集和右半子集的最近点对
合:处理第 3 种情况,与前 2 种情况的最近点对比较取最小
设$d_l$、$d_r$分别是递归求解出的左半、右半子集的最近点对距离
只需考虑中线左右两侧宽度为$\delta = \min \{ d_l, d_r \}$的带状区域
def dist(u, v): # 计算距离 return np.linalg.norm(u - v) def closest_pair(low, high, sort_y): if high - low <= 2: # 如果不超过3个点 不再递归 直接求最小距离 closest = float("inf") for i in range(low, high + 1): for j in range(i + 1, high + 1): closest = min(closest, dist(v[i], v[j])) return closest mid = int((low + high) / 2) xmid = (v[mid, 0] + v[mid + 1, 0]) / 2 # 中线位置 index = [i for i in sort_y if low <= i and i <= mid] # 左半元素按y升序后的索引 delta_left = closest_pair(low, mid, index) index = [i for i in sort_y if mid + 1 <= i and i <= high] # 右半元素按y升序后的索引 delta_right = closest_pair(mid + 1, high, index) delta = min(delta_left, delta_right) index = [] for i in sort_y: if v[i, 0] <= xmid + delta and v[i, 0] >= xmid - delta: # 中间带状区域中的点 index.append(i) closest = float("inf") l = len(index) for i in range(l): # 计算带状区域中的最近点对 for j in range(i + 1, min(i + 8, l)): # 最多考虑7个点 closest = min(closest, dist(v[index[i]], v[index[j]])) return min(closest, delta) np.random.seed(0) trial = 8 time_cost = np.zeros([2, trial]) for t in range(trial): n = 2**(t + 4) v = np.random.rand(n, 2) # 随机生成n个点 start = time.perf_counter() closest = float("inf") for i in range(n): for j in range(i + 1, n): closest = min(closest, dist(v[i], v[j])) # 暴力穷举 time_cost[0, t] = time.perf_counter() - start start = time.perf_counter() v = v[v[:, 0].argsort()] # 对v按x升序 sort_y = v[:, 1].argsort() # 对v按y升序 sort_y记录升序后元素在原来v中的索引 closest = closest_pair(0, n - 1, sort_y) time_cost[1, t] = time.perf_counter() - start with plt.style.context('Solarize_Light2'): x = [v for v in range(4, 4 + trial)] plt.plot(x, np.log2(time_cost[0, :]), label="BF") # 对数坐标轴 plt.plot(x, np.log2(time_cost[1, :]), label="DC") # 对数坐标轴 plt.xlabel('n: power of 2') plt.ylabel('time: power of 2') plt.legend() plt.grid(color='#93a1a1', linestyle='-.', linewidth=0.7) plt.savefig("closest-pair2.svg", transparent=True, bbox_inches="tight")
输入:两个$n$位整数$x = X[0, \ldots, n-1]$和$y = Y[0, \ldots, n-1]$
输出:乘积$xy = z = Z[0, \ldots, 2n-1]$
$$ \begin{align*} \quad \sum_{k=0}^{2n-1} Z[k] 10^k & = \left( \sum_{i=0}^{n-1} X[i] 10^i \right) \left( \sum_{j=0}^{n-1} X[j] 10^j \right) = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} X[i] Y[j] 10^{i+j} \\ & = \sum_{k=0}^{2n-2} \underbrace{\sum_{(i,j): i+j=k} X[i] Y[j]}_{c_k} 10^k = \sum_{k=0}^{2n-2} c_k 10^k \end{align*} $$
$Z[k]$和$c_k$之间的关系?
已知$\sum_{k=0}^{2n-1} Z[k] 10^k = \sum_{k=0}^{2n-2} c_k 10^k$,$Z[k]$和$c_k$的关系?
右边$10^0$的系数$c_0 = \sum_{(i,j): i+j=0} X[i] Y[j] = X[0] Y[0]$,包含两部分
$$ \begin{align*} \quad \begin{cases} c_0 ~ \mathrm{mod} ~ 10, & \text{就是左边$10^0$的系数$Z[0]$} \\ h \leftarrow \lfloor c_0 / 10 \rfloor, & \text{作为进位参与$Z[1]$的计算} \end{cases} \end{align*} $$
右边$10^1$的系数$c_1 = X[0] Y[1] + X[1] Y[0]$,再加上进位$h$
$$ \begin{align*} \quad \begin{cases} (c_1 + h) ~ \mathrm{mod} ~ 10, & \text{就是左边$10^1$的系数$Z[1]$} \\ h \leftarrow \lfloor (c_1 + h) / 10 \rfloor, & \text{作为进位参与$Z[2]$的计算} \end{cases} \end{align*} $$
如此迭代,继续计算$Z[2], Z[3], \ldots, Z[2n-1]$
输入:$X[0, \ldots, n-1]$、$Y[0, \ldots, n-1]$
输出:$Z[0, \ldots, 2n-1]$
算法第 2、3 行的二重 for 循环共遍历$n^2$个$(i,j)$二元组,即二维表中的每一格,因此算法时间复杂度为$\Theta(n^2)$,更好的算法?
def align(x, y, n_x, n_y): # 若x和y长度不同 将短的左边补零 if n_x < n_y: x = x.rjust(n_y, "0") n = n_y else: y = y.rjust(n_x, "0") n = n_x return x, y, n def naive(x, y): n_x, n_y = len(x), len(y) x, y, n = align(x, y, n_x, n_y) # 若x和y长度不同 将短的左边补零 c = 0 # 初始进位为零 z = str() for k in range(2 * n): # 遍历所有满足i+j=k的二元组(i,j) 注意i的范围防止越界 for i in range(max(0, k - n + 1), min(k + 1, n)): j = k - i c += int(x[-1 - i]) * int(y[-1 - j]) c, q = divmod(c, 10) z = str(q) + z z = z.lstrip("0") # 去掉高位连续的0 if z == "": # 若乘积本就是0 去掉所有0后会变成空字符串 return 0 else: return int(z)
将$x$和$y$的数位二等分,记$m = n/2$
共产生$4$个子问题:$ac$、$ad$、$bc$、$bd$,规模大致减半
设$n$位数相乘的时间复杂度为$T(n)$,加法的时间复杂度显然为$\Theta(n)$
$$ \begin{align*} \quad T(n) = 4 \cdot T (n/2) + \Theta(n) \Longrightarrow T(n) = \Theta(n^2) \end{align*} $$
分解成$4$个规模大致减半的子问题并不是更优的算法
利用$a d + b c = (a+b) (c+d) - a c - b d$可得
$$ \begin{align*} \quad x y = a c \cdot 10^{2m} + ((a+b) (c+d) - a c - b d) \cdot 10^m + b d \end{align*} $$
乘法子问题减少为$3$个
$$ \begin{align*} \quad T(n) = 3 \cdot T (n/2) + \Theta(n) \Longrightarrow T(n) = \Theta(n^{\log_2 3}) \end{align*} $$
分解成$3$个规模大致减半的子问题可以得到更优的算法
def karatsuba(x, y): n_x, n_y = len(x), len(y) if n_x == 1 or n_y == 1: # 如果其中一个数只有1位 不再递归 return int(x) * int(y) x, y, n = align(x, y, n_x, n_y) # 若x和y长度不同 将短的左边补零 m = round(n / 2) a, b = x[0:n - m], x[n - m:n] # x = a 10^m + b c, d = y[0:n - m], y[n - m:n] # y = c 10^m + d # 3个递归子问题 ac = karatsuba(a, c) bd = karatsuba(b, d) ad_bc = karatsuba(str(int(a) + int(b)), str(int(c) + int(d))) - ac - bd return ac * 10**(2 * m) + ad_bc * 10**m + bd
将$x$和$y$的数位三等分,记$m = n/3$
$$ \begin{align*} \quad x & = a \cdot 10^{2m} + b \cdot 10^m + c \\ y & = d \cdot 10^{2m} + e \cdot 10^m + f \end{align*} $$
于是
$$ \begin{align*} \quad xy = a d \cdot 10^{4m} + (a e + b d) \cdot 10^{3m} & + (af + be + cd) \cdot 10^{2m} \\ & + (bf + ce) \cdot 10^m + cf \end{align*} $$
共产生$9$个子问题,有点难办!
引入多项式$x(t) = a t^2 + b t + c$、$y(t) = d t^2 + e t + f$
$$ \begin{align*} \quad x(t) y(t) & = a d \cdot t^4 + (a e + b d) \cdot t^3 + (af + be + cd) \cdot t^2 + (bf + ce) \cdot t + cf \\ & \triangleq w_4 t^4 + w_3 t^3 + w_2 t^2 + w_1 t + w_0 \triangleq w(t) \end{align*} $$
代入$t = 10^m$ (在系数后面分别添加$4m$、$3m$、$2m$、$m$、$0$个$0$再相加) 就是$xy$,剩下只需确定$w(t)$的$5$个系数
令$t$取$5$个不同的值可得$5$变量线性方程组:
$$ \begin{align*} \left\{ \begin{array}{lrcl} t = 0 : & cf & = & w_0 \\ t = 1 : & (a+b+c)(d+e+f) & = & w_4 + w_3 + w_2 + w_1 + w_0 \\ t = -1 : & (a-b+c)(d-e+f) & = & w_4 - w_3 + w_2 - w_1 + w_0 \\ t = 2 : & (4a+2b+c)(4d+2e+f) & = & 16 w_4 + 8 w_3 + 4 w_2 + 2 w_1 + w_0 \\ t = \infty : & ad & = & w_4 \end{array} \right. \end{align*} $$
方程组涉及$5$个乘法子问题:
$$ \begin{align*} \begin{array}{l} s_4 \triangleq ad \\ s_3 \triangleq (4a+2b+c)(4d+2e+f) \\ s_2 \triangleq (a-b+c)(d-e+f) \\ s_1 \triangleq (a+b+c)(d+e+f) \\ s_0 \triangleq cf \end{array} \Longrightarrow \begin{array}{l} w_4 = s_4 \\ w_3 = (-12 s_4 + s_3 - s_2 - 3 s_1 + 3 s_0) / 6 \\ w_2 = (-2 s_4 + s_2 + s_1 - 2 s_0) / 2 \\ w_1 = (12 s_4 - s_3 - 2 s_2 + 6 s_1 - 3 s_0) / 6 \\ w_0 = s_0 \end{array} \end{align*} $$
由此可得$T(n) = 5 \cdot T (n/3) + \Theta(n) \Longrightarrow T(n) = \Theta(n^{\log_3 5})$
更一般的将$x$作$k$等分,此时$w(t)$是$2k-2$次多项式,共有$2k-1$个系数,线性方程组需包含$2k-1$个方程,由此产生$2k-1$个子问题,时间复杂度$\Theta(n^{\log_k (2k-1)})$
def preprocess(x): if x[0] == "-": return -1, len(x) - 1, x[1:] return 1, len(x), x def toom(x, y): sign_x, n_x, x = preprocess(x) sign_y, n_y, y = preprocess(y) sign = sign_x * sign_y if n_x <= 2 or n_y <= 2: # 如果其中一个数小于3位 不再递归 return sign * int(x) * int(y) x, y, n = align(x, y, n_x, n_y) # 若x和y长度不同 将短的左边补零 m = round(n / 3) a, b, c = x[0:n - 2 * m], x[n - 2 * m:n - m], x[n - m:n] d, e, f = y[0:n - 2 * m], y[n - 2 * m:n - m], y[n - m:n] # 5个递归子问题 s4 = toom(a, d) # s4 = ad s3 = toom(str(4 * int(a) + 2 * int(b) + int(c)), str(4 * int(d) + 2 * int(e) + int(f))) # s3 = (4a+2b+c)(4d+2e+f) s2 = toom(str(int(a) - int(b) + int(c)), str(int(d) - int(e) + int(f))) # s2 = (a-b+c)(d-e+f) s1 = toom(str(int(a) + int(b) + int(c)), str(int(d) + int(e) + int(f))) # s1 = (a+b+c)(d+e+f) s0 = toom(c, f) # s0 = cf w4 = s4 w3 = (-12 * s4 + s3 - s2 - 3 * s1 + 3 * s0) // 6 w2 = (-2 * s4 + s2 + s1 - 2 * s0) // 2 w1 = (12 * s4 - s3 - 2 * s2 + 6 * s1 - 3 * s0) // 6 w0 = s0 return sign * (w4 * 10**(4 * m) + w3 * 10**(3 * m) + w2 * 10**(2 * m) + w1 * 10**m + w0)
设$\Av = (a_{ij})_{n \times n}$、$\Bv = (b_{ij})_{n \times n}$为$n$阶方阵
设$\Cv = (c_{ij})_{n \times n} = \Av + \Bv$,则$c_{ij} = a_{ij} + b_{ij}$
直接计算$\Cv = \Av + \Bv$的代码如下
def add(A, B, C, m, n): for i in range(m): for j in range(n): C[i, j] = A[i, j] + B[i, j]
因为二重 for 循环的存在,时间复杂度为$\Theta(n^2)$
将$\Av$、$\Bv$、$\Cv$分成$4$个分块矩阵,每块$n/2 \times n/2$
$$ \begin{align*} \quad \Av = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix}, \quad \Bv = \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix}, \quad \Cv = \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} \end{align*} $$
根据分块矩阵的运算法则
$$ \begin{align*} \quad \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} & = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix} + \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix} \\[4px] \Longrightarrow & ~ \begin{cases} \Cv_{11} = \Av_{11} + \Bv_{11} \\ \Cv_{12} = \Av_{12} + \Bv_{12} \\ \Cv_{21} = \Av_{21} + \Bv_{21} \\ \Cv_{22} = \Av_{22} + \Bv_{22} \end{cases} \end{align*} $$
若$n = 1$,则$\Av$、$\Bv$已退化为标量,直接相加
若$n > 1$,将$\Av$、$\Bv$分成四个子矩阵
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + \class{blue}{\Theta(n^2)}, & n > 1 \end{cases} \end{align*} $$
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + \class{blue}{\Theta(1)}, & n > 1 \end{cases} \end{align*} $$
def add_rec(A, B, C, shape): m, n = shape if m == 1 or n == 1: C[:] = A[:] + B[:] else: mid_m, mid_n = int(m / 2), int(n / 2) # 四个子矩阵的索引 block_11 = slice(None, mid_m), slice(None, mid_n) block_12 = slice(None, mid_m), slice(mid_n, None) block_21 = slice(mid_m, None), slice(None, mid_n) block_22 = slice(mid_m, None), slice(mid_n, None) add_rec(A[block_11], B[block_11], C[block_11], A[block_11].shape) add_rec(A[block_12], B[block_12], C[block_12], A[block_12].shape) add_rec(A[block_21], B[block_21], C[block_21], A[block_21].shape) add_rec(A[block_22], B[block_22], C[block_22], A[block_22].shape)
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + c n^2, & n > 1 \end{cases} \end{align*} $$
设$n = 2^k$,即$k = \lg n$,根据递推式有
$$ \begin{align*} \quad & \spadesuit \qquad T(2^k) = 4 \cdot T (2^{k-1}) + c 4^k \\ & \heartsuit \qquad T(2^{k-1}) = 4 \cdot T (2^{k-2}) + c 4^{k-1} \\ & \qquad \qquad \vdots \\ & \clubsuit \qquad T(2^1) = 4 \cdot T (2^0) + c 4^1 \end{align*} $$
令$\spadesuit + \heartsuit \cdot 4 + \cdots + \clubsuit \cdot 4^{k-1}$可得
$$ \begin{align*} \quad T(n) = 4^k \cdot T(1) + c k \cdot 4^k = n^2 \cdot T(1) + c n^2 \lg n \Longrightarrow T(n) = \Theta(n^2 \lg n) \end{align*} $$
分治递归反而让时间复杂度更坏了
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + c, & n > 1 \end{cases} \end{align*} $$
设$n = 2^k$,即$k = \lg n$,根据递推式有
$$ \begin{align*} \quad & \spadesuit \qquad T(2^k) = 4 \cdot T (2^{k-1}) + c \\ & \heartsuit \qquad T(2^{k-1}) = 4 \cdot T (2^{k-2}) + c \\ & \qquad \qquad \vdots \\ & \clubsuit \qquad T(2^1) = 4 \cdot T (2^0) + c \end{align*} $$
令$\spadesuit + \heartsuit \cdot 4 + \cdots + \clubsuit \cdot 4^{k-1}$可得
$$ \begin{align*} \quad T(n) = 4^k \cdot T(1) + c \frac{1-4^k}{1-4} = n^2 \cdot T(1) + c \frac{n^2-1}{3} \Longrightarrow T(n) = \Theta(n^2) \end{align*} $$
与直接相加的时间复杂度相当,分治递归没有带来收益
设$\Av = (a_{ij})_{n \times n}$、$\Bv = (b_{ij})_{n \times n}$为$n$阶方阵
设$\Cv = (c_{ij})_{n \times n} = \Av \Bv$,则$c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}$
直接计算$\Cv = \Cv + \Av \Bv$的代码如下
def mul(A, B, C, n): for i in range(n): for j in range(n): for k in range(n): C[i, j] += A[i, k] * B[k, j]
因为三重 for 循环的存在,时间复杂度为$\Theta(n^3)$
将$\Av$、$\Bv$、$\Cv$分成$4$个分块矩阵,每块$n/2 \times n/2$
$$ \begin{align*} \quad \Av = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix}, \quad \Bv = \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix}, \quad \Cv = \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} \end{align*} $$
根据分块矩阵的运算法则
$$ \begin{align*} \quad \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} & = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix} \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix} \\[2px] & = \begin{bmatrix} \Av_{11} \Bv_{11} + \Av_{12} \Bv_{21} & \Av_{11} \Bv_{12} + \Av_{12} \Bv_{22} \\ \Av_{21} \Bv_{11} + \Av_{22} \Bv_{21} & \Av_{21} \Bv_{12} + \Av_{22} \Bv_{22} \end{bmatrix} \\[2px] \Longrightarrow & ~ \begin{cases} \Cv_{11} = \Av_{11} \Bv_{11} + \Av_{12} \Bv_{21} \\ \Cv_{12} = \Av_{11} \Bv_{12} + \Av_{12} \Bv_{22} \\ \Cv_{21} = \Av_{21} \Bv_{11} + \Av_{22} \Bv_{21} \\ \Cv_{22} = \Av_{21} \Bv_{12} + \Av_{22} \Bv_{22} \end{cases} \end{align*} $$
若$n = 1$,则$\Av$、$\Bv$已退化为标量,直接相乘
若$n > 1$,将$\Av$、$\Bv$分成四个子矩阵
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + \class{blue}{\Theta(n^2)}, & n > 1 \end{cases} \end{align*} $$
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + \class{blue}{\Theta(1)}, & n > 1 \end{cases} \end{align*} $$
def mul_rec(A, B, C, n): if n == 1: C[0, 0] += A[0, 0] * B[0, 0] else: mid = int(n / 2) # 四个子矩阵的索引 block_11 = slice(None, mid), slice(None, mid) block_12 = slice(None, mid), slice(mid, None) block_21 = slice(mid, None), slice(None, mid) block_22 = slice(mid, None), slice(mid, None) mul_rec(A[block_11], B[block_11], C[block_11], mid) # C11 += A11 B11 mul_rec(A[block_12], B[block_21], C[block_11], mid) # C11 += A12 B21 mul_rec(A[block_11], B[block_12], C[block_12], mid) # C12 += A11 B12 mul_rec(A[block_12], B[block_22], C[block_12], mid) # C12 += A12 B22 mul_rec(A[block_21], B[block_11], C[block_21], mid) # C21 += A21 B11 mul_rec(A[block_22], B[block_21], C[block_21], mid) # C21 += A22 B21 mul_rec(A[block_21], B[block_12], C[block_22], mid) # C22 += A21 B12 mul_rec(A[block_22], B[block_22], C[block_22], mid) # C22 += A22 B22
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + c n^2, & n > 1 \end{cases} \end{align*} $$
设$n = 2^k$,即$k = \lg n$,根据递推式有
$$ \begin{align*} \quad & \spadesuit \qquad T(2^k) = 8 \cdot T (2^{k-1}) + c 4^k \\ & \heartsuit \qquad T(2^{k-1}) = 8 \cdot T (2^{k-2}) + c 4^{k-1} \\ & \qquad \qquad \vdots \\ & \clubsuit \qquad T(2^1) = 8 \cdot T (2^0) + c 4^1 \end{align*} $$
令$\spadesuit + \heartsuit \cdot 8 + \cdots + \clubsuit \cdot 8^{k-1}$可得
$$ \begin{align*} \quad T(n) & = 8^k \cdot T(1) + c 8^k \left( \frac{4^k}{8^k} + \cdots + \frac{4}{8} \right) = 8^k \cdot T(1) + c 8^k \left( 1 - \frac{4^k}{8^k} \right) \\ & = n^3 \cdot T(1) + c n^3 - c n^2 \Longrightarrow T(n) = \Theta(n^3) \end{align*} $$
$$ \begin{align*} \quad T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + c, & n > 1 \end{cases} \end{align*} $$
设$n = 2^k$,即$k = \lg n$,根据递推式有
$$ \begin{align*} \quad & \spadesuit \qquad T(2^k) = 8 \cdot T (2^{k-1}) + c \\ & \heartsuit \qquad T(2^{k-1}) = 8 \cdot T (2^{k-2}) + c \\ & \qquad \qquad \vdots \\ & \clubsuit \qquad T(2^1) = 8 \cdot T (2^0) + c \end{align*} $$
令$\spadesuit + \heartsuit \cdot 8 + \cdots + \clubsuit \cdot 8^{k-1}$可得
$$ \begin{align*} \quad T(n) = 8^k \cdot T(1) + c \frac{1 - 8^k}{1 - 8} = n^3 \cdot T(1) + c \frac{n^3-1}{7} \Longrightarrow T(n) = \Theta(n^3) \end{align*} $$
与直接相乘的时间复杂度相当,分治递归没有带来收益
| | 方法 | 复制 | 递推式 | 时间复杂度 |
|---|---|---|---|---|
| 加法 | 直接 | | | $\Theta(n^2)$ |
| 分治 | 是 | $T(n) = 4 \cdot T(n/2) + \class{blue}{\Theta(n^2)}$ | $\Theta(n^2 \lg n)$ | |
| 分治 | 否 | $T(n) = 4 \cdot T(n/2) + \class{blue}{\Theta(1)}$ | $\Theta(n^2)$ | |
| 乘法 | 直接 | | | $\Theta(n^3)$ |
| 分治 | 是 | $T(n) = 8 \cdot T(n/2) + \class{blue}{\Theta(n^2)}$ | $\Theta(n^3)$ | |
| 分治 | 否 | $T(n) = 8 \cdot T(n/2) + \class{blue}{\Theta(1)}$ | $\Theta(n^3)$ |
影响时间复杂度的两个因素
$$ \begin{align*} \quad T(n) & = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + \class{blue}{c n^2}, & n > 1 \end{cases} \\ & = 8^k \cdot T(1) + c 8^k \left( \frac{4^k}{8^k} + \cdots + \frac{4}{8} \right) = 8^k \cdot T(1) + c 8^k \left( 1 - \frac{4^k}{8^k} \right) \\ & = n^3 \cdot T(1) + c n^3 - c n^2 \Longrightarrow T(n) = \Theta(n^3) \end{align*} $$
考虑一般情况,设子问题个数为$a$,问题规模为$n/b$,$n = b^k$,由此可得$T(1)$的个数为$a^k = (b^{\log_b a})^k = (b^k)^{\log_b a} = n^{\log_b a}$,于是
$$ \begin{align*} \quad T(n) & = a^k \cdot T(1) + c a^k \left( \frac{b^{2k}}{a^k} + \cdots + \frac{b^2}{a} \right) = a^k \cdot T(1) + c a^k \frac{b^2}{a} \frac{1 - b^{2k} / a^k}{1 - b^2/a} \\ & = a^k \cdot T(1) + \frac{b^2 c}{a-b^2} (a^k - b^{2k}) = n^{\log_b a} \cdot T(1) + \frac{b^2 c}{a-b^2} (n^{\log_b a} - n^2) \\ & \Longrightarrow T(n) = \Theta(n^{\log_b a}) \end{align*} $$
保持$b=2$、$f(n) = \Theta(n^2)$,只要$a < 8$就可以改进时间复杂度
$$ \begin{align*} \quad T(n) & = \begin{cases} 1, & n = 1 \\ \class{blue}{a} \cdot T(n/\class{blue}{b}) + \class{blue}{c n^2}, & n > 1 \end{cases} = n^{\log_b a} \cdot T(1) + \frac{b^2 c}{a-b^2} (n^{\log_b a} - n^2) \end{align*} $$
保持$b=2$、$f(n) = \Theta(n^2)$,只要$a < 8$就可以改进时间复杂度
Strassen 乘法:通过多做小矩阵的加法,少做小矩阵的乘法
直观例子
$2$阶矩阵相乘,标准的矩阵乘法需要做$8$次乘法、$4$次加法
$$ \begin{align*} \quad \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} \end{align*} $$
下面给出只做$7$次乘法的实现方法
$$ \begin{align*} \quad \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} = \underbrace{\begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix}}_{\triangleq ~ \widetilde{\Av}} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} = \widetilde{\Av} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \end{align*} $$
$$ \begin{align*} \quad \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \triangleq \widetilde{\Av} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \end{align*} $$
假设$\widetilde{\Av} \in \Rbb^{4 \times 4}$可以分解成$m$个秩$1$矩阵 (列向量乘行向量) 的和
$$ \begin{align*} \quad \widetilde{\Av} = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} = \sum_{i=1}^m r_i \pv_i \qv_i^\top = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} \begin{bmatrix} q_{i1} \\ q_{i2} \\ q_{i3} \\ q_{i4} \end{bmatrix}^\top \end{align*} $$
且满足
代入$\widetilde{\Av}$的形式有
$$ \begin{align*} \quad & \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} = \widetilde{\Av} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} \begin{bmatrix} q_{i1} \\ q_{i2} \\ q_{i3} \\ q_{i4} \end{bmatrix}^\top \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \\ & \qquad = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} s_i = \sum_{i=1}^m t_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} = \begin{bmatrix} p_{11} t_1 + \cdots + p_{m1} t_m \\ p_{12} t_1 + \cdots + p_{m2} t_m \\ p_{13} t_1 + \cdots + p_{m3} t_m \\ p_{14} t_1 + \cdots + p_{m4} t_m \end{bmatrix} \end{align*} $$
关键:如何将$\widetilde{\Av}$可以分解成$m$个秩$1$矩阵的和,其中$m < 8$
$$ \begin{align*} \quad \widetilde{\Av} = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} = \sum_{i=1}^m r_i \pv_i \qv_i^\top = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} \begin{bmatrix} q_{i1} \\ q_{i2} \\ q_{i3} \\ q_{i4} \end{bmatrix}^\top \end{align*} $$
且满足
首先去掉左上的$a_{11}$和右下的$a_{22}$
$$ \begin{align*} \widetilde{\Av} & = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} = \begin{bmatrix} a_{11} & 0 & a_{11} & 0 \\ 0 & 0 & 0 & 0 \\ a_{11} & 0 & a_{11} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & a_{22} & 0 & a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & a_{22} & 0 & a_{22} \end{bmatrix} \\ & \qquad + \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & a_{11} - a_{22} & 0 & a_{12} - a_{22} \\ a_{21} - a_{11} & 0 & a_{22} - a_{11} & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = a_{11} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}^\top + a_{22} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}^\top + \widetilde{\Av}' \end{align*} $$
再去掉中间的$a_{11} - a_{22}$
$$ \begin{align*} & \quad \widetilde{\Av}' = \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & a_{11} - a_{22} & 0 & a_{12} - a_{22} \\ a_{21} - a_{11} & 0 & a_{22} - a_{11} & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & a_{11} - a_{22} & a_{11} - a_{22} & 0 \\ 0 & a_{22} - a_{11} & a_{22} - a_{11} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & a_{22} - a_{11} & a_{12} - a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\ & + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ a_{21} - a_{11} & a_{11} - a_{22} & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} = (a_{11} - a_{22}) \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}^\top + \widetilde{\Av}'' + \widetilde{\Av}''' \end{align*} $$
最后分解$\widetilde{\Av}''$和$\widetilde{\Av}'''$
$$ \begin{align*} \quad \widetilde{\Av}'' & = \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & a_{22} - a_{11} & a_{12} - a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\ & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & a_{22} - a_{12} & a_{12} - a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\ & = (a_{12} - a_{22}) \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}^\top + (a_{11} - a_{12}) \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}^\top \end{align*} $$
最后分解$\widetilde{\Av}''$和$\widetilde{\Av}'''$
$$ \begin{align*} \quad \widetilde{\Av}''' & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ a_{21} - a_{11} & a_{11} - a_{22} & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ a_{21} - a_{11} & a_{11} - a_{21} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = (a_{11} - a_{21}) \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top + (a_{21} - a_{22}) \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top \end{align*} $$
$$ \begin{align*} \quad \widetilde{\Av} & = \underbrace{a_{11}}_{r_1} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}^\top + \underbrace{a_{22}}_{r_2} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}^\top + \underbrace{(a_{11} - a_{22})}_{r_3} \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}^\top \\ & ~ + \underbrace{(a_{12} - a_{22})}_{r_4} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}^\top + \underbrace{(a_{11} - a_{12})}_{r_5} \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}^\top \\ & ~ + \underbrace{(a_{11} - a_{21})}_{r_6} \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top + \underbrace{(a_{21} - a_{22})}_{r_7} \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top = \sum_{i=1}^7 r_i \pv_i \qv_i^\top \end{align*} $$
根据上面的分解
$$ \begin{align*} \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \\ s_6 \\ s_7 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & -1 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} = \begin{bmatrix} b_{11} + b_{21} \\ b_{12} + b_{22} \\ b_{12} + b_{21} \\ b_{22} - b_{21} \\ -b_{21} \\ b_{12} - b_{11} \\ b_{12} \end{bmatrix}, ~ \begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ r_5 \\ r_6 \\ r_7 \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{22} \\ a_{11} - a_{22} \\ a_{12} - a_{22} \\ a_{11} - a_{12} \\ a_{11} - a_{21} \\ a_{21} - a_{22} \end{bmatrix} \end{align*} $$
最后计算
$$ \begin{align*} \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} & = t_1 \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + t_2 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} + t_3 \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + t_4 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t_5 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} \\ & \quad + t_6 \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} + t_7 \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} t_1 + t_5 \\ t_2 + t_3 + t_4 + t_5 \\ t_1 - t_3 + t_6 + t_7 \\ t_2 + t_7 \end{bmatrix} \end{align*} $$
共会产生$8$次加减运算,一共$18$次加减运算
Strassen 矩阵乘法:$8$次乘法、$4$次加法 => $7$次乘法、$18$次加法
def mul_rec_strassen(A, B, C, n): if n == 1: C[0, 0] = A[0, 0] * B[0, 0] else: mid = int(n / 2) # 四个子矩阵的索引 block_11 = slice(None, mid), slice(None, mid) block_12 = slice(None, mid), slice(mid, None) block_21 = slice(mid, None), slice(None, mid) block_22 = slice(mid, None), slice(mid, None) # 10次加减计算中间变量 R1 = A[block_11] R2 = A[block_22] R3 = A[block_11] - A[block_22] R4 = A[block_12] - A[block_22] R5 = A[block_11] - A[block_12] R6 = A[block_11] - A[block_21] R7 = A[block_21] - A[block_22] S1 = B[block_11] + B[block_21] S2 = B[block_12] + B[block_22] S3 = B[block_12] + B[block_21] S4 = B[block_22] - B[block_21] S5 = -B[block_21] S6 = B[block_12] - B[block_11] S7 = B[block_12] # 7个子问题 递归调用 T1 = np.empty((mid, mid)) mul_rec_strassen(R1, S1, T1, mid) T2 = np.empty((mid, mid)) mul_rec_strassen(R2, S2, T2, mid) T3 = np.empty((mid, mid)) mul_rec_strassen(R3, S3, T3, mid) T4 = np.empty((mid, mid)) mul_rec_strassen(R4, S4, T4, mid) T5 = np.empty((mid, mid)) mul_rec_strassen(R5, S5, T5, mid) T6 = np.empty((mid, mid)) mul_rec_strassen(R6, S6, T6, mid) T7 = np.empty((mid, mid)) mul_rec_strassen(R7, S7, T7, mid) # 8次加减计算C C[block_11] = T1 + T5 C[block_12] = T2 + T3 + T4 + T5 C[block_21] = T1 - T3 + T6 + T7 C[block_22] = T2 + T7
$$ \begin{align*} \quad T(n) & = \begin{cases} 1, & n = 1 \\ \class{blue}{a} \cdot T(n/\class{blue}{b}) + \class{blue}{c n^2}, & n > 1 \end{cases} = n^{\log_b a} \cdot T(1) + \frac{b^2 c}{a-b^2} (n^{\log_b a} - n^2) \end{align*} $$
Strassen 矩阵乘法:$b=2$、$a = 7$、$T(n) = \Theta(n^{\lg 7})$
自 1969 年 V. Strassen 提出上述方法后,后续改进沿用其思路
时隔 9 年,1978 年 V. Pan 发现
2014 年最好结果$T(n) \approx \Theta(n^{2.3728639})$
2021 年最好结果$T(n) \approx \Theta(n^{2.3728596})$,7 年改进了$0.0000043$
三种常用的求解方法
分两步:
如何猜测?
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor) + n, & n > 1 \end{cases}$
这个形式仅比归并排序的递推式多了个向下取整
当$n$很大时取整的影响微乎其微,有理由猜测$T(n) = O(n \lg n)$
下面用数学归纳法证明$T(n) \le c n \lg n$,其中$c$是待定正常数
假设该上界对任意$m < n$都成立,特别的对$\lfloor n/2 \rfloor$也成立
$$ \begin{align*} \quad T(n) & = 2 \cdot T(\lfloor n/2 \rfloor) + n \le 2 c \lfloor n/2 \rfloor \lg \lfloor n/2 \rfloor + n \quad \leftarrow \text{归纳假设} \\ & \le 2 c (n/2) \lg (n/2) + n = cn \lg n - (c-1)n \\ & \le cn \lg n \quad \leftarrow \text{如果} c \ge 1 \end{align*} $$
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor) + n, & n > 1 \end{cases}$,求证$T(n) \le c n \lg n$
最后考虑边界条件,当$n=1$时,则$T(1) = 1 \not \le c \lg 1 = 0$
将边界条件替换为$T(2)$、$T(3)$
取$c=2$可以使上界对$n=2,3$成立
综上:$T(n) \le 2 n \lg n$对任意$n \ge 2$成立,即$T(n) = O(n \lg n)$
之后不再讨论边界条件的证明细节,取充分大的$c$即可
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor + 17) + n, & n > 1 \end{cases}$
这个形式仅比前一个例子多了个$+17$
当$n$很大时$+17$的影响微乎其微,继续猜测$T(n) \le c n \lg n$
$$ \begin{align*} \quad T(n) & = 2 \cdot T(\lfloor n/2 \rfloor + 17) + n \\ & \le 2 c (\lfloor n/2 \rfloor + 17) \lg (\lfloor n/2 \rfloor + 17) + n \\ & \le 2 c (n/2 + 17) \lg (n/2 + 17) + n \\ & = c (n+34) (\lg (n+34) - 1) + n \end{align*} $$
出现了$\lg (n+34)$,没法推导下去了
修改猜测为$T(n) \le c (n-t) \lg (n-t) = O(n \lg n)$,用$t$把$+34$消掉
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor + 17) + n, & n > 1 \end{cases}$
猜测:$T(n) \le c (n-t) \lg (n-t) = O(n \lg n)$
$$ \begin{align*} \quad T(n) & = 2 \cdot T (\lfloor n/2 \rfloor + 17) + n \\ & \le 2 c (\lfloor n/2 \rfloor + 17 - t) \lg (\lfloor n/2 \rfloor + 17 - t) + n \quad \leftarrow \text{归纳假设} \\ & \le c (n + 34 - 2t) (\lg (n + 34 - 2t) - 1) + n \\ & = c (n + 34 - 2t) \lg (n + 34 - 2t) - ((c-1)n - 34c + 2ct) \\ & \le c (n + 34 - 2t) \lg (n + 34 - 2t) \end{align*} $$
令$34 - 2t = -t$得$t = 34$,最后一个不等号成立只需$c \ge 1$
综上,$T(n) \le c (n-34) \lg (n-34) = O(n \lg n)$
向上取整也可以忽略
猜出了正确的渐进界,却卡在了归纳证明
假设不够强,减去一个低阶项
例:$T(n) = \begin{cases} 1, & n = 1 \\ T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1, & n > 1 \end{cases}$
设$T(n) \le cn$,于是
$$ \begin{align*} \quad T(n) & = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1 \\ & \le c \lfloor n/2 \rfloor + c \lceil n/2 \rceil + 1 = cn + 1 \not \le cn \end{align*} $$
例:$T(n) = \begin{cases} 1, & n = 1 \\ T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1, & n > 1 \end{cases}$
设$T(n) \le cn - d$,其中$d$是待定常数,于是
$$ \begin{align*} \quad T(n) & = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1 \\ & \le c \lfloor n/2 \rfloor - d + c \lceil n/2 \rceil - d + 1 = cn - 2d + 1 \le cn - d \end{align*} $$
最后一个不等号成立只需$d \ge 1$
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(n-1) + 1, & n > 1 \end{cases}$
忽略的最后$+1$,有理由猜测$T(n) = O(2^n)$,设$T(n) \le c \cdot 2^n$
$T(n) = 2 \cdot T(n-1) + 1 \le 2 c \cdot 2^{n-1} + 1 = c \cdot 2^n + 1 \not \le c \cdot 2^n$
设$T(n) \le c \cdot 2^n - d$,则
$$ \begin{align*} \quad T(n) & = 2 \cdot T(n-1) + 1 \\ & \le 2 (c \cdot 2^{n-1} - d) + 1 \\ & = c \cdot 2^n - 2 d + 1 \\ & \le c \cdot 2^n - d \end{align*} $$
最后一个不等号成立只需$d \ge 1$
例:$T(n) = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n$
先令$m = \lg n$即$n = 2^m$去掉$\lg n$,则$T(2^m) = 2 \cdot T (\lfloor \sqrt{2^m} \rfloor) + m$
假设$\sqrt{2^m}$是整数,则$T(2^m) = 2 \cdot T (2^{m/2}) + m$
最后令$S(m) = T(2^m)$,则$S(m) = 2 \cdot S(m/2) + m = \Theta(m \lg m)$
回代可得$T(n) = \Theta( \lg n \cdot \lg \lg n )$
例:$T(n) = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n$
先证$T(n) = O(\lg n \cdot \lg \lg n)$,设$T(n) \le c \cdot \lg n \cdot \lg \lg n$
$$ \begin{align*} \quad T(n) & = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n \\ & \le 2 c \cdot \lg \lfloor \sqrt{n} \rfloor \cdot \lg (\lg \lfloor \sqrt{n} \rfloor) + \lg n \\ & \le 2 c \cdot \lg \sqrt{n} \cdot \lg (\lg \sqrt{n}) + \lg n \\ & = c \cdot \lg n \cdot \lg (\lg n / 2) + \lg n \\ & = c \cdot \lg n \cdot (\lg \lg n - 1) + \lg n \\ & = c \cdot \lg n \cdot \lg \lg n - (c - 1) \lg n \\ & \le c \cdot \lg n \cdot \lg \lg n \end{align*} $$
最后一个不等号成立只需$c \ge 1$
例:$T(n) = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n$
再证$T(n) = \Omega (\lg n \cdot \lg \lg n)$,设$T(n) \ge c \cdot \lg (n+3) \cdot \lg \lg (n+3)$
$$ \begin{align*} \quad T(n) & = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n \\ & \ge 2 c \cdot \lg (\lfloor \sqrt{n} \rfloor + 3) \cdot \lg \lg (\lfloor \sqrt{n} \rfloor + 3) + \lg n \\ & \ge 2 c \cdot \lg (\sqrt{n}+2) \cdot \lg \lg (\sqrt{n}+2) + \lg n \\ & \ge 2 c \cdot \lg \sqrt{n+3} \cdot \lg \lg \sqrt{n+3} + \lg n \\ & = c \cdot \lg (n+3) \cdot (\lg \lg (n+3) - 1) + \lg n \\ & = c \cdot \lg (n+3) \cdot \lg \lg (n+3) + \lg n - c \cdot \lg (n+3) \\ & \ge c \cdot \lg (n+3) \cdot \lg \lg (n+3) \end{align*} $$
当$c = 1/2$时,最后一个不等号对$\forall n \ge 3$成立
递归树可用来生成好的猜测
$T(n) = 3 \cdot T(n/4) + c n^2$的递归树如下
$$ \begin{align*} \quad T(n) = 3 \cdot T(n/4) + c n^2 \end{align*} $$
$$ \begin{align*} \quad T(n) = 3 \cdot T(n/4) + c n^2 \end{align*} $$
$$ \begin{align*} \quad T(n) & = n^{\log_4 3} \Theta(1) + cn^2 + \frac{3}{16} cn^2 + \cdots + \left( \frac{3}{16} \right)^{\log_4 n - 1} cn^2 \\ & < \Theta(n^{\log_4 3}) + \sum_{i=0}^\infty \left( \frac{3}{16} \right)^i cn^2 = O(n^2) \end{align*} $$
设$n = 4^k$,即$k = \log_4 n$,根据递推式有
$$ \begin{align*} \quad & \spadesuit \qquad T(4^k) = 3 \cdot T(4^{k-1}) + c \cdot 16^k \\ & \heartsuit \qquad T(4^{k-1}) = 3 \cdot T (4^{k-2}) + c \cdot 16^{k-1} \\ & \qquad \qquad \vdots \\ & \clubsuit \qquad T(4^1) = 3 \cdot T (4^0) + c \cdot 16^1 \end{align*} $$
令$\spadesuit + \heartsuit \cdot 3 + \cdots + \clubsuit \cdot 3^{k-1}$可得
$$ \begin{align*} \quad T(n) & = 3^k \cdot T(1) + c \cdot 16^k \left( 1 + \frac{3}{16} + \cdots + \frac{3^{k-1}}{16^{k-1}} \right) \\ & = \underbrace{n^{\log_4 3} \cdot T(1)}_{\text{叶结点}} + \underbrace{c n^2 \sum_{i=0}^{k-1} \left( \frac{3}{16} \right)^i}_{\text{内部结点}} \end{align*} $$
猜测$T(n) = 3 \cdot T(n/4) + c n^2 = O(n^2)$
用代入法证明,设$T(n) \le d n^2$,则
$$ \begin{align*} \quad T(n) & = 3 \cdot T(n/4) + c n^2 \\ & \le 3 \cdot d (n/4)^2 + c n^2 \\ & = (3d/16 + c) n^2 \\ & \le d n^2 \end{align*} $$
最后一个不等号成立只需$d \ge (16/13) c$
又$T(n) \ge c n^2 = \Omega(n^2)$的代价,因此$T(n) = \Theta(n^2)$
例:$T(n) = T(n/3) + T(2n/3) + c n$
例:$T(n) = T(n/3) + T(2n/3) + c n$,猜测$T(n) = \Theta(n \lg n)$
注意
$$ \begin{align*} \quad \frac{dn}{3} \lg \frac{n}{3} + \frac{2dn}{3} \lg \frac{2n}{3} & = d n \lg n + \frac{dn}{3} \lg \frac{1}{3} + \frac{2dn}{3} \lg \frac{2}{3} \\ & = d n \lg n - \left( \lg 3 - \frac{2}{3} \right) dn \end{align*} $$
欲证$d_1 n \lg n \le T(n) \le d_2 n \lg n$,只需取$d_1$、$d_2$满足
$$ \begin{align*} \quad T(n) & \le d_2 n \lg n - \left( \lg 3 - \frac{2}{3} \right) d_2 n + cn \le d_2 n \lg n \\ T(n) & \ge d_1 n \lg n - \left( \lg 3 - \frac{2}{3} \right) d_1 n + cn \ge d_1 n \lg n \end{align*} $$
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$,
三种情形都在比较$f(n)$和$n^{\log_b a}$
两点说明:
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$,
例:
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$,
例:
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$,
例:
设$n = b^t$(不等时需证明$T(\lfloor n/b \rfloor)$和$T(\lceil n/b \rceil)$不影响结果),则
$$ \begin{align*} \quad & \spadesuit \qquad T(n/b^0) = a \cdot T(n/b^1) + f(n/b^0) \\ & \heartsuit \qquad T(n/b^1) = a \cdot T(n/b^2) + f(n/b^1) \\ & \qquad \qquad \vdots \\ & \clubsuit \qquad T(n/b^{t-1}) = a \cdot T(n/b^t) + f(n/b^{t-1}) \end{align*} $$
令$\spadesuit + \heartsuit \cdot a + \cdots + \clubsuit \cdot a^{t-1}$可得
$$ \begin{align*} \quad T(n) & = a^t \cdot T(1) + \sum_{i=0}^{t-1} a^i f(n/b^i) = \Theta(n^{\log_b a}) + \sum_{i=0}^{t-1} a^i f(n/b^i) \end{align*} $$
需要比较$n^{\log_b a}$和$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$
以$f(n) = n^d$为例,此时$g(n)$为等比数列:
$$ \begin{align*} \quad g(n) = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^{t-1} a^i \left( \frac{n}{b^i} \right)^d = n^d \sum_{i=0}^{t-1} \left( \frac{a}{b^d} \right)^i \end{align*} $$
需要比较$n^{\log_b a}$和$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$,其中$a>0$、$b>1$
情形 1:若$\exists \epsilon > 0$使得$f(n) = O(n^{\log_b a - \epsilon})$,则
$$ \begin{align*} \quad g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^{t-1} a^i O \left( \left( \frac{n}{b^i} \right)^{\log_b a - \epsilon} \right) \\ & = O \left( \sum_{i=0}^{t-1} a^i \left( \frac{n}{b^i} \right)^{\log_b a - \epsilon} \right) = O \left( \sum_{i=0}^{t-1} a^i \frac{n^{\log_b a - \epsilon}}{a^i b^{-\epsilon i}} \right) \\ & = O \left( n^{\log_b a - \epsilon} \sum_{i=0}^{t-1} b^{\epsilon i} \right) = O \left( n^{\log_b a - \epsilon} \frac{1 - b^{\epsilon t}}{1 - b^\epsilon} \right) \overset{n ~ = ~ b^t}{=} O (n^{\log_b a}) \end{align*} $$
因此$n^{\log_b a}$主导最终的界
需要比较$n^{\log_b a}$和$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$,其中$a>0$、$b>1$
情形 2:若$\exists k \ge 0$使得$f(n) = \Theta(n^{\log_b a} \lg^k n)$,则
$$ \begin{align*} \quad g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^{t-1} a^i \Theta \left( \left( \frac{n}{b^i} \right)^{\log_b a} \lg^k \frac{n}{b^i} \right) \\ & = \Theta \left( \sum_{i=0}^{t-1} a^i \frac{n^{\log_b a}}{a^i} \log_b^k \left( \frac{n}{b^i} \right) \bigg/ \log_b^k 2 \right) = \Theta \left( n^{\log_b a} \sum_{i=0}^{t-1} (t - i)^k \right) \\ & = \Theta \left( n^{\log_b a} \sum_{i=1}^t i^k \right) \end{align*} $$
$\sum_{i=1}^t i^k \le \sum_{i=1}^t t^k = t^{k+1} \Longrightarrow \sum_{i=1}^t i^k = O(\log_b^{k+1} n) = O(\lg^{k+1} n)$
最后只需证$\sum_{i=1}^t i^k = \Omega(\lg^{k+1} n)$
需要比较$n^{\log_b a}$和$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$,其中$a>0$、$b>1$
情形 2:若$\exists k \ge 0$使得$f(n) = \Theta(n^{\log_b a} \lg^k n)$,则
$$ \begin{align*} \quad g(n) = \Theta \left( n^{\log_b a} \sum_{i=1}^t i^k \right) \end{align*} $$
已证$\sum_{i=1}^t i^k = O(\lg^{k+1} n)$,最后只需证$\sum_{i=1}^t i^k = \Omega(\lg^{k+1} n)$
考虑函数$h(x) = x^k$的黎曼积分,易知$\sum_{i=1}^t i^k$是
$$ \begin{align*} \quad \int_0^t h(x) \diff x = \left. \frac{1}{k+1} x^{k+1} \right|_0^t = \frac{1}{k+1} t^{k+1} = \frac{1}{k+1} \log_b^{k+1} n \end{align*} $$
的上方和,即$\sum_{i=1}^t i^k = \Omega(\lg^{k+1} n)$
需要比较$n^{\log_b a}$和$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$,其中$a>0$、$b>1$
情形 3:若$\exists \epsilon > 0$使得$f(n) = \Omega(n^{\log_b a + \epsilon})$且$f$满足正则条件:对$c < 1$和充分大的$x$有$a f(x/b) \le c f(x)$
设$a f(x/b) \le c f(x)$对$\forall x \ge b^{t-q} = n / b^q$都成立,则
$$ \begin{align*} \quad g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^q a^i f \left( \frac{n}{b^i} \right) + \sum_{i=q+1}^{t-1} a^i f \left( \frac{n}{b^i} \right) \\ & \le \sum_{i=0}^q c^i f(n) + \Theta(1) \le f(n) \sum_{i=0}^\infty c^i + \Theta(1) = O(f(n)) \end{align*} $$
由于$f(n)$是$g(n)$求和式中的一项,因此$g(n) = \Omega(f(n))$是显然的
例:$f(n) = n^{\log_b a} / \lg n$,此时主定理的情况 1 不适用
$$ \begin{align*} \quad g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^{t-1} a^i \left( \frac{n}{b^i} \right)^{\log_b a} \bigg/ \lg \frac{n}{b^i} = \sum_{i=0}^{t-1} a^i \frac{n^{\log_b a}}{a^i} \bigg/ \lg \frac{n}{b^i} \\ & = n^{\log_b a} \sum_{i=0}^{t-1} \log_b 2 \bigg/ \log_b \frac{n}{b^i} \overset{n~=~b^t}{=} n^{\log_b a} \log_b 2 \sum_{i=0}^{t-1} \frac{1}{t-i} \\ & = n^{\log_b a} \log_b 2 \sum_{i=1}^t \frac{1}{i} \end{align*} $$
考虑函数$h(x) = 1/x$的黎曼积分,易知
$$ \begin{align*} \quad \ln (t+1) \le \sum_{i=1}^t \frac{1}{i} \le \ln t + 1 \Longrightarrow \sum_{i=1}^t \frac{1}{i} = \Theta(\ln t) = \Theta(\lg \lg n) \end{align*} $$
故$g(n) = \Theta (n^{\log_b a} \lg \lg n)$占主导,从而$T(n) = \Theta (n^{\log_b a} \lg \lg n)$
Akra-Bazzi 法可以处理子问题非平衡的情形
$$ \begin{align*} \quad T(n) = f(n) + \sum_{i=1}^k a_i \cdot T(n/b_i) \end{align*} $$
其中$k \in \Zbb^+$、$a_1, \ldots, a_k > 0$、$b_1, \ldots, b_k > 1$
设实数$p$满足$\sum_{i=1}^k a_i / b_i^p = 1$,则
$$ \begin{align*} \quad T(n) = \Theta \left( n^p \left( 1 + \int_1^n \frac{f(x)}{x^{p+1}} \diff x \right) \right) \end{align*} $$
注意$\sum_{i=1}^k a_i / b_i^p \rightarrow \begin{cases} 0, & p \rightarrow \infty \\ \infty, & p \rightarrow -\infty \end{cases}$,满足要求的$p$总是存在的
设实数$p$满足$\sum_{i=1}^k a_i / b_i^p = 1$,则
$$ \begin{align*} \quad T(n) = \Theta \left( n^p \left( 1 + \int_1^n \frac{f(x)}{x^{p+1}} \diff x \right) \right) \end{align*} $$
例:$T(n) = T(n/3) + T(2n/3) + c n$
$a_1 = a_2 = 1$、$b_1 = 3$、$b_2 = 3/2$,$\sum_{i=1}^2 a_i / b_i = 1$,即$p = 1$
$$ \begin{align*} \quad T(n) & = \Theta \left( n \left( 1 + \int_1^n \frac{cx}{x^2} \diff x \right) \right) = \Theta \left( n \left( 1 + c \int_1^n \frac{1}{x} \diff x \right) \right) \\[2px] & = \Theta ( n ( 1 + c \ln x |_1^n ) ) \\[2px] & = \Theta ( n ( 1 + c \ln n ) ) \\[2px] & = \Theta ( n \ln n ) \end{align*} $$
设实数$p$满足$\sum_{i=1}^k a_i / b_i^p = 1$,则
$$ \begin{align*} \quad T(n) = \Theta \left( n^p \left( 1 + \int_1^n \frac{f(x)}{x^{p+1}} \diff x \right) \right) \end{align*} $$
主定理不适用例:$T(n) = 2 \cdot T(n/2) + n / \lg n$,$a = b = 2$
$n^{\log_b a} = n$,$n / \lg n \ne O(n^{1 - \epsilon})$,Akra-Bazzi 法的$p = 1$
$$ \begin{align*} \quad T(n) & = \Theta \left( n \left( 1 + \int_1^n \frac{x}{x^2 \lg x} \diff x \right) \right) = \Theta \left( n \left( 1 + \int_1^n \frac{1}{x \lg x} \diff x \right) \right) \\[2px] & = \Theta \left( n \left( 1 + \int_1^n \frac{\ln 2}{\ln x} \diff \ln x \right) \right) = \Theta ( n ( 1 + \ln \ln n ) ) \\[2px] & = \Theta ( n \lg \lg n ) \end{align*} $$
算法导论 3rd
2-4、4.3-2、4.3-9、4.4-6、4.5-1、4.5-4