以二等分为例
分治由如下三个模块组成
分治的时间分析
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ a \cdot T(n/b) + f(n), & n > 1 \end{cases} \end{align}
递推关系的求解方法:代入法、主方法
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ a \cdot T(n/b) + f(n), & n > 1 \end{cases} \end{align}
假设$f(n) = c \cdot n^d$、$n = b^t$,根据递推关系有
\begin{align} \begin{array}{lrclcl} (\spadesuit) & T(b^t) & = & a \cdot T (b^{t-1}) & + & c \cdot b^{td} \\ (\heartsuit) & T(b^{t-1}) & = & a \cdot T (b^{t-2}) & + & c \cdot b^{(t-1)d} \\ & & \vdots & \\ (\clubsuit) & T(b^1) & = & a \cdot T (1) & + & c \cdot b^d \end{array} \end{align}
令$(\spadesuit) + (\heartsuit) \cdot a + \cdots + (\clubsuit) \cdot a^{t-1}$可得
\begin{align} T(n) = a^t \cdot T(1) + c \cdot b^{td} \left( 1 + \frac{a}{b^d} + \left( \frac{a}{b^d} \right)^2 + \cdots + \left( \frac{a}{b^d} \right)^{t-1} \right) \end{align}
其中括号中是公比为$a / b^d$的等比数列求和
\begin{align} T(n) = a^t \cdot T(1) + c \cdot b^{td} \left( 1 + \frac{a}{b^d} + \left( \frac{a}{b^d} \right)^2 + \cdots + \left( \frac{a}{b^d} \right)^{t-1} \right) \end{align}
若公比为$1$,即$a = b^d$,则$T(n) = a^t \cdot T(1) + c \cdot b^{td} \cdot t$
若公比不为$1$,即$a \ne b^d$,则
\begin{align} T(n) = a^t \cdot T(1) + c \cdot b^{td} \cdot \frac{1 - a^t/b^{td}}{1 - a/b^d} = a^t \cdot T(1) + c \cdot \frac{b^{td} - a^t}{1 - a/b^d} \end{align}
注意$n = b^t$,即$t = \log_b n$、$a^t = (b^{\log_b a})^t = (b^t)^{\log_b a} = n^{\log_b a}$,回代有
\begin{align} (\diamondsuit) \quad T(n) = \begin{cases} n^{\log_b a} \cdot T(1) + c \cdot n^{\log_b a} \cdot \log_b n, & 若a = b^d \\ n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d), & 若a \ne b^d \end{cases} \end{align}
本讲后面会经常用到式$(\diamondsuit)$
def merge_sort(a, low, high): if low < high: mid = int((low + high) / 2) # 中间点 merge_sort(a, low, mid) merge_sort(a, mid + 1, high) merge(a, low, mid, high)
def merge_sort(a, low, high): if low < high: mid = int((low + high) / 2) # 中间点 merge_sort(a, low, mid) merge_sort(a, mid + 1, high) merge(a, low, mid, high)
合并:取两个子数组的最小元素做比较,并将小者取出
def merge(a, low, mid, high): # 子数组的长度 n1, n2 = mid - low + 1, high - mid # 创建临时数组 L, R = [0] * n1, [0] * n2 # 拷贝数据到临时数组L for i in range(n1): L[i] = a[low + i] # 拷贝数据到临时数组R for j in range(n2): R[j] = a[mid + 1 + j] i, j, k = 0, 0, low # 归并L和R到a[low,...,high] while i < n1 and j < n2: if L[i] <= R[j]: a[k] = L[i] i += 1 else: a[k] = R[j] j += 1 k += 1 # 拷贝L的剩余元素 while i < n1: a[k] = L[i] i += 1 k += 1 # 拷贝R的剩余元素 while j < n2: a[k] = R[j] j += 1 k += 1
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T (n/2) + f(n), & n > 1 \end{cases} \end{align}
其中$f(n) = \Theta(n)$为将两个长度为$n/2$的有序数组合并的时间
代入式$(\diamondsuit)$,注意$a = 2$、$b = 2$、$d = 1$,故$a = b^d$、$\log_b a = 1$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot n^{\log_b a} \cdot \log_b n \\ & = n \cdot T(1) + c \cdot n \cdot \log_2 n \\ & = \Theta(n \lg n) \end{align}
式$(\diamondsuit)$假设了$n = b^t$,即$n$是$2$的幂次
若假设不成立,考虑大于$n$的最小的$2$的幂次$m$,即$n < m < 2n$
向数组末尾添加$m - n$个$\infty$,再对其进行排序
由于$n \lg n < m \lg m < 2n \lg (2n) = 2n \lg n + 2n < 4n \lg n$
因此$m \lg m = \Theta (n \lg n)$
从而$T(m) = \Theta(m \lg m) = \Theta(\Theta (n \lg n)) = \Theta (n \lg n)$
输入:长度为$n$的数组$a$
输出:$a$中逆序对数目
推荐系统中的协同过滤:甲在电影网站上列出了自己最喜爱电影 Top 10,网站如何根据其他用户的电影喜爱列表向甲推荐好友?
构造数组$a[i]$:甲最喜欢的第$i$部电影在乙的列表中的位置
暴力求解:二重 for 循环遍历$(i,j)$,$T(n) = \Omega(n^2)$
设当前要统计子数组$a[low, \ldots, high]$的逆序对数目
问题:跨越中点的逆序对可能本身就有$\Theta(n^2)$个,若一个个计数,时间复杂度至少为$\Omega(n^2)$
加速关键:一次计数多个逆序对
设$(a[i], a[j])$是跨越中点的逆序对,若$a[i] \le a[i+1] \le \cdots \le a[mid]$,即升序排列,则$(a[i+1], a[j]), \ldots, (a[mid], a[j])$都是跨越中点的逆序对
归并排序也是将数组二等分,$a[low, \ldots, mid]$的升序可由归并排序实现
归并排序的“合”步也是左半、右半子数组各取一个元素比大小,逆序对的计数可以融入“合”步
时间复杂度与归并排序相同
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T (n/2) + \Theta(n), & n > 1 \end{cases} \Longrightarrow T(n) = \Theta(n \lg n) \end{align}
def count_cross_inv(a, low, mid, high): n1, n2 = mid - low + 1, high - mid # 子数组的长度 L, R = [0] * n1, [0] * n2 # 创建临时数组 for i in range(0, n1): # 拷贝数据到临时数组L L[i] = a[low + i] for j in range(0, n2): # 拷贝数据到临时数组R R[j] = a[mid + 1 + j] count = 0 i, j, k = 0, 0, low while i < n1 and j < n2: # 归并L和R到a[low,...,high] if L[i] <= R[j]: # 跨越中点的顺序对 a[k] = L[i] i += 1 else: # 跨越中点的逆序对 a[k] = R[j] j += 1 count += (n1 - i) # L[i], ..., L[n1-1]均与R[j]构成逆序对 k += 1 while i < n1: # 拷贝L的剩余元素 a[k] = L[i] i += 1 k += 1 while j < n2: # 拷贝R的剩余元素 a[k] = R[j] j += 1 k += 1 return count def sort_and_count(a, low, high): if low < high: mid = int((low + high) / 2) # 取中间点分开 left_inv = sort_and_count(a, low, mid) right_inv = sort_and_count(a, mid + 1, high) cross_inv = count_cross_inv(a, low, mid, high) return left_inv + right_inv + cross_inv else: return 0
def quick_sort(a, low, high): if low < high: m = partition(a, low, high) quick_sort(a, low, m - 1) quick_sort(a, m + 1, high)
def quick_sort(a, low, high): if low < high: m = partition(a, low, high) quick_sort(a, low, m - 1) quick_sort(a, m + 1, high)
def partition(a, low, high): # 最右元素作为主元 pivot = a[high] # 小于主元的元素的存放位置 初始为最左 i = low # low -> high-1 遍历其它元素 for j in range(low, high): if a[j] <= pivot: # 小于主元的元素放到主元左边 a[i], a[j] = a[j], a[i] i += 1 # 存放位置右移一位 # 所有小于主元的元素已位于主元左边 # 当前的i就是主元应该放的位置 # 当前的a[i]大于主元 a[i], a[high] = a[high], a[i] return i
递推关系
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ T(n) = T(k) + T(n-1-k) + \Theta(n), & n > 1 \end{cases} \end{align}
如何改进?
某股票 17 天内的价格,哪天买进、哪天卖出,收益最大?
| 天 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 价格 | 100 | 113 | 110 | 85 | 105 | 102 | 86 | 63 | 81 | 101 | 94 | 106 | 101 | 79 | 94 | 90 | 97 |
| 变化 | | 13 | -3 | -25 | 20 | -3 | -16 | -23 | 18 | 20 | -7 | 12 | -5 | -22 | 15 | -4 | 7 |
收益 = 后一天的价格 - 前一天的价格,即“变化”数组中的值
第$i$天买进、第$j$天卖出,$总收益 = 变化[i+1] + \cdots + 变化[j]$
问题转化为寻找“变化”数组的和最大的连续子数组
暴力求解:二重 for 循环遍历买进、卖出的日期,$T(n) = \Omega(n^2)$
设当前要寻找子数组$A[low, \ldots, high]$的最大子数组
分:$A[low, \ldots, high] = A[low, \ldots, mid] + A[mid+1, \ldots, high]$
治:递归求$A[low, \ldots, mid]$和$A[mid+1, \ldots, high]$的最大子数组
合:处理第 3 种情况,与前 2 种情况的最大子数组比较取最大
def find_max_subarray(low, high): # 返回最大子数组的起始索引、结束索引、和 if low == high: return low, high, A[low] mid = int((low + high) / 2) # 递归求解左半数组 l_low, l_high, l_sum = find_max_subarray(low, mid) # 递归求解右半数组 r_low, r_high, r_sum = find_max_subarray(mid + 1, high) # 处理跨越中点的情况 c_low, c_high, c_sum = find_max_cross_subarray(low, mid, high) # 比较三种情况: if l_sum > r_sum and l_sum > c_sum: return l_low, l_high, l_sum elif l_sum < c_sum and r_sum < c_sum: return c_low, c_high, c_sum else: return r_low, r_high, r_sum
数组分为$A[i, \ldots, mid]$、$A[mid+1, \ldots, j]$两部分
def find_max_cross_subarray(low, mid, high): l_sum = r_sum = -float("inf") sum = 0 # 找结束于mid的最大子数组的开始位置 for i in range(mid, low - 1, -1): # 遍历 mid -> low sum += A[i] if sum > l_sum: l_sum, l_index = sum, i sum = 0 # 找开始于mid+1的最大子数组的结束位置 for i in range(mid + 1, high + 1): # 遍历 mid+1 -> high sum += A[i] if sum > r_sum: r_sum, r_index = sum, i return l_index, r_index, l_sum + r_sum
def find_max_cross_subarray(low, mid, high): l_sum = r_sum = -float("inf") sum = 0 # 找结束于mid的最大子数组的开始位置 for i in range(mid, low - 1, -1): # 遍历 mid -> low sum += A[i] if sum > l_sum: l_sum, l_index = sum, i sum = 0 # 找开始于mid+1的最大子数组的结束位置 for i in range(mid + 1, high + 1): # 遍历 mid+1 -> high sum += A[i] if sum > r_sum: r_sum, r_index = sum, i return l_index, r_index, l_sum + r_sum
处理跨越中点的情况只需一重 for 循环,$f(n) = \Theta(n)$
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T (n/2) + \Theta(n), & n > 1 \end{cases} \Longrightarrow T(n) = \Theta(n \lg n) \end{align}
输入:$\rb^2$上的$n$个点$\sc = \{ (x_1, y_1), \ldots, (x_n, y_n) \}$
输出:最近点对的距离$d = \min_{i,j} \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
暴力求解:二重 for 循环遍历所有点对,$T(n) = \Omega(n^2)$
输入:$\rb^2$上的$n$个点$\sc = \{ (x_1, y_1), \ldots, (x_n, y_n) \}$
输出:最近点对的距离$d = \min_{i,j} \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
分:设$x_1, \ldots, x_n$的中位数为$m$,在$m$处作垂线将$\sc$均分
治:递归求左半子集和右半子集的最近点对
合:处理第 3 种情况,与前 2 种情况的最近点对比较取最小
设$d_l$、$d_r$分别是递归求解出的左半、右半子集的最近点对距离
只需考虑中线左右两侧宽度为$\delta = \min \{ d_l, d_r \}$的带状区域
def dist(u, v): # 计算距离 return np.linalg.norm(u - v) def closest_pair(low, high, sort_y): if high - low <= 2: # 如果不超过3个点 不再递归 直接求最小距离 closest = float("inf") for i in range(low, high + 1): for j in range(i + 1, high + 1): closest = min(closest, dist(v[i], v[j])) return closest mid = int((low + high) / 2) xmid = (v[mid, 0] + v[mid + 1, 0]) / 2 # 中线位置 index = [i for i in sort_y if low <= i and i <= mid] # 左半元素按y升序后的索引 delta_left = closest_pair(low, mid, index) index = [i for i in sort_y if mid + 1 <= i and i <= high] # 右半元素按y升序后的索引 delta_right = closest_pair(mid + 1, high, index) delta = min(delta_left, delta_right) index = [] for i in sort_y: if v[i, 0] <= xmid + delta and v[i, 0] >= xmid - delta: # 中间带状区域中的点 index.append(i) closest = float("inf") l = len(index) for i in range(l): # 计算带状区域中的最近点对 for j in range(i + 1, min(i + 8, l)): # 最多考虑7个点 closest = min(closest, dist(v[index[i]], v[index[j]])) return min(closest, delta) np.random.seed(0) trial = 8 time_cost = np.zeros([2, trial]) for t in range(trial): n = 2**(t + 4) v = np.random.rand(n, 2) # 随机生成n个点 start = time.perf_counter() closest = float("inf") for i in range(n): for j in range(i + 1, n): closest = min(closest, dist(v[i], v[j])) # 暴力穷举 time_cost[0, t] = time.perf_counter() - start start = time.perf_counter() v = v[v[:, 0].argsort()] # 对v按x升序 sort_y = v[:, 1].argsort() # 对v按y升序 sort_y记录升序后元素在原来v中的索引 closest = closest_pair(0, n - 1, sort_y) time_cost[1, t] = time.perf_counter() - start with plt.style.context('Solarize_Light2'): x = [v for v in range(4, 4 + trial)] plt.plot(x, np.log2(time_cost[0, :]), label="BF") # 对数坐标轴 plt.plot(x, np.log2(time_cost[1, :]), label="DC") # 对数坐标轴 plt.xlabel('n: power of 2') plt.ylabel('time: power of 2') plt.legend() plt.grid(color='#93a1a1', linestyle='-.', linewidth=0.7) plt.savefig("closest-pair.svg", transparent=True, bbox_inches="tight")
输入:两个$n$位整数$x = X[0, \ldots, n-1]$和$y = Y[0, \ldots, n-1]$
输出:乘积$xy = z = Z[0, \ldots, 2n-1]$
\begin{align} \sum_{k=0}^{2n-1} Z[k] 10^k & = z = xy = \left( \sum_{i=0}^{n-1} X[i] 10^i \right) \left( \sum_{j=0}^{n-1} X[j] 10^j \right) \\ & = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} X[i] Y[j] 10^{i+j} = \sum_{k=0}^{2n-2} \underbrace{\class{blue}{\sum_{(i,j): i+j=k} X[i] Y[j]}}_{c_k} 10^k \end{align}
$Z[k]$和$c_k$之间的关系?
已知$\sum_{k=0}^{2n-1} Z[k] 10^k = \sum_{k=0}^{2n-2} c_k 10^k$,其中$c_k = \sum_{(i,j): i+j=k} X[i] Y[j]$是若干个一位正整数的乘积,而$Z[k]$是一位正整数,它们的关系?
$c_0 = \sum_{(i,j): i+j=0} X[i] Y[j] = X[0] Y[0]$
$c_1 = \sum_{(i,j): i+j=1} X[i] Y[j] = X[0] Y[1] + X[1] Y[0]$,再加上进位$h$
如此迭代,继续计算$Z[2], Z[3], \ldots, Z[2n-1]$
以$x = 123$、$y = 456$、$z = 56088$为例
| $c_k$ | $h$ | $c_k + h$ | $Z[k]$ |
|---|---|---|---|
| $c_0 = 3 \times 6 = 18$ | $0$ | $\class{red}{1}8$ | $Z[0] = 8$ |
| $c_1 = 3 \times 5 + 2 \times 6 = 27$ | $\class{red}{1}$ | $\class{yellow}{2}8$ | $Z[1] = 8$ |
| $c_2 = 3 \times 4 + 2 \times 5 + 1 \times 6 = 28$ | $\class{yellow}{2}$ | $\class{blue}{3}0$ | $Z[2] = 0$ |
| $c_3 = 2 \times 4 + 1 \times 5 = 13$ | $\class{blue}{3}$ | $\class{cyan}{1}6$ | $Z[3] = 6$ |
| $c_4 = 1 \times 4 = 4$ | $\class{cyan}{1}$ | $5$ | $Z[4] = 5$ |
输入:$X[0, \ldots, n-1]$、$Y[0, \ldots, n-1]$
输出:$Z[0, \ldots, 2n-1]$
二重 for 循环共遍历$n^2$个$(i,j)$二元组,因此$T(n) = \Theta(n^2)$
def align(x, y, n_x, n_y): # 若x和y长度不同 将短的左边补零 if n_x < n_y: x = x.rjust(n_y, "0") n = n_y else: y = y.rjust(n_x, "0") n = n_x return x, y, n def grade_school(x, y): n_x, n_y = len(x), len(y) x, y, n = align(x, y, n_x, n_y) # 若x和y长度不同 将短的左边补零 c = 0 # 初始进位为零 z = str() for k in range(2 * n): # 遍历所有满足i+j=k的二元组(i,j) 注意i的范围防止越界 for i in range(max(0, k - n + 1), min(k + 1, n)): j = k - i c += int(x[-1 - i]) * int(y[-1 - j]) c, q = divmod(c, 10) z = str(q) + z z = z.lstrip("0") # 去掉高位连续的0 if z == "": # 若乘积本就是0 去掉所有0后会变成空字符串 return 0 else: return int(z)
将$x$和$y$的数位二等分,记$m = n/2$
乘积$z = xy = (a \cdot 10^m + b)(c \cdot 10^m + d) = a c \cdot 10^{2m} + (a d + b c) 10^m + b d$
设$n$位数相乘的时间为$T(n)$,补零、加法的时间为$c_1$、$c_2$
\begin{align} T(n) = 4 \cdot T (n/2) + c_1 \cdot 3n + c_2 \cdot 4n = 4 \cdot T (n/2) + \Theta(n) \end{align}
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + c n, & n > 1 \end{cases} \end{align}
代入式$(\diamondsuit)$,注意$a = 4$、$b = 2$、$d = 1$,故$a \ne b^d$、$\log_b a = 2$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^2 \cdot T(1) + c \cdot (n^1 - n^2)/(1 - 4 / 2^1) \\ & = \Theta(n^2) \end{align}
分解成$4$个规模大致减半的子问题并不改进时间复杂度
利用$a d + b c = (a+b) (c+d) - a c - b d$可得
\begin{align} x y = a c \cdot 10^{2m} + ((a+b) (c+d) - a c - b d) \cdot 10^m + b d \end{align}
$n/2$位数相乘减少为$3$个:$(a+b) (c+d)$、$a c$、$b d$,补零不变,加法次数变多但时间复杂度依然为$\Theta(n)$
\begin{align} T(n) = 3 \cdot T (n/2) + \Theta(n) \end{align}
代入式$(\diamondsuit)$,注意$a = 3$、$b = 2$、$d = 1$,故$a \ne b^d$、$\log_b a = \lg 3$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^{\lg 3} \cdot T(1) + c \cdot (n^1 - n^{\lg 3})/(1 - 3/2^1) \\ & = \Theta(n^{\lg 3}) \approx \Theta(n^{1.58496250}) \end{align}
分解成$3$个规模大致减半的子问题可以改进时间复杂度
def karatsuba(x, y): n_x, n_y = len(x), len(y) if n_x == 1 or n_y == 1: # 如果其中一个数只有1位 不再递归 return int(x) * int(y) x, y, n = align(x, y, n_x, n_y) # 若x和y长度不同 将短的左边补零 m = round(n / 2) a, b = x[0:n - m], x[n - m:n] # x = a 10^m + b c, d = y[0:n - m], y[n - m:n] # y = c 10^m + d # 3个递归子问题 ac = karatsuba(a, c) bd = karatsuba(b, d) ad_bc = karatsuba(str(int(a) + int(b)), str(int(c) + int(d))) - ac - bd return ac * 10**(2 * m) + ad_bc * 10**m + bd
将$x$和$y$的数位三等分,记$m = n/3$
\begin{align} x & = a \cdot 10^{2m} + b \cdot 10^m + c \\ y & = d \cdot 10^{2m} + e \cdot 10^m + f \end{align}
于是
\begin{align} xy = a d \cdot 10^{4m} + (a e + b d) \cdot 10^{3m} & + (af + be + cd) \cdot 10^{2m} \\ & + (bf + ce) \cdot 10^m + cf \end{align}
共产生$9$个子问题,有点难办!
引入多项式$x(t) = a t^2 + b t + c$、$y(t) = d t^2 + e t + f$
\begin{align} x(t) y(t) & = a d \cdot t^4 + (a e + b d) \cdot t^3 + (af + be + cd) \cdot t^2 + (bf + ce) \cdot t + cf \\ & \triangleq w(t) = w_4 t^4 + w_3 t^3 + w_2 t^2 + w_1 t + w_0 \end{align}
代入$t = 10^m$ (在系数后面分别添加$4m$、$3m$、$2m$、$m$、$0$个$0$再相加) 就是$xy$,剩下只需确定$w(t)$的$5$个系数
令$t$取$5$个不同的值可得$5$变量线性方程组:
\begin{align} \begin{array}{lrcl} t = 0 : & cf & = & w_0 \\ t = 1 : & (a+b+c)(d+e+f) & = & w_4 + w_3 + w_2 + w_1 + w_0 \\ t = -1 : & (a-b+c)(d-e+f) & = & w_4 - w_3 + w_2 - w_1 + w_0 \\ t = 2 : & (4a+2b+c)(4d+2e+f) & = & 16 w_4 + 8 w_3 + 4 w_2 + 2 w_1 + w_0 \\ t = \infty : & ad & = & w_4 \end{array} \end{align}
方程组涉及$5$个乘法子问题:
\begin{align} \begin{array}{l} s_4 \triangleq ad \\ s_3 \triangleq (4a+2b+c)(4d+2e+f) \\ s_2 \triangleq (a-b+c)(d-e+f) \\ s_1 \triangleq (a+b+c)(d+e+f) \\ s_0 \triangleq cf \end{array} \Longrightarrow \begin{array}{l} w_4 = s_4 \\ w_3 = (-12 s_4 + s_3 - s_2 - 3 s_1 + 3 s_0) / 6 \\ w_2 = (-2 s_4 + s_2 + s_1 - 2 s_0) / 2 \\ w_1 = (12 s_4 - s_3 - 2 s_2 + 6 s_1 - 3 s_0) / 6 \\ w_0 = s_0 \end{array} \end{align}
由此可得$T(n) = 5 \cdot T (n/3) + \Theta(n)$
代入式$(\diamondsuit)$,注意$a = 5$、$b = 3$、$d = 1$,故$a \ne b^d$、$\log_b a = \log_3 5$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^{\log_3 5} \cdot T(1) + c \cdot (n^1 - n^{\log_3 5})/(1 - 5/3^1) = \Theta(n^{\log_3 5}) \approx \Theta(n^{1.46497352}) \end{align}
比 Karatsuba 算法的$\Theta(n^{\lg_3}) \approx \Theta(n^{1.58496250})$更优
def preprocess(x): if x[0] == "-": return -1, len(x) - 1, x[1:] return 1, len(x), x def toom(x, y): sign_x, n_x, x = preprocess(x) sign_y, n_y, y = preprocess(y) sign = sign_x * sign_y if n_x <= 2 or n_y <= 2: # 如果其中一个数小于3位 不再递归 return sign * int(x) * int(y) x, y, n = align(x, y, n_x, n_y) # 若x和y长度不同 将短的左边补零 m = round(n / 3) a, b, c = x[0:n - 2 * m], x[n - 2 * m:n - m], x[n - m:n] d, e, f = y[0:n - 2 * m], y[n - 2 * m:n - m], y[n - m:n] # 5个递归子问题 s4 = toom(a, d) # s4 = ad s3 = toom(str(4 * int(a) + 2 * int(b) + int(c)), str(4 * int(d) + 2 * int(e) + int(f))) # s3 = (4a+2b+c)(4d+2e+f) s2 = toom(str(int(a) - int(b) + int(c)), str(int(d) - int(e) + int(f))) # s2 = (a-b+c)(d-e+f) s1 = toom(str(int(a) + int(b) + int(c)), str(int(d) + int(e) + int(f))) # s1 = (a+b+c)(d+e+f) s0 = toom(c, f) # s0 = cf w4 = s4 w3 = (-12 * s4 + s3 - s2 - 3 * s1 + 3 * s0) // 6 w2 = (-2 * s4 + s2 + s1 - 2 * s0) // 2 w1 = (12 * s4 - s3 - 2 * s2 + 6 * s1 - 3 * s0) // 6 w0 = s0 return sign * (w4 * 10**(4 * m) + w3 * 10**(3 * m) + w2 * 10**(2 * m) + w1 * 10**m + w0)
将$x$、$y$作$k$等分,引入多项式
\begin{align} x(t) = a_{k-1} t^{k-1} + a_{k-2} t^{k-2} + \cdots + a_0, \quad y(t) = b_{k-1} t^{k-1} + b_{k-2} t^{k-2} + \cdots + b_0 \end{align}
\begin{align} x(t) y(t) & = a_{k-1} b_{k-1} \cdot t^{2k-2} + (a_{k-1} b_{k-2} + a_{k-2} b_{k-1}) \cdot t^{2k-3} + \cdots + a_0 b_0 \\ & \triangleq w(t) = w_{2k-2} t^{2k-2} + w_{2k-3} t^{2k-3} + \cdots + w_0 \end{align}
$w(t)$共有$2k-1$个系数,故$t$需取$2k-1$个不同的值,由此产生$2k-1$个规模为$n/k$的子问题,代入式$(\diamondsuit)$,注意$a = 2k-1$、$b = k$、$d = 1$,故$a \ne b^d$、$\log_b a = \log_k (2k-1)$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^{\log_k (2k-1)} \cdot T(1) + c \cdot (n^1 - n^{\log_k (2k-1)})/(1 - (2k-1) / k^1) \\ & = \Theta(n^{\log_k (2k-1)}) \end{align}
Karatsuba 算法和 Toom 算法都是该结论的特例,易证$\log_k (2k-1)$关于$k$严格单调减,故$k$越大越好,但$k$越大,关于$w(t)$的$2k-1$个系数的线性方程组也越复杂,实际需取一个适中的$k$
设$\Av = (a_{ij})_{n \times n}$、$\Bv = (b_{ij})_{n \times n}$为$n$阶方阵
设$\Cv = (c_{ij})_{n \times n} = \Av + \Bv$,则$c_{ij} = a_{ij} + b_{ij}$
直接计算$\Cv = \Av + \Bv$的代码如下
def add(A, B, m, n): C = np.empty((m, n)) for i in range(m): for j in range(n): C[i, j] = A[i, j] + B[i, j] return C
因为二重 for 循环的存在,$T(n) = \Theta(n^2)$
将$\Av$、$\Bv$、$\Cv$分成$4$个分块矩阵,每块$n/2 \times n/2$
\begin{align} \Av = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix}, \quad \Bv = \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix}, \quad \Cv = \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} \end{align}
根据分块矩阵的运算法则
\begin{align} \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} & = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix} + \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix} \\[4px] & \Longrightarrow \begin{cases} \Cv_{11} = \Av_{11} + \Bv_{11} \\ \Cv_{12} = \Av_{12} + \Bv_{12} \\ \Cv_{21} = \Av_{21} + \Bv_{21} \\ \Cv_{22} = \Av_{22} + \Bv_{22} \end{cases} \end{align}
若$n = 1$,则$\Av$、$\Bv$已退化为标量,直接相加
若$n > 1$,将$\Av$、$\Bv$分成四个子矩阵
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + \class{blue}{\Theta(n^2)}, & n > 1 \end{cases} \end{align}
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + \class{blue}{\Theta(1)}, & n > 1 \end{cases} \end{align}
def add_rec(A, B, C, shape): m, n = shape if m == 1 or n == 1: C[:] = A[:] + B[:] else: mid_m, mid_n = int(m / 2), int(n / 2) # 四个子矩阵的索引 block_11 = slice(None, mid_m), slice(None, mid_n) block_12 = slice(None, mid_m), slice(mid_n, None) block_21 = slice(mid_m, None), slice(None, mid_n) block_22 = slice(mid_m, None), slice(mid_n, None) add_rec(A[block_11], B[block_11], C[block_11], A[block_11].shape) add_rec(A[block_12], B[block_12], C[block_12], A[block_12].shape) add_rec(A[block_21], B[block_21], C[block_21], A[block_21].shape) add_rec(A[block_22], B[block_22], C[block_22], A[block_22].shape)
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + c n^2, & n > 1 \end{cases} \end{align}
代入式$(\diamondsuit)$,注意$a = 4$、$b = 2$、$d = 2$,故$a = b^d$、$\log_b a = 2$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot n^{\log_b a} \cdot \log_b n \\ & = n^2 \cdot T(1) + c \cdot n^2 \cdot \log_2 n \\ & = \Theta(n^2 \lg n) \end{align}
分治反而让时间复杂度更坏了
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 4 \cdot T(n/2) + c, & n > 1 \end{cases} \end{align}
代入式$(\diamondsuit)$,注意$a = 4$、$b = 2$、$d = 0$,故$a \ne b^d$、$\log_b a = 2$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^2 \cdot T(1) + c \cdot (n^0 - n^2)/(1 - 4/2^0) \\ & = \Theta(n^2) \end{align}
与直接相加的时间复杂度相当,分治没有带来收益
设$\Av = (a_{ij})_{n \times n}$、$\Bv = (b_{ij})_{n \times n}$为$n$阶方阵
设$\Cv = (c_{ij})_{n \times n} = \Av \Bv$,则$c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}$
直接计算$\Cv = \Av \Bv$的代码如下
def mul(A, B, n): C = np.zeros((n, n)) for i in range(n): for j in range(n): for k in range(n): C[i, j] += A[i, k] * B[k, j] return C
因为三重 for 循环的存在,$T(n) = \Theta(n^3)$
将$\Av$、$\Bv$、$\Cv$分成$4$个分块矩阵,每块$n/2 \times n/2$
\begin{align} \Av = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix}, \quad \Bv = \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix}, \quad \Cv = \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} \end{align}
根据分块矩阵的运算法则
\begin{align} \begin{bmatrix} \Cv_{11} & \Cv_{12} \\ \Cv_{21} & \Cv_{22} \end{bmatrix} & = \begin{bmatrix} \Av_{11} & \Av_{12} \\ \Av_{21} & \Av_{22} \end{bmatrix} \begin{bmatrix} \Bv_{11} & \Bv_{12} \\ \Bv_{21} & \Bv_{22} \end{bmatrix} \\[2px] & = \begin{bmatrix} \Av_{11} \Bv_{11} + \Av_{12} \Bv_{21} & \Av_{11} \Bv_{12} + \Av_{12} \Bv_{22} \\ \Av_{21} \Bv_{11} + \Av_{22} \Bv_{21} & \Av_{21} \Bv_{12} + \Av_{22} \Bv_{22} \end{bmatrix} \\[2px] & \Longrightarrow \begin{cases} \Cv_{11} = \Av_{11} \Bv_{11} + \Av_{12} \Bv_{21} \\ \Cv_{12} = \Av_{11} \Bv_{12} + \Av_{12} \Bv_{22} \\ \Cv_{21} = \Av_{21} \Bv_{11} + \Av_{22} \Bv_{21} \\ \Cv_{22} = \Av_{21} \Bv_{12} + \Av_{22} \Bv_{22} \end{cases} \end{align}
若$n = 1$,则$\Av$、$\Bv$已退化为标量,直接相乘
若$n > 1$,将$\Av$、$\Bv$分成四个子矩阵
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + \class{blue}{\Theta(n^2)}, & n > 1 \end{cases} \end{align}
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + \class{blue}{\Theta(1)}, & n > 1 \end{cases} \end{align}
def mul_rec(A, B, C, n): if n == 1: C[0, 0] += A[0, 0] * B[0, 0] else: mid = int(n / 2) # 四个子矩阵的索引 block_11 = slice(None, mid), slice(None, mid) block_12 = slice(None, mid), slice(mid, None) block_21 = slice(mid, None), slice(None, mid) block_22 = slice(mid, None), slice(mid, None) mul_rec(A[block_11], B[block_11], C[block_11], mid) # C11 += A11 B11 mul_rec(A[block_12], B[block_21], C[block_11], mid) # C11 += A12 B21 mul_rec(A[block_11], B[block_12], C[block_12], mid) # C12 += A11 B12 mul_rec(A[block_12], B[block_22], C[block_12], mid) # C12 += A12 B22 mul_rec(A[block_21], B[block_11], C[block_21], mid) # C21 += A21 B11 mul_rec(A[block_22], B[block_21], C[block_21], mid) # C21 += A22 B21 mul_rec(A[block_21], B[block_12], C[block_22], mid) # C22 += A21 B12 mul_rec(A[block_22], B[block_22], C[block_22], mid) # C22 += A22 B22
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + c n^2, & n > 1 \end{cases} \end{align}
代入式$(\diamondsuit)$,注意$a = 8$、$b = 2$、$d = 2$,故$a \ne b^d$、$\log_b a = 3$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^3 \cdot T(1) + c \cdot (n^2 - n^3)/(1 - 8/2^2) \\ & = \Theta(n^3) \end{align}
\begin{align} T(n) = \begin{cases} 1, & n = 1 \\ 8 \cdot T(n/2) + c, & n > 1 \end{cases} \end{align}
代入式$(\diamondsuit)$,注意$a = 8$、$b = 2$、$d = 0$,故$a \ne b^d$、$\log_b a = 3$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^3 \cdot T(1) + c \cdot (n^0 - n^3)/(1 - 8/2^0) \\ & = \Theta(n^3) \end{align}
与直接相乘的时间复杂度相当,分治没有带来收益
| | 方法 | 复制 | 递推关系 | 时间复杂度 |
|---|---|---|---|---|
| 加法 | 直接 | | | $\Theta(n^2)$ |
| 分治 | 是 | $T(n) = 4 \cdot T(n/2) + \class{blue}{\Theta(n^2)}$ | $\Theta(n^2 \lg n)$ | |
| 分治 | 否 | $T(n) = 4 \cdot T(n/2) + \class{blue}{\Theta(1)}$ | $\Theta(n^2)$ | |
| 乘法 | 直接 | | | $\Theta(n^3)$ |
| 分治 | 是 | $T(n) = 8 \cdot T(n/2) + \class{blue}{\Theta(n^2)}$ | $\Theta(n^3)$ | |
| 分治 | 否 | $T(n) = 8 \cdot T(n/2) + \class{blue}{\Theta(1)}$ | $\Theta(n^3)$ |
考虑一般情况,设每次分治
假设$a > b^d = 4$,代入式$(\diamondsuit)$有
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^{\lg a} \cdot T(1) + c \cdot (n^2 - n^{\lg a})/(1 - a/4) \\ & = \Theta (n^{\lg a}) \end{align}
只要$a < 8$就能优于$\Theta(n^3)$
\begin{align} T(n) = n^{\lg a} \cdot T(1) + c \cdot (n^2 - n^{\lg a})/(1 - a/4) = \Theta (n^{\lg a}) \end{align}
只要$a < 8$就能优于$\Theta(n^3)$
Strassen 乘法:通过多做子矩阵加法来少做子矩阵乘法
直观例子
$2$阶矩阵相乘,标准的矩阵乘法需要做$8$次乘法、$4$次加法
\begin{align} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} \end{align}
下面给出只做$\class{blue}{7}$次乘法的实现方法
\begin{align} \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} = \underbrace{\begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix}}_{\triangleq ~ \widetilde{\Av}} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} = \widetilde{\Av} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \end{align}
\begin{align} \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \triangleq \widetilde{\Av} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \end{align}
假设$\widetilde{\Av} \in \rb^{4 \times 4}$可以分解成$m$个秩$\class{blue}{1}$矩阵 (列向量乘行向量) 的和
\begin{align} \widetilde{\Av} = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} = \sum_{i=1}^m r_i \pv_i \qv_i^\top = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} \begin{bmatrix} q_{i1} \\ q_{i2} \\ q_{i3} \\ q_{i4} \end{bmatrix}^\top \end{align}
且满足
代入$\widetilde{\Av}$的形式有
\begin{align} & \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} = \widetilde{\Av} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} \begin{bmatrix} q_{i1} \\ q_{i2} \\ q_{i3} \\ q_{i4} \end{bmatrix}^\top \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} \\ & \qquad = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} s_i = \sum_{i=1}^m t_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} = \begin{bmatrix} p_{11} t_1 + \cdots + p_{m1} t_m \\ p_{12} t_1 + \cdots + p_{m2} t_m \\ p_{13} t_1 + \cdots + p_{m3} t_m \\ p_{14} t_1 + \cdots + p_{m4} t_m \end{bmatrix} \end{align}
关键:如何将$\widetilde{\Av}$可以分解成$m$个秩$\class{blue}{1}$矩阵的和,其中$m < 8$
\begin{align} \widetilde{\Av} = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} = \sum_{i=1}^m r_i \pv_i \qv_i^\top = \sum_{i=1}^m r_i \begin{bmatrix} p_{i1} \\ p_{i2} \\ p_{i3} \\ p_{i4} \end{bmatrix} \begin{bmatrix} q_{i1} \\ q_{i2} \\ q_{i3} \\ q_{i4} \end{bmatrix}^\top \end{align}
且满足
首先去掉左上的$a_{11}$和右下的$a_{22}$
\begin{align} \widetilde{\Av} & = \begin{bmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{bmatrix} = \begin{bmatrix} a_{11} & 0 & a_{11} & 0 \\ 0 & 0 & 0 & 0 \\ a_{11} & 0 & a_{11} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & a_{22} & 0 & a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & a_{22} & 0 & a_{22} \end{bmatrix} \\ & \qquad + \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & a_{11} - a_{22} & 0 & a_{12} - a_{22} \\ a_{21} - a_{11} & 0 & a_{22} - a_{11} & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = a_{11} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}^\top + a_{22} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}^\top + \widetilde{\Av}' \end{align}
再去掉中间的$a_{11} - a_{22}$
\begin{align} & \widetilde{\Av}' = \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & a_{11} - a_{22} & 0 & a_{12} - a_{22} \\ a_{21} - a_{11} & 0 & a_{22} - a_{11} & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & a_{11} - a_{22} & a_{11} - a_{22} & 0 \\ 0 & a_{22} - a_{11} & a_{22} - a_{11} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & a_{22} - a_{11} & a_{12} - a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\ & + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ a_{21} - a_{11} & a_{11} - a_{22} & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} = (a_{11} - a_{22}) \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}^\top + \widetilde{\Av}'' + \widetilde{\Av}''' \end{align}
最后分解$\widetilde{\Av}''$和$\widetilde{\Av}'''$
\begin{align} \widetilde{\Av}'' & = \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & a_{22} - a_{11} & a_{12} - a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\ & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & a_{22} - a_{12} & a_{12} - a_{22} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & a_{12} - a_{11} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\ & = (a_{12} - a_{22}) \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}^\top + (a_{11} - a_{12}) \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}^\top \end{align}
最后分解$\widetilde{\Av}''$和$\widetilde{\Av}'''$
\begin{align} \widetilde{\Av}''' & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ a_{21} - a_{11} & a_{11} - a_{22} & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ a_{21} - a_{11} & a_{11} - a_{21} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \\ 0 & a_{21} - a_{22} & 0 & 0 \end{bmatrix} \\ & = (a_{11} - a_{21}) \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top + (a_{21} - a_{22}) \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top \end{align}
\begin{align} \widetilde{\Av} & = \underbrace{a_{11}}_{r_1} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}^\top + \underbrace{a_{22}}_{r_2} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}^\top + \underbrace{(a_{11} - a_{22})}_{r_3} \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}^\top \\ & ~ + \underbrace{(a_{12} - a_{22})}_{r_4} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}^\top + \underbrace{(a_{11} - a_{12})}_{r_5} \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}^\top \\ & ~ + \underbrace{(a_{11} - a_{21})}_{r_6} \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top + \underbrace{(a_{21} - a_{22})}_{r_7} \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}^\top = \sum_{i=1}^7 r_i \pv_i \qv_i^\top \end{align}
根据上面的分解
\begin{align} \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \\ s_6 \\ s_7 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & -1 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix} = \begin{bmatrix} b_{11} + b_{21} \\ b_{12} + b_{22} \\ b_{12} + b_{21} \\ b_{22} - b_{21} \\ -b_{21} \\ b_{12} - b_{11} \\ b_{12} \end{bmatrix}, ~ \begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ r_5 \\ r_6 \\ r_7 \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{22} \\ a_{11} - a_{22} \\ a_{12} - a_{22} \\ a_{11} - a_{12} \\ a_{11} - a_{21} \\ a_{21} - a_{22} \end{bmatrix} \end{align}
最后计算
\begin{align} \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} \\ a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} & = t_1 \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + t_2 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} + t_3 \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + t_4 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t_5 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} \\ & ~ + t_6 \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} + t_7 \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} t_1 + t_5 \\ t_2 + t_3 + t_4 + t_5 \\ t_1 - t_3 + t_6 + t_7 \\ t_2 + t_7 \end{bmatrix} \end{align}
共会产生$8$次加减运算,一共$18$次加减运算
Strassen 矩阵乘法:$8$次乘法、$4$次加法 => $7$次乘法、$18$次加法
def mul_rec_strassen(A, B, C, n): if n == 1: C[0, 0] = A[0, 0] * B[0, 0] else: mid = int(n / 2) # 四个子矩阵的索引 block_11 = slice(None, mid), slice(None, mid) block_12 = slice(None, mid), slice(mid, None) block_21 = slice(mid, None), slice(None, mid) block_22 = slice(mid, None), slice(mid, None) # 10次加减计算中间变量 R1 = A[block_11] R2 = A[block_22] R3 = A[block_11] - A[block_22] R4 = A[block_12] - A[block_22] R5 = A[block_11] - A[block_12] R6 = A[block_11] - A[block_21] R7 = A[block_21] - A[block_22] S1 = B[block_11] + B[block_21] S2 = B[block_12] + B[block_22] S3 = B[block_12] + B[block_21] S4 = B[block_22] - B[block_21] S5 = -B[block_21] S6 = B[block_12] - B[block_11] S7 = B[block_12] # 7个子问题 递归调用 T1 = np.empty((mid, mid)) mul_rec_strassen(R1, S1, T1, mid) T2 = np.empty((mid, mid)) mul_rec_strassen(R2, S2, T2, mid) T3 = np.empty((mid, mid)) mul_rec_strassen(R3, S3, T3, mid) T4 = np.empty((mid, mid)) mul_rec_strassen(R4, S4, T4, mid) T5 = np.empty((mid, mid)) mul_rec_strassen(R5, S5, T5, mid) T6 = np.empty((mid, mid)) mul_rec_strassen(R6, S6, T6, mid) T7 = np.empty((mid, mid)) mul_rec_strassen(R7, S7, T7, mid) # 8次加减计算C C[block_11] = T1 + T5 C[block_12] = T2 + T3 + T4 + T5 C[block_21] = T1 - T3 + T6 + T7 C[block_22] = T2 + T7
Strassen 矩阵乘法:$a = 7$、$b = 2$、$T(n) = \Theta(n^{\lg 7})$
自 1969 年 V. Strassen 提出上述方法后,后续改进沿用其思路
时隔 9 年,1978 年 V. Pan 发现
2014 年最好结果$T(n) \approx \Theta(n^{2.3728639})$
2021 年最好结果$T(n) \approx \Theta(n^{2.3728596})$,7 年改进了$0.0000043$
常用的求解方法
分两步:
如何猜测?
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor) + n, & n > 1 \end{cases}$
这个形式仅比归并排序的递推关系多了个向下取整
当$n$很大时取整的影响微乎其微,有理由猜测$T(n) = O(n \lg n)$
下面用数学归纳法证明$T(n) \le c n \lg n$,其中$c$是待定正常数
假设该上界对任意$m < n$都成立,特别的对$\lfloor n/2 \rfloor$也成立
\begin{align} T(n) & = 2 \cdot T(\lfloor n/2 \rfloor) + n \\ & \le 2 c \lfloor n/2 \rfloor \lg \lfloor n/2 \rfloor + n \quad \gets 归纳假设 \\ & \le 2 c (n/2) \lg (n/2) + n \\ & = cn \lg n - (c-1)n \\ & \le cn \lg n \quad \gets 如果 c \ge 1 \end{align}
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor) + n, & n > 1 \end{cases}$,求证$T(n) \le c n \lg n$
最后考虑边界条件,当$n=1$时,则$T(1) = 1 \not \le c \lg 1 = 0$
将边界条件替换为$T(2)$、$T(3)$
取$c=2$可以使上界对$n=2,3$成立
综上:$T(n) \le 2 n \lg n$对任意$n \ge 2$成立,即$T(n) = O(n \lg n)$
之后不再讨论边界条件的证明细节,取充分大的$c$即可
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor + 17) + n, & n > 1 \end{cases}$
这个形式仅比前一个例子多了个$+17$
当$n$很大时$+17$的影响微乎其微,继续猜测$T(n) \le c n \lg n$
\begin{align} T(n) & = 2 \cdot T(\lfloor n/2 \rfloor + 17) + n \\ & \le 2 c (\lfloor n/2 \rfloor + 17) \lg (\lfloor n/2 \rfloor + 17) + n \\ & \le 2 c (n/2 + 17) \lg (n/2 + 17) + n \\ & = c (n+34) (\lg (n+34) - 1) + n \end{align}
出现了$\lg (n+34)$,没法推导下去了
修改猜测为$T(n) \le c (n-t) \lg (n-t) = O(n \lg n)$,用$t$把$+34$消掉
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(\lfloor n/2 \rfloor + 17) + n, & n > 1 \end{cases}$
猜测:$T(n) \le c (n-t) \lg (n-t) = O(n \lg n)$
\begin{align} T(n) & = 2 \cdot T (\lfloor n/2 \rfloor + 17) + n \\ & \le 2 c (\lfloor n/2 \rfloor + 17 - t) \lg (\lfloor n/2 \rfloor + 17 - t) + n \quad \gets \text{归纳假设} \\ & \le c (n + 34 - 2t) (\lg (n + 34 - 2t) - 1) + n \\ & = c (n + 34 - 2t) \lg (n + 34 - 2t) - ((c-1)n - 34c + 2ct) \\ & \le c (n + 34 - 2t) \lg (n + 34 - 2t) \end{align}
令$34 - 2t = -t$得$t = 34$,最后一个不等号成立只需$c \ge 1$
综上,$T(n) \le c (n-34) \lg (n-34) = O(n \lg n)$
向上取整等于向下取整加$1$,故向上取整也可忽略
猜出了正确的渐进界,却卡在了归纳证明
假设不够强,减去一个低阶项
例:$T(n) = \begin{cases} 1, & n = 1 \\ T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1, & n > 1 \end{cases}$
设$T(n) \le cn$,于是
\begin{align} T(n) & = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1 \\ & \le c \lfloor n/2 \rfloor + c \lceil n/2 \rceil + 1 \\ & = cn + 1 \not \le cn \end{align}
例:$T(n) = \begin{cases} 1, & n = 1 \\ T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1, & n > 1 \end{cases}$
设$T(n) \le cn - d$,其中$d$是待定常数,于是
\begin{align} T(n) & = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1 \\ & \le c \lfloor n/2 \rfloor - d + c \lceil n/2 \rceil - d + 1 \\ & = cn - 2d + 1 \\ & \le cn - d \end{align}
最后一个不等号成立只需$d \ge 1$
例:$T(n) = \begin{cases} 1, & n = 1 \\ 2 \cdot T(n-1) + 1, & n > 1 \end{cases}$
忽略的最后$+1$,有理由猜测$T(n) = O(2^n)$,设$T(n) \le c \cdot 2^n$
\begin{align} T(n) = 2 \cdot T(n-1) + 1 \le 2 c \cdot 2^{n-1} + 1 = c \cdot 2^n + 1 \not \le c \cdot 2^n \end{align}
设$T(n) \le c \cdot 2^n - d$,则
\begin{align} T(n) & = 2 \cdot T(n-1) + 1 \\ & \le 2 (c \cdot 2^{n-1} - d) + 1 \\ & = c \cdot 2^n - 2 d + 1 \\ & \le c \cdot 2^n - d \end{align}
最后一个不等号成立只需$d \ge 1$
例:$T(n) = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n$
先令$m = \lg n$即$n = 2^m$去掉$\lg n$,则$T(2^m) = 2 \cdot T (\lfloor \sqrt{2^m} \rfloor) + m$
忽略取整有$T(2^m) = 2 \cdot T (2^{m/2}) + m$
令$S(m) = T(2^m)$,则$S(m) = 2 \cdot S(m/2) + m = \Theta(m \lg m)$
回代可得$T(n) = \Theta( \lg n \cdot \lg \lg n )$
例:$T(n) = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n$
先证$T(n) = O(\lg n \cdot \lg \lg n)$,设$T(n) \le c \cdot \lg n \cdot \lg \lg n$
\begin{align} T(n) & = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n \\ & \le 2 c \cdot \lg \lfloor \sqrt{n} \rfloor \cdot \lg (\lg \lfloor \sqrt{n} \rfloor) + \lg n \\ & \le 2 c \cdot \lg \sqrt{n} \cdot \lg (\lg \sqrt{n}) + \lg n \\ & = c \cdot \lg n \cdot \lg (\lg n / 2) + \lg n \\ & = c \cdot \lg n \cdot (\lg \lg n - 1) + \lg n \\ & = c \cdot \lg n \cdot \lg \lg n - (c - 1) \lg n \\ & \le c \cdot \lg n \cdot \lg \lg n \end{align}
最后一个不等号成立只需$c \ge 1$
例:$T(n) = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n$
再证$T(n) = \Omega (\lg n \cdot \lg \lg n)$,设$T(n) \ge c \cdot \lg (n+3) \cdot \lg \lg (n+3)$
\begin{align} T(n) & = 2 \cdot T (\lfloor \sqrt{n} \rfloor) + \lg n \\ & \ge 2 c \cdot \lg (\lfloor \sqrt{n} \rfloor + 3) \cdot \lg \lg (\lfloor \sqrt{n} \rfloor + 3) + \lg n \\ & \ge 2 c \cdot \lg (\sqrt{n}+2) \cdot \lg \lg (\sqrt{n}+2) + \lg n \\ & \ge 2 c \cdot \lg \sqrt{n+3} \cdot \lg \lg \sqrt{n+3} + \lg n \\ & = c \cdot \lg (n+3) \cdot (\lg \lg (n+3) - 1) + \lg n \\ & = c \cdot \lg (n+3) \cdot \lg \lg (n+3) + \lg n - c \cdot \lg (n+3) \\ & \ge c \cdot \lg (n+3) \cdot \lg \lg (n+3) \end{align}
当$c = 1/2$时,最后一个不等号对$\forall n \ge 3$成立
递归树可用来生成好的猜测
$T(n) = 3 \cdot T(n/4) + c n^2$的递归树如下
\begin{align} T(n) = 3 \cdot T(n/4) + c n^2 \end{align}
\begin{align} T(n) = 3 \cdot T(n/4) + c n^2 \end{align}
\begin{align} T(n) & = n^{\log_4 3} \cdot \Theta(1) + cn^2 \frac{1 - 3^{\log_4 n} / 16^ {\log_4 n}}{1 - 3/16} \\ & = n^{\log_4 3} \cdot \Theta(1) + c \frac{16}{13} ( n^2 - n^{\log_4 3} ) \end{align}
\begin{align} T(n) = 3 \cdot T(n/4) + c n^2 \end{align}
递归树的结果
\begin{align} T(n) = n^{\log_4 3} \cdot \Theta(1) + c \frac{16}{13} ( n^2 - n^{\log_4 3} ) \end{align}
累加求和的结果,代入式$(\diamondsuit)$,注意$a = 3$、$b = 4$、$d = 2$,故$a \ne b^d$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + c \cdot (n^d - n^{\log_b a})/(1 - a/b^d) \\ & = n^{\log_4 3} \cdot T(1) + c \cdot (n^2 - n^{\log_4 3})/(1 - 3/4^2) \\ & = n^{\log_4 3} \cdot T(1) + c \cdot \frac{16}{13} \cdot (n^2 - n^{\log_4 3}) \end{align}
猜测$T(n) = 3 \cdot T(n/4) + c n^2 = O(n^2)$
用代入法证明,设$T(n) \le d n^2$,则
\begin{align} T(n) & = 3 \cdot T(n/4) + c n^2 \\ & \le 3 \cdot d (n/4)^2 + c n^2 \\ & = (3d/16 + c) n^2 \\ & \le d n^2 \end{align}
最后一个不等号成立只需$d \ge (16/13) c$
又$T(n) \ge c n^2 = \Omega(n^2)$的代价,因此$T(n) = \Theta(n^2)$
例:$T(n) = T(n/3) + T(2n/3) + c n$
例:$T(n) = T(n/3) + T(2n/3) + c n$,猜测$T(n) = \Theta(n \lg n)$
注意
\begin{align} \frac{dn}{3} \lg \frac{n}{3} + \frac{2dn}{3} \lg \frac{2n}{3} & = d n \lg n + \frac{dn}{3} \lg \frac{1}{3} + \frac{2dn}{3} \lg \frac{2}{3} \\ & = d n \lg n - \left( \lg 3 - \frac{2}{3} \right) dn \end{align}
欲证$d_1 n \lg n \le T(n) \le d_2 n \lg n$,只需取$d_1$、$d_2$满足
\begin{align} T(n) & \le d_2 n \lg n - \left( \lg 3 - \frac{2}{3} \right) d_2 n + cn \le d_2 n \lg n \\ T(n) & \ge d_1 n \lg n - \left( \lg 3 - \frac{2}{3} \right) d_1 n + cn \ge d_1 n \lg n \end{align}
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$
三种情形都在比较$f(n)$和$n^{\log_b a}$
两点说明:
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$
例:
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$
例:
主定理:设$T(n) = a \cdot T(n/b) + f(n)$,其中$a>0$、$b>1$
例:
以$f(n) = n^d$为例,设$n = b^t$,回顾式$(\diamondsuit)$的推导过程有
\begin{align} T(n) = n^{\log_b a} \cdot T(1) + \underbrace{n^d \left( 1 + \frac{a}{b^d} + \left( \frac{a}{b^d} \right)^2 + \cdots + \left( \frac{a}{b^d} \right)^{t-1} \right)}_{g(n)} \end{align}
若$a = b^d$,则$\log_b a = d \Longrightarrow n^{\log_b a} = n^d = f(n)$,等比数列的元素都是$1$
\begin{align} T(n) & = n^{\log_b a} \cdot T(1) + n^d \cdot t \\ & = n^{\log_b a} \cdot T(1) + n^{\log_b a} \cdot \log_b n \\ & = \Theta(n^{\log_b a} \log_b n) \\ & = \Theta(n^{\log_b a} \lg n) \end{align}
对应主定理情形 2 中$k=0$的情况
以$f(n) = n^d$为例,设$n = b^t$,回顾式$(\diamondsuit)$的推导过程有
\begin{align} T(n) = n^{\log_b a} \cdot T(1) + \underbrace{n^d \left( 1 + \frac{a}{b^d} + \left( \frac{a}{b^d} \right)^2 + \cdots + \left( \frac{a}{b^d} \right)^{t-1} \right)}_{g(n)} \end{align}
若$a > b^d$,则$\log_b a > d \Longrightarrow n^{\log_b a} > n^d = f(n)$,等比数列公比$a / b^d > 1$
\begin{align} g(n) & = n^d \cdot \frac{1 - a^t / b^{td}}{1 - a / b^d} = \frac{a^t - b^{td}}{a / b^d - 1} < \frac{a^t}{a / b^d - 1} = \frac{n^{\log_b a}}{a / b^d - 1} \\ & \Longrightarrow n^{\log_b a} \cdot T(1) < T(n) < n^{\log_b a} \cdot T(1) + \frac{n^{\log_b a}}{a / b^d - 1} \\ & \Longrightarrow T(n) = \Theta(n^{\log_b a}) \end{align}
对应主定理情形 1
以$f(n) = n^d$为例,设$n = b^t$,回顾式$(\diamondsuit)$的推导过程有
\begin{align} T(n) = n^{\log_b a} \cdot T(1) + \underbrace{n^d \left( 1 + \frac{a}{b^d} + \left( \frac{a}{b^d} \right)^2 + \cdots + \left( \frac{a}{b^d} \right)^{t-1} \right)}_{g(n)} \end{align}
若$a < b^d$,则$\log_b a < d \Longrightarrow n^{\log_b a} < n^d = f(n)$,$T(n)$由$g(n)$主导
显然$g(n) > n^d = f(n)$,若想有$g(n) = \Theta(f(n))$,等比数列的和得是常数 (不发散),注意等比数列公比$a / b^d < 1$,其和不超过$1 / (1 - a / b^d)$,因此只需在渐进意义下 ($n$充分大时),公比不无限趋近$1$,即存在常数$c < 1$使得$a / b^d \le c \Longrightarrow a f(x/b) = a (x/b)^d \le c x^d = c f(x)$,这正是主定理情形 3 中的正则条件
对于一般的$f(n)$,$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$
情形 1:若$\exists \epsilon > 0$使得$f(n) = O(n^{\log_b a - \epsilon})$,则
\begin{align} g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^{t-1} a^i O \left( \left( \frac{n}{b^i} \right)^{\log_b a - \epsilon} \right) \\ & = O \left( \sum_{i=0}^{t-1} a^i \left( \frac{n}{b^i} \right)^{\log_b a - \epsilon} \right) = O \left( \sum_{i=0}^{t-1} a^i \frac{n^{\log_b a - \epsilon}}{a^i b^{-\epsilon i}} \right) \\ & = O \left( n^{\log_b a - \epsilon} \sum_{i=0}^{t-1} b^{\epsilon i} \right) = O \left( n^{\log_b a - \epsilon} \frac{1 - b^{\epsilon t}}{1 - b^\epsilon} \right) \overset{n ~ = ~ b^t}{=} O (n^{\log_b a}) \end{align}
因此$n^{\log_b a}$主导$T(n)$
对于一般的$f(n)$,$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$
情形 2:若$\exists k \ge 0$使得$f(n) = \Theta(n^{\log_b a} \lg^k n)$,则
\begin{align} g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^{t-1} a^i \Theta \left( \left( \frac{n}{b^i} \right)^{\log_b a} \lg^k \frac{n}{b^i} \right) \\ & = \Theta \left( \sum_{i=0}^{t-1} a^i \frac{n^{\log_b a}}{a^i} \log_b^k \left( \frac{n}{b^i} \right) \bigg/ \log_b^k 2 \right) = \Theta \left( n^{\log_b a} \sum_{i=0}^{t-1} (t - i)^k \right) \\ & = \Theta \left( n^{\log_b a} \sum_{i=1}^t i^k \right) \end{align}
$\sum_{i=1}^t i^k \le \sum_{i=1}^t t^k = t^{k+1} \Longrightarrow \sum_{i=1}^t i^k = O(\log_b^{k+1} n) = O(\lg^{k+1} n)$
最后只需证$\sum_{i=1}^t i^k = \Omega(\lg^{k+1} n)$
对于一般的$f(n)$,$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$
情形 2:若$\exists k \ge 0$使得$f(n) = \Theta(n^{\log_b a} \lg^k n)$,则
\begin{align} g(n) = \Theta \left( n^{\log_b a} \sum_{i=1}^t i^k \right) \end{align}
已证$\sum_{i=1}^t i^k = O(\lg^{k+1} n)$,最后只需证$\sum_{i=1}^t i^k = \Omega(\lg^{k+1} n)$
考虑函数$h(x) = x^k$的黎曼积分,易知$\sum_{i=1}^t i^k$是
\begin{align} \int_0^t h(x) \diff x = \left. \frac{1}{k+1} x^{k+1} \right|_0^t = \frac{1}{k+1} t^{k+1} = \frac{1}{k+1} \log_b^{k+1} n \end{align}
的上方和,即$\sum_{i=1}^t i^k = \Omega(\lg^{k+1} n)$
对于一般的$f(n)$,$g(n) = \sum_{i=0}^{t-1} a^i f(n/b^i)$
情形 3:若$\exists \epsilon > 0$使得$f(n) = \Omega(n^{\log_b a + \epsilon})$且$f$满足正则条件:对$c < 1$和充分大的$x$有$a f(x/b) \le c f(x)$
设$a f(x/b) \le c f(x)$对$\forall x \ge b^{t-q} = n / b^q$都成立,则
\begin{align} g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^q a^i f \left( \frac{n}{b^i} \right) + \sum_{i=q+1}^{t-1} a^i f \left( \frac{n}{b^i} \right) \\ & \le \sum_{i=0}^q c^i f(n) + \Theta(1) \le f(n) \sum_{i=0}^\infty c^i + \Theta(1) = O(f(n)) \end{align}
由于$f(n)$是$g(n)$求和式中的一项,因此$g(n) = \Omega(f(n))$是显然的
例:$f(n) = n^{\log_b a} / \lg n$,此时主定理的情况 1 不适用
\begin{align} g(n) & = \sum_{i=0}^{t-1} a^i f \left( \frac{n}{b^i} \right) = \sum_{i=0}^{t-1} a^i \left( \frac{n}{b^i} \right)^{\log_b a} \bigg/ \lg \frac{n}{b^i} = \sum_{i=0}^{t-1} a^i \frac{n^{\log_b a}}{a^i} \bigg/ \lg \frac{n}{b^i} \\ & = n^{\log_b a} \sum_{i=0}^{t-1} \log_b 2 \bigg/ \log_b \frac{n}{b^i} \overset{n~=~b^t}{=} n^{\log_b a} \log_b 2 \sum_{i=0}^{t-1} \frac{1}{t-i} \\ & = n^{\log_b a} \log_b 2 \sum_{i=1}^t \frac{1}{i} \end{align}
考虑函数$h(x) = 1/x$的黎曼积分,易知
\begin{align} \ln (t+1) \le \sum_{i=1}^t \frac{1}{i} \le \ln t + 1 \Longrightarrow \sum_{i=1}^t \frac{1}{i} = \Theta(\ln t) = \Theta(\lg \lg n) \end{align}
故$g(n) = \Theta (n^{\log_b a} \lg \lg n)$占主导,从而$T(n) = \Theta (n^{\log_b a} \lg \lg n)$
Akra-Bazzi 法可以处理子问题非平衡的情形
\begin{align} T(n) = f(n) + \sum_{i=1}^k a_i \cdot T(n/b_i) \end{align}
其中$k \in \zb^+$、$a_1, \ldots, a_k > 0$、$b_1, \ldots, b_k > 1$
设实数$p$满足$\sum_{i=1}^k a_i / b_i^p = 1$,则
\begin{align} T(n) = \Theta \left( n^p \left( 1 + \int_1^n \frac{f(x)}{x^{p+1}} \diff x \right) \right) \end{align}
注意$\sum_{i=1}^k a_i / b_i^p \rightarrow \begin{cases} 0, & p \rightarrow \infty \\ \infty, & p \rightarrow -\infty \end{cases}$,满足要求的$p$总是存在的
设实数$p$满足$\sum_{i=1}^k a_i / b_i^p = 1$,则
\begin{align} T(n) = \Theta \left( n^p \left( 1 + \int_1^n \frac{f(x)}{x^{p+1}} \diff x \right) \right) \end{align}
例:$T(n) = T(n/3) + T(2n/3) + c n$
$a_1 = a_2 = 1$、$b_1 = 3$、$b_2 = 3/2$,$\sum_{i=1}^2 a_i / b_i = 1$,即$p = 1$
\begin{align} T(n) & = \Theta \left( n \left( 1 + \int_1^n \frac{cx}{x^2} \diff x \right) \right) = \Theta \left( n \left( 1 + c \int_1^n \frac{1}{x} \diff x \right) \right) \\[2px] & = \Theta ( n ( 1 + c \ln x |_1^n ) ) \\[2px] & = \Theta ( n ( 1 + c \ln n ) ) \\[2px] & = \Theta ( n \ln n ) \end{align}
设实数$p$满足$\sum_{i=1}^k a_i / b_i^p = 1$,则
\begin{align} T(n) = \Theta \left( n^p \left( 1 + \int_1^n \frac{f(x)}{x^{p+1}} \diff x \right) \right) \end{align}
主定理不适用例:$T(n) = 2 \cdot T(n/2) + n / \lg n$,$a = b = 2$
$n^{\log_b a} = n$,$n / \lg n \ne O(n^{1 - \epsilon})$,Akra-Bazzi 法的$p = 1$
\begin{align} T(n) & = \Theta \left( n \left( 1 + \int_1^n \frac{x}{x^2 \lg x} \diff x \right) \right) = \Theta \left( n \left( 1 + \int_1^n \frac{1}{x \lg x} \diff x \right) \right) \\[2px] & = \Theta \left( n \left( 1 + \int_1^n \frac{\ln 2}{\ln x} \diff \ln x \right) \right) = \Theta ( n ( 1 + \ln \ln n ) ) \\[2px] & = \Theta ( n \lg \lg n ) \end{align}
算法导论 3rd
2-4、4.3-2、4.3-9、4.4-6、4.5-1、4.5-4