@geek-sjl 2018-09-23T03:46:37.000000Z 字数 2423 阅读 925

两种数列求积算法的分析比较[作业]

两种数列求积算法的分析比较[作业]

概要

比较以下两种数列求积方法

标杆算法（手动滑稽）

product1:

def product1(X):
    res = 1
    for i in range(len(X)):
        res *= X[i]
    return res

product2:

def product2(X):
    if len(X) == 0: return 1
    if len(X) == 1: return X[0]
    if len(X) == 2: return X[0] * X[1]
    l = len(X)
    return product2(X[0:(l + 1) // 2]) * product2(X[(l + 1) // 2: l])

新的尝试

product3:

def product3(X):
    now_len = len(X)
    while (now_len > 0):
        for i in range(0, now_len, 2):
            X[i // 2] = X[i] * X[i + 1]
        now_len /= 2
        if(now_len%2==1):
            X[now_len]=1
    return X[0]

fixme:
only the len of input equal 2^n could the code run.

分析

两种算法的区别

product1的朴素方法直接将每个元素相乘，所需做的乘法运算量为len（X)-1
product2的递归方法有分治的思想，先拆分成子问题再合并结果从而得出最终的结果，所需做的乘法运算量仍为len(X)-1
product3与product2在原理上没有区别，只是少了递归调用的消耗
各个product的区别在于运算顺序的不同
product1中为X1*X2*X3*...*Xn-1*Xn
product2中为(((X1*X2)*(X3*X4))...((Xn-3*Xn-2)*(Xn-1*Xn)))

不同的运算顺序带来的性能差异的原因：

product1的运算到后面会因为临时结果的数据过大而在大数相乘运算上耗费大量的时间，而product2则可以做较少的大数相乘
product2存在大量的递归操作，可能会因为系统运行时分配的栈空间不足而溢出，同时递归也会产生性能消耗
product3继避免了大量的大数运算，又可以避免递归带来的性能开销（尽管随着数据规模的增大递归带来的性能开销已经是缓慢增长的了）

结论猜测

len(X)规模较小时product1的朴素方法更为高效
当len(X)规模较大时product2的递归方法因为不需要更多的大数相乘运算而可以在更短的时间内完成运算（相较于递归产生的性能消耗已经随着输入规模的增大而可以忽略不计）

测试

测试采用的代码如下

import time
def product1(X):
    res = 1
    for i in range(len(X)):
        res *= X[i]
    return res


def product2(X):
    if len(X) == 0: return 1
    if len(X) == 1: return X[0]
    if len(X) == 2: return X[0] * X[1]
    l = len(X)
    return product2(X[0:(l + 1) // 2]) * product2(X[(l + 1) // 2: l])

def product3(X):
    now_len = len(X)
    while (now_len > 0):
        for i in range(0, now_len, 2):
            X[i // 2] = X[i] * X[i + 1]
        now_len /= 2
        if(now_len%2==1):
            X[now_len]=1
    return X[0]


X = []
for i in range(1,pow(2,n)+1):
    X.append(i)

starttime = time.clock()
print("thr res is %d" % product1(X))
endtime = time.clock()
print("the time cost of product1 is %lf" % (endtime - starttime))

starttime = time.clock()
print("thr res is %d" % product2(X))
endtime = time.clock()
print("the time cost of product2 is %lf" % (endtime - starttime))

starttime = time.clock()
print("thr res is %d" % product3(X))
endtime = time.clock()
print("the time cost of product3 is %lf" % (endtime - starttime))

以下为修改goal得到的测试结果

规模值(2^n)	product1耗时(s)	product2耗时(s)	product3耗时(s)
4	0.000015	0.000012	0.000009
8	0.000054	0.000111	0.000044
10	0.000478	0.003026	0.000315
12	0.011373	0.009662	0.009535
14	0.137139	0.086659	0.078398
15	0.549386	0.328016	0.320009
16	2.472980	1.435486	1.419114
17	10.915734	6.482502	6.369947
18	49.294245	28.352825	28.251242

由以上测试结果可以粗略表明，以上猜测的结论正确。

关于product3

从测试结果中可以看到，product3在较小数据规模的时候就已经跑赢了product1，和product2的差距甚至达到了常数级，但随着规模继续增大，product1继续阵亡，product2和product3的差距不断缩小，原因猜测和product2能在数据规模较大时优于product1相似，数据规模已经大到可以忽略递归带来的额外开销了。而且递归开销是成指数性增长，到后期增长已经较为缓慢了。