@cleardusk 2015-11-27T11:38:23.000000Z

# Reading the Source Code of an Online DL Book (Part 1)


# I/O

## Black-box testing

    training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

    In [15]: training_data[0][0][783]
    Out[15]: array([ 0.], dtype=float32)

    In [16]: training_data[0][0][784]
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    <ipython-input-16-dd65c757aa33> in <module>()
    ----> 1 training_data[0][0][784]

    IndexError: index 784 is out of bounds for axis 0 with size 784

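training_data: [(x, y), (x, y), (x, y), ..., (x, y)]

`training_data` is a list of length 50000 whose elements are tuples `(x, y)`. `x` is a 28 × 28 = 784-dimensional image, stored as a NumPy array of shape (784, 1), which is why indexing `[783]` in the session above returns a one-element array. `y` is 10-dimensional: for example, [0,1,0,0,0,0,0,0,0,0] stands for the digit 1, and the other digits follow the same pattern.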
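The label encoding described above can be sketched in a few lines. This is a minimal pure-Python illustration, not the book's code: `one_hot` and `decode` are hypothetical helper names, and plain lists stand in for the NumPy column vectors that the loader actually returns.

```python
def one_hot(digit, num_classes=10):
    """Return a list with 1.0 at position `digit` and 0.0 elsewhere."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in [0, %d)" % num_classes)
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


def decode(vec):
    """Recover the digit as the index of the largest entry."""
    return max(range(len(vec)), key=lambda i: vec[i])


y = one_hot(1)
print(y)          # [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(y))  # 1
```

Decoding by the position of the largest entry also works on a network's raw output, where the entries are not exactly 0 or 1.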

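`validation_data` and `test_data` are also lists of `(x, y)` tuples, with two differences from `training_data`: their `y` is the plain integer label rather than a 10-dimensional vector, and their lengths differ. The three sets hold 50000, 10000, and 10000 examples respectively. One point is worth stressing: while tuning a model during training, evaluate it on `validation_data`, not on `test_data`. Choosing settings by their `test_data` score tends to overfit to the test set, so the reported result is no longer reliable.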
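The role of the validation set can be shown with a toy example. The sketch below assumes Python 3 and uses made-up one-dimensional data; the threshold "model", the `accuracy` helper, and all the numbers are hypothetical and have nothing to do with MNIST itself. The only point is the order of operations: choose on validation data, then score on test data once.

```python
def accuracy(threshold, data):
    """Fraction of (x, label) pairs where (x > threshold) matches label."""
    correct = sum(1 for x, label in data if (x > threshold) == label)
    return correct / len(data)


# Made-up data: the label is True when x is "large".
validation_data = [(0.2, False), (0.45, False), (0.55, True), (0.8, True)]
test_data = [(0.3, False), (0.7, True)]

# Candidate "hyperparameters": decision thresholds.
candidates = [0.3, 0.5, 0.7]

# Pick the candidate using validation data only.
best = max(candidates, key=lambda t: accuracy(t, validation_data))

# Touch the test set once, after the choice has been made.
print(best, accuracy(best, test_data))  # 0.5 1.0
```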

# Code

    tr_d, va_d, te_d = mnist_loader.load_data()

    In [26]: tr_d[0][49999][783]
    Out[26]: 0.0

    In [27]: tr_d[1]
    Out[27]: array([5, 0, 4, ..., 8, 4, 8])

    In [28]: tr_d[1][0]
    Out[28]: 5

    import cPickle
    import gzip

    def load_data():
        # Open the gzip-compressed pickle and unpack the three datasets.
        f = gzip.open('../data/mnist.pkl.gz', 'rb')
        training_data, validation_data, test_data = cPickle.load(f)
        f.close()
        return (training_data, validation_data, test_data)

    In [34]: import cPickle

    In [35]: help(cPickle)

    NAME
        cPickle - C implementation and optimization of the Python pickle module.

    FUNCTIONS
        Pickler(...)
            Pickler(file, protocol=0) -- Create a pickler.

        Unpickler(...)
            Unpickler(file) -- Create an unpickler.

        dump(...)
            dump(obj, file, protocol=0) -- Write an object in pickle format to the given file.

            See the Pickler docstring for the meaning of optional argument proto.

        dumps(...)
            dumps(obj, protocol=0) -- Return a string containing an object in pickle format.

            See the Pickler docstring for the meaning of optional argument proto.

        load(...)
            load(file) -- Load a pickle from the given file

        loads(...)
            loads(string) -- Load a pickle from the given string

    In [36]: import gzip

    In [37]: help(gzip)

    NAME
        gzip - Functions that read and write gzipped files.

    FUNCTIONS
        open(filename, mode='rb', compresslevel=9)
            Shorthand for GzipFile(filename, mode, compresslevel).

            The filename argument is required; mode defaults to 'rb'
            and compresslevel defaults to 9.

        f = gzip.open('../data/mnist.pkl.gz', 'rb')
        training_data, validation_data, test_data = cPickle.load(f)
        f.close()
        return (training_data, validation_data, test_data)
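The loader above targets Python 2, where `cPickle` exists. A hedged sketch of a Python 3 equivalent follows. It is not the book's code: the `path` parameter and the tiny round-trip dataset are assumptions made so the example runs without the real `mnist.pkl.gz`. The `encoding='latin1'` argument is what Python 3's `pickle.load` typically needs in order to read pickles written by Python 2.

```python
import gzip
import os
import pickle
import tempfile


def load_data(path):
    """Load a gzip-compressed pickle holding (training, validation, test)."""
    # The context manager closes the file even if unpickling fails.
    with gzip.open(path, 'rb') as f:
        # encoding='latin1' lets Python 3 read pickles written by Python 2.
        training_data, validation_data, test_data = pickle.load(
            f, encoding='latin1')
    return training_data, validation_data, test_data


# Round-trip demo with a tiny made-up dataset instead of the real MNIST file.
fake = (([[0.0, 1.0]], [5]), ([[1.0, 0.0]], [0]), ([[0.5, 0.5]], [4]))
tmp_path = os.path.join(tempfile.mkdtemp(), 'tiny.pkl.gz')
with gzip.open(tmp_path, 'wb') as f:
    pickle.dump(fake, f)

tr_d, va_d, te_d = load_data(tmp_path)
print(tr_d[1][0])  # 5
```

Using `with` instead of an explicit `f.close()` is a small design choice: the file handle is released even when the pickle is corrupt and `pickle.load` raises.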

    def load_data_wrapper():
        tr_d, va_d, te_d = load_data()
        # Training set: column-vector inputs paired with 10-dimensional label vectors.
        training_inputs = [np.reshape(x, (784, 1)) for x in tr_d[0]]
        training_results = [vectorized_result(y) for y in tr_d[1]]
        training_data = zip(training_inputs, training_results)
        # Validation and test sets keep the plain integer labels.
        validation_inputs = [np.reshape(x, (784, 1)) for x in va_d[0]]
        validation_data = zip(validation_inputs, va_d[1])
        test_inputs = [np.reshape(x, (784, 1)) for x in te_d[0]]
        test_data = zip(test_inputs, te_d[1])
        return (training_data, validation_data, test_data)
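The reshaping and pairing steps in the wrapper can be illustrated without NumPy. The sketch below is a pure-Python stand-in under stated assumptions: `to_column` mimics `np.reshape(x, (n, 1))`, `one_hot` mimics `vectorized_result` (whose definition is not shown in this post), and the two 4-pixel "images" are made up. It also shows one thing to watch for when running the original code under Python 3, where `zip` no longer returns a list.

```python
def to_column(x):
    """Pure-Python stand-in for np.reshape(x, (n, 1)): one value per row."""
    return [[value] for value in x]


def one_hot(j, num_classes=10):
    """Stand-in for vectorized_result: a column with 1.0 in row j."""
    return [[1.0 if i == j else 0.0] for i in range(num_classes)]


# Made-up miniature of tr_d: two 4-pixel "images" and their labels.
tr_d = ([[0.0, 0.1, 0.2, 0.3], [0.9, 0.8, 0.7, 0.6]], [5, 0])

training_inputs = [to_column(x) for x in tr_d[0]]
training_results = [one_hot(y) for y in tr_d[1]]

# In Python 3, zip returns a lazy iterator; list() gives the indexable
# list that Python 2's zip produced.
training_data = list(zip(training_inputs, training_results))

x, y = training_data[0]
print(len(training_data), len(x), len(x[0]), y[5])  # 2 4 1 [1.0]
```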
