@natsumi 2017-05-20T15:21:11.000000Z

Weight initialization of BasicNetwork in Encog

Machine Learning


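The weight initialization code looks like this.
According to the Javadoc, reset() uses a Nguyen-Widrow randomizer to generate random values between -1 and 1, and with them resets the weight matrix and the bias values. If the network has fewer than three layers, however, Nguyen-Widrow cannot be used, and a range randomizer initializes the weights with uniformly distributed random values between -1 and 1 instead.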

    // BasicNetwork.java
    /**
     * Reset the weight matrix and the bias values. This will use a
     * Nguyen-Widrow randomizer with a range between -1 and 1. If the network
     * does not have an input, output or hidden layers, then Nguyen-Widrow
     * cannot be used and a simple range randomize between -1 and 1 will be
     * used.
     *
     */
    @Override
    public void reset() {
        getRandomizer().randomize(this);
    }

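getRandomizer() shows that networks with fewer than three layers are not the only ones that cannot use NW initialization: a network whose activation functions are not TANH, Sigmoid, or one of the Elliott functions (Elliott and symmetric Elliott) is also initialized with the range randomizer.

The code makes the condition explicit: NW is used only if every layer of the network uses one of the activation functions listed above. If any single layer fails the check, the network falls back to the uniform initialization.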

    // BasicNetwork.java
    /**
     * ……
     * Range randomizer is also used if the activation function is not
     * TANH, Sigmoid, or the Elliott equivalents.
     *
     * @return the randomizer
     */
    private Randomizer getRandomizer() {
        boolean useNWR = true;

        for (int i = 0; i < this.getLayerCount(); i++) {
            ActivationFunction af = getActivation(i);
            if (af.getClass() != ActivationSigmoid.class
                    && af.getClass() != ActivationTANH.class
                    && af.getClass() != ActivationElliott.class
                    && af.getClass() != ActivationElliottSymmetric.class) {
                useNWR = false;
            }
        }

        if (getLayerCount() < 3) {
            useNWR = false;
        }

        if (useNWR) {
            return new NguyenWidrowRandomizer();
        } else {
            return new RangeRandomizer(-1, 1);
        }
    }
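To make the selection rule easier to test in isolation, here is a minimal sketch that does not depend on Encog. The class `RandomizerChoice`, the enum `Activation`, and the method `usesNguyenWidrow` are hypothetical names invented for this illustration; `LINEAR` simply stands for any activation outside the four accepted ones. It only mirrors the two conditions visible in the excerpt above.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;

// Hypothetical stand-in for the decision in getRandomizer(); not Encog code.
final class RandomizerChoice {

    // Illustrative activation kinds; LINEAR stands for any unsupported one.
    enum Activation { SIGMOID, TANH, ELLIOTT, ELLIOTT_SYMMETRIC, LINEAR }

    // The four activations the excerpt accepts for Nguyen-Widrow.
    private static final EnumSet<Activation> NW_COMPATIBLE = EnumSet.of(
            Activation.SIGMOID, Activation.TANH,
            Activation.ELLIOTT, Activation.ELLIOTT_SYMMETRIC);

    // True only if there are at least three layers and every layer qualifies.
    static boolean usesNguyenWidrow(List<Activation> layers) {
        return layers.size() >= 3 && NW_COMPATIBLE.containsAll(layers);
    }

    public static void main(String[] args) {
        List<Activation> allTanh = Arrays.asList(
                Activation.TANH, Activation.TANH, Activation.TANH);
        List<Activation> oneLinear = Arrays.asList(
                Activation.LINEAR, Activation.SIGMOID, Activation.SIGMOID);
        List<Activation> twoLayers = Arrays.asList(
                Activation.TANH, Activation.TANH);

        System.out.println(usesNguyenWidrow(allTanh));   // true
        System.out.println(usesNguyenWidrow(oneLinear)); // false: one layer fails
        System.out.println(usesNguyenWidrow(twoLayers)); // false: fewer than 3 layers
    }
}
```

The second case is the one worth noticing: a single non-qualifying layer is enough to switch the whole network to the range randomizer.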

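RangeRandomizer does not implement randomize(MLMethod) itself, so the call resolves to its base class, BasicRandomizer. For a BasicNetwork, that method initializes the network layer by layer through randomize(network, i). The method that initializes layer i in turn calls randomize(double), which RangeRandomizer does implement.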

    // BasicRandomizer.java
    /**
     * Randomize the synapses and biases in the basic network based on an array,
     * modify the array. Previous values may be used, or they may be discarded,
     * depending on the randomizer.
     *
     * @param method
     *            A network to randomize.
     */
    @Override
    public void randomize(final MLMethod method) {
        if (method instanceof BasicNetwork) {
            final BasicNetwork network = (BasicNetwork) method;
            for (int i = 0; i < network.getLayerCount() - 1; i++) {
                randomize(network, i);
            }
        } else if (method instanceof MLEncodable) {
            final MLEncodable encode = (MLEncodable) method;
            final double[] encoded = new double[encode.encodedArrayLength()];
            encode.encodeToArray(encoded);
            randomize(encoded);
            encode.decodeFromArray(encoded);
        }
    }
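The loop bound `getLayerCount() - 1` suggests that the weights are handled once per pair of adjacent layers. The sketch below illustrates that idea without Encog; it is a rough model under stated assumptions, not the library's implementation. The class `LayerwiseRandomizeSketch`, the `weights[layer][from][to]` layout, and the use of `java.util.Random` are all assumptions made for the example, and bias weights are left out for brevity.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Hypothetical model of layer-by-layer randomization; not Encog code.
final class LayerwiseRandomizeSketch {

    // One matrix per pair of adjacent layers: weights[i][from][to]
    // connects layer i to layer i + 1 (assumed layout, biases omitted).
    static double[][][] allocate(int[] layerSizes) {
        double[][][] weights = new double[layerSizes.length - 1][][];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = new double[layerSizes[i]][layerSizes[i + 1]];
        }
        return weights;
    }

    // Visit layers 0 .. layerCount - 2 and pass every weight through perValue,
    // which plays the role of a per-value randomize(double).
    static void randomize(double[][][] weights, DoubleUnaryOperator perValue) {
        for (int layer = 0; layer < weights.length; layer++) {
            for (double[] row : weights[layer]) {
                for (int to = 0; to < row.length; to++) {
                    row[to] = perValue.applyAsDouble(row[to]);
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][][] weights = allocate(new int[] {2, 3, 1});
        Random rng = new Random(7L);
        // The old value is ignored, as in a range randomizer.
        randomize(weights, old -> -1.0 + 2.0 * rng.nextDouble());
        System.out.println("weight matrices: " + weights.length); // 2
    }
}
```

A network with three layers therefore has two weight matrices to fill, which matches the two iterations of the loop in the excerpt.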

randomize(double) in turn calls the base class method nextDouble(final double min, final double max):

    // RangeRandomizer.java
    /**
     * Generate a random number based on the range specified in the constructor.
     *
     * @param d
     *            The range randomizer ignores this value.
     * @return The random number.
     */
    public double randomize(final double d) {
        return nextDouble(this.min, this.max);
    }

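How is the base class's random field assigned? The constructor shows it. So the conclusion is: when the range randomizer is selected, the network weights are initialized with random numbers between -1 and 1 generated by the Mersenne Twister algorithm, seeded with System.nanoTime().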

    // BasicRandomizer.java
    /**
     * Generate a random number in the specified range.
     *
     * @param min
     *            The minimum value.
     * @param max
     *            The maximum value.
     * @return A random number.
     */
    public final double nextDouble(final double min, final double max) {
        final double range = max - min;
        return (range * this.random.nextDouble()) + min;
    }

    /**
     * Construct a random number generator with a random(current time) seed. If
     * you want to set your own seed, just call "getRandom().setSeed".
     */
    public BasicRandomizer() {
        this.random = new MersenneTwisterGenerateRandom(System.nanoTime());
    }