@natsumi · 2017-01-22

RBFNetwork in Encog

Machine Learning


Encog API documentation:
http://heatonresearch-site.s3-website-us-east-1.amazonaws.com/javadoc/encog-3.3/index.html

1. Constructors

There are three of them.

1.1 The no-argument constructor

```java
public RBFNetwork() {
    this.flat = new FlatNetworkRBF();
}
```

The FlatNetworkRBF no-argument constructor called here is empty, so this constructor does essentially nothing: it only sets up a skeleton that is not yet usable and still needs further configuration.

1.2 Constructor with default RBF centers and widths

```java
/**
 * Construct RBF network.
 *
 * @param inputCount
 *            The input count.
 * @param hiddenCount
 *            The hidden count.
 * @param outputCount
 *            The output count.
 * @param t
 *            The RBF type.
 */
public RBFNetwork(final int inputCount, final int hiddenCount,
        final int outputCount, final RBFEnum t) {
    if (hiddenCount == 0) {
        throw new NeuralNetworkError(
                "RBF network cannot have zero hidden neurons.");
    }

    final RadialBasisFunction[] rbf = new RadialBasisFunction[hiddenCount];

    // Set the standard RBF neuron width to a default value.
    // Literature seems to suggest this is a good default value.
    final double volumeNeuronWidth = 2.0 / hiddenCount;

    // Build the FlatNetwork from the input, hidden and output counts.
    // RBFNetwork is only a wrapper; the actual network is the member 'flat'.
    this.flat = new FlatNetworkRBF(inputCount, rbf.length, outputCount, rbf);

    try {
        // try this: set up the radial basis functions
        setRBFCentersAndWidthsEqualSpacing(-1, 1, t, volumeNeuronWidth,
                false);
    } catch (final EncogError ex) {
        // if we have the wrong number of hidden neurons, try this
        randomizeRBFCentersAndWidths(-1, 1, t);
    }
}
```
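For instance, a minimal construction using these defaults might look like this (RBFEnum.Gaussian is one of the RBF types in the linked API; the counts are illustrative):

```java
// 2 inputs, 16 hidden Gaussian RBF neurons, 1 output. 16 = 4^2 satisfies
// the "integer side length" constraint checked below; centers and widths
// are filled in automatically.
RBFNetwork network = new RBFNetwork(2, 16, 1, RBFEnum.Gaussian);
```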

The constructor calls setRBFCentersAndWidthsEqualSpacing(-1, 1, t, volumeNeuronWidth, false) to set up the radial basis functions. When no centers and widths are given explicitly, the default behavior is to divide the n-dimensional variable space equally among all hidden neurons.

```java
/**
 * Equally spaces all hidden neurons within the n dimensional variable
 * space.
 *
 * @param minPosition
 *            The minimum position neurons should be centered. Typically 0.
 * @param maxPosition
 *            The maximum position neurons should be centered. Typically 1
 * @param volumeNeuronRBFWidth
 *            The neuron width of neurons within the mesh.
 * @param useWideEdgeRBFs
 *            Enables wider RBF's around the boundary of the neuron mesh.
 */
public void setRBFCentersAndWidthsEqualSpacing(final double minPosition,
        final double maxPosition, final RBFEnum t,
        final double volumeNeuronRBFWidth, final boolean useWideEdgeRBFs) {
    final int totalNumHiddenNeurons = this.flat.getRBF().length;
    final int dimensions = getInputCount();

    // minPosition and maxPosition are the minimum and maximum center
    // coordinates of each dimension; their difference is the span to be
    // divided equally. The constructor passes -1 and 1, so
    // disMinMaxPosition = 2.
    final double disMinMaxPosition = Math.abs(maxPosition - minPosition);

    // Check to make sure we have the correct number of neurons for the
    // provided dimensions. Since the space is partitioned equally, the
    // hidden neuron count (the number of grid cells) must equal
    // expectedSideLength^dimensions, where expectedSideLength is the number
    // of grid points per dimension (e.g. cutting disMinMaxPosition into
    // three equal parts gives 3 + 1 = 4 grid points).
    final int expectedSideLength = (int) Math.pow(totalNumHiddenNeurons,
            1.0 / dimensions);
    final double cmp = Math.pow(totalNumHiddenNeurons, 1.0 / dimensions);
    if (expectedSideLength != cmp) {
        throw new NeuralNetworkError(
                "Total number of RBF neurons must be some integer to the power of 'dimensions'.\n"
                        + Format.formatDouble(expectedSideLength, 5)
                        + " <> " + Format.formatDouble(cmp, 5));
    }

    // volumeNeuronRBFWidth was passed in as 2.0 / hiddenCount.
    final double edgeNeuronRBFWidth = 2.5 * volumeNeuronRBFWidth;

    final double[][] centers = new double[totalNumHiddenNeurons][];
    final double[] widths = new double[totalNumHiddenNeurons];

    // Compute the center and width of every hidden neuron in turn.
    for (int i = 0; i < totalNumHiddenNeurons; i++) {
        centers[i] = new double[dimensions];

        // sideLength is the number of grid points per dimension.
        final int sideLength = expectedSideLength;

        // Evenly distribute the volume neurons.
        int temp = i;

        // First determine the centers
        for (int j = dimensions; j > 0; j--) {
            // i + j * sidelength + k * sidelength^2 + ... l * sidelength^n
            // i - neuron number in x direction, i.e. 0,1,2,3
            // j - neuron number in y direction, i.e. 0,1,2,3
            // Following example assumes sidelength of 4
            // e.g Neuron 5 - x position is (int)5/4 * 0.33 = 0.33
            // then take modulus of 5%4 = 1
            // Neuron 5 - y position is (int)1/1 * 0.33 = 0.33
            centers[i][j - 1] = ((int) (temp / Math.pow(sideLength, j - 1)) * (disMinMaxPosition / (sideLength - 1)))
                    + minPosition;
            temp = temp % (int) (Math.pow(sideLength, j - 1));
        }

        // Now set the widths
        boolean contains = false;
        for (int z = 0; z < centers[0].length; z++) {
            // Arguably this should compare against the parameters instead:
            // if ((centers[i][z] == minPosition) || (centers[i][z] == maxPosition)) {
            if ((centers[i][z] == 1.0) || (centers[i][z] == 0.0)) {
                contains = true;
            }
        }

        // useWideEdgeRBFs enables wider RBFs on the edge of the neuron mesh,
        // so 'contains' marks whether neuron i lies on the edge.
        if (contains && useWideEdgeRBFs) {
            widths[i] = edgeNeuronRBFWidth;
        } else {
            widths[i] = volumeNeuronRBFWidth;
        }
    }

    setRBFCentersAndWidths(centers, widths, t);
}
```

The example in the comments partitions a two-dimensional space, producing 4^2 = 16 hidden-neuron centers. The center coordinates run from a minimum of 0 to a maximum of 1; splitting the interval [0, 1] into three equal parts gives the two interior grid points 0.33 and 0.66. Throughout the loop, i is the hidden neuron index.

```
// Following example assumes sidelength of 4
// e.g. Neuron 5 - x position is (int)5/4 * 0.33 = 0.33
centers[5][1] = (int)(5 / 4) * 0.33 = 0.33
// then take modulus of 5 % 4 = 1
temp = 5 % 4 = 1
// Neuron 5 - y position is (int)1/1 * 0.33 = 0.33
centers[5][0] = (int)(1 / 1) * 0.33 = 0.33
```
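To check the indexing, here is a standalone sketch (not Encog code) that reproduces the center formula above for 16 neurons in 2 dimensions over [0, 1]:

```java
// Enumerate the 4x4 grid of RBF centers exactly as the loop in
// setRBFCentersAndWidthsEqualSpacing does.
public class GridCenters {
    public static void main(String[] args) {
        final int totalNeurons = 16, dimensions = 2, sideLength = 4;
        final double minPosition = 0.0, maxPosition = 1.0;
        final double span = maxPosition - minPosition; // disMinMaxPosition
        for (int i = 0; i < totalNeurons; i++) {
            double[] center = new double[dimensions];
            int temp = i;
            for (int j = dimensions; j > 0; j--) {
                center[j - 1] = ((int) (temp / Math.pow(sideLength, j - 1)))
                        * (span / (sideLength - 1)) + minPosition;
                temp = temp % (int) Math.pow(sideLength, j - 1);
            }
            // neuron 5 prints (0.33, 0.33), matching the trace above
            System.out.printf("neuron %2d -> (%.2f, %.2f)%n",
                    i, center[0], center[1]);
        }
    }
}
```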

1.3 Constructor with custom RBFs

Set up the array rbf of radial basis functions yourself, then pass it to this constructor to build the RBF network:

```java
/**
 * Construct RBF network.
 *
 * @param inputCount
 *            The input count.
 * @param outputCount
 *            The output count.
 * @param rbf
 *            The RBF type.
 */
public RBFNetwork(final int inputCount, final int outputCount,
        final RadialBasisFunction[] rbf) {
    this.flat = new FlatNetworkRBF(inputCount, rbf.length, outputCount, rbf);
    this.flat.setRBF(rbf);
}
```

For how to populate the rbf array with radial basis functions, setRBFCentersAndWidths is a useful reference:

```java
/**
 * Array containing center position. Row n contains centers for neuron n.
 * Row n contains x elements for x number of dimensions.
 *
 * @param centers
 *            The centers.
 * @param widths
 *            Array containing widths. Row n contains widths for neuron n.
 *            Row n contains x elements for x number of dimensions.
 * @param t
 *            The RBF Function to use for this layer.
 */
public void setRBFCentersAndWidths(final double[][] centers,
        final double[] widths, final RBFEnum t) {
    for (int i = 0; i < this.flat.getRBF().length; i++) {
        setRBFFunction(i, t, centers[i], widths[i]);
    }
}
```
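As a usage sketch (hedged: it assumes GaussianFunction exposes a (peak, centers, width) constructor in the Encog 3.3 API; the values are illustrative):

```java
// Two hand-configured Gaussian RBFs for a 2-input, 1-output network.
RadialBasisFunction[] rbf = new RadialBasisFunction[2];
rbf[0] = new GaussianFunction(1.0, new double[] { -0.5, -0.5 }, 0.5);
rbf[1] = new GaussianFunction(1.0, new double[] { 0.5, 0.5 }, 0.5);
RBFNetwork network = new RBFNetwork(2, 1, rbf);

// Alternatively, overwrite the centers and widths afterwards:
double[][] centers = { { -0.5, -0.5 }, { 0.5, 0.5 } };
double[] widths = { 0.5, 0.5 };
network.setRBFCentersAndWidths(centers, widths, RBFEnum.Gaussian);
```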

2. Setting the network weights

2.1 The reset methods

RBFNetwork provides two reset methods for resetting (i.e. randomly initializing) the weights:

```java
/**
 * Reset the weights.
 */
@Override
public void reset() {
    (new RangeRandomizer(-1, 1)).randomize(this);
}

/**
 * Reset the weights with a seed.
 */
@Override
public void reset(int seed) {
    ConsistentRandomizer randomizer = new ConsistentRandomizer(-1, 1, seed);
    randomizer.randomize(this);
}
```
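A quick sketch of the practical difference (usage is hypothetical; 4 hidden neurons satisfy the 2^2 grid constraint from section 1.2):

```java
RBFNetwork network = new RBFNetwork(2, 4, 1, RBFEnum.Gaussian);
network.reset();     // different weights on every call
network.reset(1234); // seeded: produces a reproducible pseudo-random state
network.reset(1234); // same seed -> same weights as the previous call
```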

The two reset methods use different randomizer types, but neither randomizer implements an overload of randomize that takes this (an RBFNetwork) as its argument, so both calls actually land in the same randomize method of the common base class BasicRandomizer. ConsistentRandomizer and RangeRandomizer are functionally almost identical; the difference is that ConsistentRandomizer takes a seed and produces a reproducible pseudo-random sequence.

Here this is of type RBFNetwork, which extends BasicML and implements the interfaces MLError, MLRegression, ContainsFlat, MLResettable, and MLEncodable, so it matches instanceof MLEncodable.

```java
// BasicRandomizer.java
/**
 * Randomize the synapses and biases in the basic network based on an array,
 * modify the array. Previous values may be used, or they may be discarded,
 * depending on the randomizer.
 *
 * @param method
 *            A network to randomize.
 */
@Override
public void randomize(final MLMethod method) {
    if (method instanceof BasicNetwork) {
        final BasicNetwork network = (BasicNetwork) method;
        for (int i = 0; i < network.getLayerCount() - 1; i++) {
            randomize(network, i);
        }
    } else if (method instanceof MLEncodable) { // RBFNetwork takes this branch
        final MLEncodable encode = (MLEncodable) method;
        final double[] encoded = new double[encode.encodedArrayLength()];
        encode.encodeToArray(encoded);
        randomize(encoded);
        encode.decodeFromArray(encoded);
    }
}
```

What happens here is that the network is encoded into a double array, the randomize method overwrites the values in that array, and the array is then decoded to produce the reset RBF network.
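In other words, reset() is equivalent to doing the three steps by hand, using only methods shown in this post (network is an already-constructed RBFNetwork):

```java
double[] encoded = new double[network.encodedArrayLength()];
network.encodeToArray(encoded);                // network -> double array
new RangeRandomizer(-1, 1).randomize(encoded); // overwrite every element
network.decodeFromArray(encoded);              // double array -> network
```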

```java
// BasicRandomizer.java
/**
 * Randomize the array based on an array, modify the array. Previous values
 * may be used, or they may be discarded, depending on the randomizer.
 *
 * @param d
 *            An array to randomize.
 */
@Override
public void randomize(final double[] d) {
    randomize(d, 0, d.length);
}

/**
 * Randomize the array based on an array, modify the array. Previous values
 * may be used, or they may be discarded, depending on the randomizer.
 *
 * @param d
 *            An array to randomize.
 * @param begin
 *            The beginning element of the array.
 * @param size
 *            The size of the array to copy.
 */
@Override
public void randomize(final double[] d, final int begin,
        final int size) {
    for (int i = 0; i < size; i++) {
        d[begin + i] = randomize(d[begin + i]);
    }
}
```

The comments spell it out: the array encoded, produced by encoding the original RBF network, is passed to randomize(final double[] d); the original values may be used or discarded, depending on the randomizer, because randomize(final double d) is not implemented in BasicRandomizer but left to the subclasses.
RangeRandomizer's implementation, shown below, simply ignores the old value and draws a new random number within the configured range. ConsistentRandomizer's implementation is similar, except for the extra seed.

```java
// RangeRandomizer.java
/**
 * Generate a random number based on the range specified in the constructor.
 *
 * @param d
 *            The range randomizer ignores this value.
 * @return The random number.
 */
public double randomize(final double d) {
    return nextDouble(this.min, this.max);
}
```
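nextDouble is defined in BasicRandomizer; its likely shape (a guess at the implementation, not verbatim source) is a plain uniform draw:

```java
// Plausible sketch of BasicRandomizer.nextDouble(min, max); the real
// method presumably uses the randomizer's own java.util.Random instance.
public double nextDouble(final double min, final double max) {
    return min + getRandom().nextDouble() * (max - min);
}
```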

2.2 The encodeToArray method

The reset process above is now mostly clear; all that remains is the encoding and decoding methods.

```java
/**
 * Defines a Machine Learning Method that can be encoded to a double array.
 * This is very useful for certain training, such as genetic algorithms
 * and simulated annealing.
 */
public interface MLEncodable extends MLMethod {
    /**
     * @return The length of an encoded array.
     */
    int encodedArrayLength();

    /**
     * Encode the object to the specified array.
     * @param encoded The array.
     */
    void encodeToArray(double[] encoded);

    /**
     * Decode an array to this object.
     * @param encoded The encoded array.
     */
    void decodeFromArray(double[] encoded);
}
```
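The class comment mentions genetic algorithms and simulated annealing; the encoding makes that kind of weight perturbation trivial. A sketch under that assumption (perturb, sigma, and rnd are hypothetical names, not Encog API):

```java
// Add Gaussian noise to every encoded parameter -- a simulated-annealing
// style move. Works for any MLEncodable, including RBFNetwork.
static void perturb(MLEncodable method, double sigma, java.util.Random rnd) {
    double[] encoded = new double[method.encodedArrayLength()];
    method.encodeToArray(encoded);
    for (int i = 0; i < encoded.length; i++) {
        encoded[i] += sigma * rnd.nextGaussian();
    }
    method.decodeFromArray(encoded);
}
```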

Below we look at how RBFNetwork implements these interface methods.

```java
/**
 * Compute the length of the encoded array.
 */
@Override
public int encodedArrayLength() {
    int result = this.getFlat().getWeights().length; // all the weights
    for (RadialBasisFunction rbf : flat.getRBF()) {
        result += rbf.getCenters().length + 1; // plus each RBF's centers and width
    }
    return result;
}
```
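For a concrete count, take the 2-16-1 Gaussian network from section 1.2 and assume no bias neurons (which is what the FlatNetworkRBF constructor in section 3 suggests by passing a bias activation of 0.0):

```java
// weights : hidden -> output (16 * 1) + input -> hidden (2 * 16) = 48
// RBFs    : 16 * (2 centers + 1 width)                           = 48
RBFNetwork network = new RBFNetwork(2, 16, 1, RBFEnum.Gaussian);
System.out.println(network.encodedArrayLength()); // expected: 96
```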

The length computation reveals what the array contains: all the weights, plus the centers and width of every RBF. The weights are stored in this.flat; exactly how many weights the network has, and how they are laid out, is determined by the init method of the FlatNetworkRBF class (see section 2.3).
The encoding method below shows the order of these items in the array: the weight array, then rbf[0]'s width, rbf[0]'s center array, rbf[1]'s width, rbf[1]'s center array, and so on.
The decode method simply does the reverse of encode.

```java
/**
 * {@inheritDoc}
 */
@Override
public void encodeToArray(double[] encoded) {
    EngineArray.arrayCopy(getFlat().getWeights(), 0, encoded, 0, getFlat()
            .getWeights().length);
    int index = getFlat().getWeights().length;
    for (RadialBasisFunction rbf : flat.getRBF()) {
        encoded[index++] = rbf.getWidth();
        EngineArray.arrayCopy(rbf.getCenters(), 0, encoded, index,
                rbf.getCenters().length);
        index += rbf.getCenters().length;
    }
}
```

2.3 Layout of the weights

How many weights the network has in total, and how they are arranged, is determined by the init method of the FlatNetworkRBF class. init first assigns the various counts.
As for the activation functions: init is actually implemented in the FlatNetwork class and is shared by all network types.

```java
// FlatNetwork.java
/**
 * Construct a flat network.
 *
 * @param layers
 *            The layers of the network to create.
 */
public void init(final FlatLayer[] layers) {
    final int layerCount = layers.length;

    this.inputCount = layers[0].getCount();
    this.outputCount = layers[layerCount - 1].getCount();

    this.layerCounts = new int[layerCount]; // neuron count of each layer
    this.layerContextCount = new int[layerCount]; // context neuron count of each layer
    this.weightIndex = new int[layerCount]; // start index of each layer's weights in the weight array
    this.layerIndex = new int[layerCount]; // start index of each layer's neurons
    this.activationFunctions = new ActivationFunction[layerCount]; // activation function of each layer
    this.layerFeedCounts = new int[layerCount]; // neurons fed by the previous layer (bias and context neurons do not count)

    // How the next two arrays are filled is shown further down.
    // contextTargetOffset: the context target for each layer. This is how the
    // backwards connections are formed for the recurrent neural network. Each
    // layer either has a zero, which means no context target, or a layer
    // number that indicates the target layer.
    this.contextTargetOffset = new int[layerCount];
    // contextTargetSize: the size of each of the context targets. If a
    // layer's contextTargetOffset is zero, its contextTargetSize should also
    // be zero. The contextTargetSize should always match the feed count of
    // the targeted context layer.
    this.contextTargetSize = new int[layerCount];

    this.biasActivation = new double[layerCount]; // bias activation per layer; usually 1 = has a bias neuron, 0 = none

    int index = 0;
    int neuronCount = 0;
    int weightCount = 0;

    // i runs from high to low, i.e. from the output layer to the input
    // layer, while index starts at 0; so index = 0 refers to the output
    // layer and index = layers.length - 1 to the input layer.
    for (int i = layers.length - 1; i >= 0; i--) {
        final FlatLayer layer = layers[i];
        FlatLayer nextLayer = null;

        if (i > 0) {
            nextLayer = layers[i - 1];
        }

        this.biasActivation[index] = layer.getBiasActivation();
        this.layerCounts[index] = layer.getTotalCount();
        this.layerFeedCounts[index] = layer.getCount();
        this.layerContextCount[index] = layer.getContextCount();
        this.activationFunctions[index] = layer.getActivation();

        neuronCount += layer.getTotalCount();

        if (nextLayer != null) {
            weightCount += layer.getCount() * nextLayer.getTotalCount();
        }

        if (index == 0) {
            this.weightIndex[index] = 0;
            this.layerIndex[index] = 0;
        } else {
            this.weightIndex[index] = this.weightIndex[index - 1]
                    + (this.layerCounts[index] * this.layerFeedCounts[index - 1]);
            this.layerIndex[index] = this.layerIndex[index - 1]
                    + this.layerCounts[index - 1];
        }

        int neuronIndex = 0;
        for (int j = layers.length - 1; j >= 0; j--) {
            if (layers[j].getContextFedBy() == layer) {
                this.hasContext = true;
                // Number of context neurons receiving feedback from layer 'index'.
                this.contextTargetSize[index] = layers[j].getContextCount();
                // Start index of the context neurons receiving feedback from
                // layer 'index'. If both values stay zero, no neuron receives
                // feedback from this layer.
                this.contextTargetOffset[index] = neuronIndex
                        + (layers[j].getTotalCount() - layers[j]
                                .getContextCount());
            }
            neuronIndex += layers[j].getTotalCount();
        }

        index++;
    }

    this.beginTraining = 0;
    this.endTraining = this.layerCounts.length - 1;

    this.weights = new double[weightCount];
    this.layerOutput = new double[neuronCount]; // output of every neuron
    this.layerSums = new double[neuronCount]; // weighted input sum of every neuron; applying the activation function yields layerOutput

    clearContext();
}
```
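Tracing init on the 2-16-1 RBF network from earlier (the FlatNetworkRBF constructor in section 3 passes a bias activation of 0.0 to every layer, which, if I read FlatLayer correctly, means no bias neurons, so getTotalCount() equals getCount()):

- index 0 (output layer): layerCounts[0] = 1, weightIndex[0] = 0, layerIndex[0] = 0; contributes 1 × 16 = 16 weights.
- index 1 (hidden layer): layerCounts[1] = 16, weightIndex[1] = 0 + 16 × 1 = 16, layerIndex[1] = 1; contributes 16 × 2 = 32 weights.
- index 2 (input layer): layerCounts[2] = 2, weightIndex[2] = 16 + 2 × 16 = 48, layerIndex[2] = 17; contributes no weights.

So weights.length = 48 and layerOutput.length = 19, which matches the encodedArrayLength arithmetic in section 2.2.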

An ordinary network (such as the BasicNetwork in XorHelloWorld) computes its output by calling FlatNetwork's per-layer routine in a loop, layer by layer, until the final output is produced.
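FlatNetwork.compute itself (paraphrased from memory as a sketch, not verbatim Encog source) just seeds the input layer's slice of layerOutput and then walks the layers toward the output; the per-layer routine computeLayer follows below:

```java
// Paraphrased sketch of FlatNetwork.compute(double[], double[]).
// Recall that index 0 is the output layer, so the input layer's neurons
// occupy the tail of layerOutput and the loop counts down to 1.
public void compute(final double[] input, final double[] output) {
    final int sourceIndex = this.layerOutput.length
            - this.layerCounts[this.layerCounts.length - 1];
    EngineArray.arrayCopy(input, 0, this.layerOutput, sourceIndex,
            this.inputCount);
    for (int i = this.layerIndex.length - 1; i > 0; i--) {
        computeLayer(i);
    }
    EngineArray.arrayCopy(this.layerOutput, 0, output, 0, this.outputCount);
}
```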

```java
// FlatNetwork.java
/**
 * Calculate a layer.
 *
 * @param currentLayer
 *            The layer to calculate.
 */
protected void computeLayer(final int currentLayer) {
    final int inputIndex = this.layerIndex[currentLayer];
    final int outputIndex = this.layerIndex[currentLayer - 1];
    final int inputSize = this.layerCounts[currentLayer];
    final int outputSize = this.layerFeedCounts[currentLayer - 1];

    int index = this.weightIndex[currentLayer - 1];

    final int limitX = outputIndex + outputSize;
    final int limitY = inputIndex + inputSize;

    // weight values
    for (int x = outputIndex; x < limitX; x++) {
        double sum = 0;
        for (int y = inputIndex; y < limitY; y++) {
            sum += this.weights[index++] * this.layerOutput[y];
        }
        this.layerSums[x] = sum;
        this.layerOutput[x] = sum;
    }

    this.activationFunctions[currentLayer - 1].activationFunction(
            this.layerOutput, outputIndex, outputSize);

    // update context values
    final int offset = this.contextTargetOffset[currentLayer];
    EngineArray.arrayCopy(this.layerOutput, outputIndex,
            this.layerOutput, offset, this.contextTargetSize[currentLayer]);
}
```

From this we can read off how the weights are arranged in the weight array.
[Figure: layout of the weights in the weight array of an Encog RBF network — to be redrawn.]
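In lieu of the figure, the layout for the 2-16-1 example can be read off from the weightIndex values computed above (a derived sketch, not Encog documentation):

```java
// weights[0 .. 15]  : hidden -> output connections (weightIndex[0] = 0)
// weights[16 .. 47] : input  -> hidden connections (weightIndex[1] = 16);
//                     meaningless in an RBF network, left at zero (section 3)
```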

3. How the RBF activation functions act on the network

The actual RBF network is wrapped in the FlatNetworkRBF class, whose main constructor is shown below.

```java
// FlatNetworkRBF.java
/**
 * Construct an RBF flat network.
 *
 * @param inputCount
 *            The number of input neurons. (also the number of dimensions)
 * @param hiddenCount
 *            The number of hidden neurons.
 * @param outputCount
 *            The number of output neurons.
 * @param rbf
 *            The radial basis functions to use.
 */
public FlatNetworkRBF(final int inputCount, final int hiddenCount,
        final int outputCount, final RadialBasisFunction[] rbf) {
    FlatLayer[] layers = new FlatLayer[3];

    this.rbf = rbf;

    layers[0] = new FlatLayer(new ActivationLinear(), inputCount, 0.0);
    layers[1] = new FlatLayer(new ActivationLinear(), hiddenCount, 0.0);
    layers[2] = new FlatLayer(new ActivationLinear(), outputCount, 0.0);

    init(layers);
}
```

When each layer is created, the first argument is the activation function; here every layer uses ActivationLinear. As that class's comment notes, it merely passes the input value through without any processing:

The Linear layer is really not an activation function at all. The input is simply passed on, unmodified, to the output.

Why does an RBF network, which does have activation functions, create all its layers without one? Because FlatLayer is the layer type of an ordinary network; the radial basis functions that act as the hidden layer's activation are maintained by FlatNetworkRBF itself.

Computing an RBF network's output is somewhat special; the computation lives in the method below. The weights between the input layer and the hidden layer have no meaning and are all set to zero.

```java
// FlatNetworkRBF.java
/**
 * Calculate the output for the given input.
 *
 * @param x
 *            The input.
 * @param output
 *            Output will be placed here.
 */
@Override
public void compute(final double[] x, final double[] output) {
    int outputIndex = this.getLayerIndex()[1];

    // Compute the hidden layer outputs from the radial basis functions.
    for (int i = 0; i < rbf.length; i++) {
        double o = this.rbf[i].calculate(x);
        this.getLayerOutput()[outputIndex + i] = o;
    }

    // now compute the output
    // The output layer is computed exactly as in BasicNetwork, so this
    // simply calls FlatNetwork's per-layer routine.
    computeLayer(1);
    EngineArray.arrayCopy(this.getLayerOutput(), 0, output, 0, this
            .getOutputCount());
}
```
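Putting it all together, a hedged end-to-end sketch (BasicMLData and the MLRegression-style compute(MLData) call are from the Encog API; the values are illustrative):

```java
// Build a 2-16-1 Gaussian RBF network, initialize it reproducibly,
// and run one forward pass.
RBFNetwork network = new RBFNetwork(2, 16, 1, RBFEnum.Gaussian);
network.reset(42);

MLData input = new BasicMLData(new double[] { 0.25, 0.75 });
MLData output = network.compute(input); // hidden layer via RBFs, then computeLayer(1)
System.out.println(output.getData(0));
```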