Closed
Changes from 1 commit
Commits
143 commits
5bba9dd
Create ParallelANN.scala
bgreeven Jul 3, 2014
5874743
Create GeneralizedSteepestDescendAlgorithm
bgreeven Jul 3, 2014
8c3ff4a
Create TestParallelANN.scala
bgreeven Jul 3, 2014
96a0970
Create TestParallelANNgraphics.scala
bgreeven Jul 3, 2014
69b0e59
Update TestParallelANN.scala
bgreeven Jul 3, 2014
3f528b9
Update TestParallelANN.scala
bgreeven Jul 3, 2014
b1972b1
Update TestParallelANNgraphics.scala
bgreeven Jul 3, 2014
dd79615
Update GeneralizedSteepestDescendAlgorithm
bgreeven Jul 30, 2014
1f6de6a
Update ParallelANN.scala
bgreeven Jul 30, 2014
011c10b
Update GeneralizedSteepestDescendAlgorithm
bgreeven Jul 30, 2014
e7e29aa
Update TestParallelANN.scala
bgreeven Jul 30, 2014
100ad4b
Update TestParallelANNgraphics.scala
bgreeven Jul 30, 2014
c9fc3f4
Rename GeneralizedSteepestDescendAlgorithm to GeneralizedSteepestDesc…
bgreeven Aug 1, 2014
78f99dc
Update TestParallelANNgraphics.scala
bgreeven Aug 1, 2014
43103f0
Update and rename GeneralizedSteepestDescendAlgorithm.scala to Genera…
bgreeven Aug 21, 2014
2ecc7d5
Update ParallelANN.scala
bgreeven Aug 21, 2014
d80fe63
Update TestParallelANN.scala
bgreeven Aug 21, 2014
149a726
Update TestParallelANNgraphics.scala
bgreeven Aug 21, 2014
ace988e
Create mllib-ann.md
bgreeven Aug 22, 2014
9f75f59
Update mllib-ann.md
bgreeven Aug 22, 2014
c81de0c
Update mllib-ann.md
bgreeven Aug 22, 2014
3c456b5
Update mllib-ann.md
bgreeven Aug 22, 2014
3807e73
Update mllib-ann.md
bgreeven Aug 22, 2014
5236a9d
Update mllib-ann.md
bgreeven Aug 22, 2014
443ea7e
Update and rename GeneralizedSteepestDescentAlgorithm.scala to Genera…
bgreeven Sep 2, 2014
3466f95
Update ParallelANN.scala
bgreeven Sep 2, 2014
aed39c6
Update TestParallelANN.scala
bgreeven Sep 2, 2014
1972c69
ANN test suite: learning XOR function
avulanov Sep 5, 2014
d04c1d6
Removing dependency on GeneralizedModel and Algorithm
avulanov Sep 9, 2014
bd4508b
Addressing reviewers comments: interface refactoring
avulanov Sep 9, 2014
c032476
Apache header
avulanov Sep 9, 2014
3e90c4d
Update ArtificialNeuralNetwork.scala
bgreeven Sep 10, 2014
71ca727
Update and rename TestParallelANN.scala to TestANN.scala
bgreeven Sep 10, 2014
293d013
Delete TestParallelANNgraphics.scala
bgreeven Sep 10, 2014
18ac979
Update ArtificialNeuralNetwork.scala
bgreeven Sep 15, 2014
daf1375
Update ANNSuite.scala
bgreeven Sep 15, 2014
5e3345c
minor style fixes
avulanov Sep 17, 2014
6c657c3
Forward propagation code sharing
avulanov Sep 17, 2014
5ab0263
Update ArtificialNeuralNetwork.scala
bgreeven Sep 22, 2014
577a13a
Update ANNSuite.scala
bgreeven Sep 22, 2014
d048878
Delete TestANN.scala
bgreeven Sep 22, 2014
90195fa
Create ANNDemo.scala
bgreeven Sep 22, 2014
7c90249
Update mllib-ann.md
bgreeven Sep 22, 2014
87f630b
Update mllib-ann.md
bgreeven Sep 22, 2014
986f37a
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
8e3e2d5
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
d2b80fe
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
1a1c10b
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
2a9554b
Update mllib-ann.md
bgreeven Sep 25, 2014
40197ef
Update ANNDemo.scala
bgreeven Sep 25, 2014
589205f
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
6390947
Update ANNSuite.scala
bgreeven Sep 25, 2014
abfb0f5
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
039df76
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
aff66ae
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
e78dcd6
Minor style fixes
avulanov Sep 26, 2014
ccbed58
Unit test parameter
avulanov Sep 26, 2014
e3dc003
Update ANNSuite.scala
bgreeven Sep 28, 2014
dd47d75
ANN classifier draft
avulanov Oct 28, 2014
3e7eca1
Update ArtificialNeuralNetwork.scala
bgreeven Oct 28, 2014
f8d5a05
Update ANNSuite.scala
bgreeven Oct 28, 2014
57b9147
XOR classification test with draft
avulanov Oct 28, 2014
c189bb2
ANN classifier refactoring in progress: need random weight function
avulanov Oct 28, 2014
c4baf79
Minor stylefix, add additional function for customized initial weights
avulanov Oct 28, 2014
d0836ed
Model as a parameters for classifier
avulanov Oct 31, 2014
01bbca0
Scala style fix
avulanov Nov 5, 2014
c7e5323
Encoding of output with 0.1 and 0.9 by bgreeven suggestion
avulanov Dec 6, 2014
90f5ae5
Addressing bgreeven comment regarding labels sort, annotations
avulanov Dec 10, 2014
243e667
Create ParallelANN.scala
bgreeven Jul 3, 2014
96ba82a
Create GeneralizedSteepestDescendAlgorithm
bgreeven Jul 3, 2014
576ef79
Create TestParallelANN.scala
bgreeven Jul 3, 2014
c5cb54d
Create TestParallelANNgraphics.scala
bgreeven Jul 3, 2014
1af7f25
Update TestParallelANN.scala
bgreeven Jul 3, 2014
99f0581
Update TestParallelANN.scala
bgreeven Jul 3, 2014
b01fc3c
Update TestParallelANNgraphics.scala
bgreeven Jul 3, 2014
cae6dc2
Update GeneralizedSteepestDescendAlgorithm
bgreeven Jul 30, 2014
9eee6f1
Update ParallelANN.scala
bgreeven Jul 30, 2014
fec8691
Update GeneralizedSteepestDescendAlgorithm
bgreeven Jul 30, 2014
060ae3a
Update TestParallelANN.scala
bgreeven Jul 30, 2014
d1619c8
Update TestParallelANNgraphics.scala
bgreeven Jul 30, 2014
7c3a5b3
Rename GeneralizedSteepestDescendAlgorithm to GeneralizedSteepestDesc…
bgreeven Aug 1, 2014
fef4776
Update TestParallelANNgraphics.scala
bgreeven Aug 1, 2014
c086751
Update and rename GeneralizedSteepestDescendAlgorithm.scala to Genera…
bgreeven Aug 21, 2014
21d95d0
Update ParallelANN.scala
bgreeven Aug 21, 2014
d4764a4
Update TestParallelANN.scala
bgreeven Aug 21, 2014
4623f25
Update TestParallelANNgraphics.scala
bgreeven Aug 21, 2014
10242b7
Create mllib-ann.md
bgreeven Aug 22, 2014
402ad79
Update mllib-ann.md
bgreeven Aug 22, 2014
07218eb
Update mllib-ann.md
bgreeven Aug 22, 2014
f7cfa4e
Update mllib-ann.md
bgreeven Aug 22, 2014
d3211db
Update mllib-ann.md
bgreeven Aug 22, 2014
51ca78b
Update mllib-ann.md
bgreeven Aug 22, 2014
ceaf2f7
Update and rename GeneralizedSteepestDescentAlgorithm.scala to Genera…
bgreeven Sep 2, 2014
6f79c96
Update ParallelANN.scala
bgreeven Sep 2, 2014
2972747
Update TestParallelANN.scala
bgreeven Sep 2, 2014
6740981
ANN test suite: learning XOR function
avulanov Sep 5, 2014
c22c3dc
Removing dependency on GeneralizedModel and Algorithm
avulanov Sep 9, 2014
d320d76
Addressing reviewers comments: interface refactoring
avulanov Sep 9, 2014
181c29b
Apache header
avulanov Sep 9, 2014
7ac9a67
Update ArtificialNeuralNetwork.scala
bgreeven Sep 10, 2014
8e0dc8b
Update and rename TestParallelANN.scala to TestANN.scala
bgreeven Sep 10, 2014
0a3fca6
Delete TestParallelANNgraphics.scala
bgreeven Sep 10, 2014
50ca819
Update ArtificialNeuralNetwork.scala
bgreeven Sep 15, 2014
c2da9b0
Update ANNSuite.scala
bgreeven Sep 15, 2014
73ba0dc
minor style fixes
avulanov Sep 17, 2014
a024c6b
Forward propagation code sharing
avulanov Sep 17, 2014
95e5299
Update ArtificialNeuralNetwork.scala
bgreeven Sep 22, 2014
5a3531b
Update ANNSuite.scala
bgreeven Sep 22, 2014
85050ba
Delete TestANN.scala
bgreeven Sep 22, 2014
5f51305
Create ANNDemo.scala
bgreeven Sep 22, 2014
95ed2a2
Update mllib-ann.md
bgreeven Sep 22, 2014
4b83de4
Update mllib-ann.md
bgreeven Sep 22, 2014
7828327
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
84ac2e8
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
e2e94b2
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
a7fb749
Update ArtificialNeuralNetwork.scala
bgreeven Sep 23, 2014
3995be8
Update mllib-ann.md
bgreeven Sep 25, 2014
b44aec3
Update ANNDemo.scala
bgreeven Sep 25, 2014
6265bd6
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
325ffab
Update ANNSuite.scala
bgreeven Sep 25, 2014
099ff85
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
1c0aab4
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
5db2b60
Update ArtificialNeuralNetwork.scala
bgreeven Sep 25, 2014
b13019a
Minor style fixes
avulanov Sep 26, 2014
e2d4e92
Unit test parameter
avulanov Sep 26, 2014
fefe08e
Update ANNSuite.scala
bgreeven Sep 28, 2014
2fbbe23
Update ArtificialNeuralNetwork.scala
bgreeven Oct 28, 2014
57565ae
Update ANNSuite.scala
bgreeven Oct 28, 2014
bd74834
Minor stylefix, add additional function for customized initial weights
avulanov Oct 28, 2014
12fb903
Update mllib-ann.md
bgreeven Nov 3, 2014
a0d1da0
Update ArtificialNeuralNetwork.scala
bgreeven Nov 3, 2014
9b10666
Update mllib-ann.md
bgreeven Dec 10, 2014
3cf5f9b
Fixes after rebase
avulanov Dec 16, 2014
398e3dd
Matrix form of back-propagation based on avulanov/spark/tree/neuralne…
avulanov Dec 19, 2014
62b1d91
Fix of broken gradient test
avulanov Dec 19, 2014
3f93a2a
Roll/unroll ordering, weight by layer function
avulanov Dec 20, 2014
2fb67f6
Roll and cumulative update optimizations
avulanov Dec 23, 2014
799b277
Update ANNSuite.scala
loachli Dec 27, 2014
6166ad9
Batch ANN
avulanov Jan 10, 2015
5205fda
ANN Classifier batch
avulanov Jan 12, 2015
e660ee8
Divisor fix, train interfaces
avulanov Jan 14, 2015
d18e9b5
Test Context fix
avulanov Jan 23, 2015
5de5bad
Bias averaging fix
avulanov Feb 2, 2015
Create mllib-ann.md
Documentation for Artificial Neural Network (ANN)
bgreeven authored and avulanov committed Jan 23, 2015
commit ace988ebccd2d1b25b9f24bf917ee81853264395
179 changes: 179 additions & 0 deletions docs/mllib-ann.md
@@ -0,0 +1,179 @@
---
layout: global
title: Artificial Neural Networks - MLlib
displayTitle: <a href="mllib-guide.html">MLlib</a> - Artificial Neural Networks
---

* Table of contents
{:toc}

### Introduction

This document describes MLlib's Artificial Neural Network (ANN) implementation.

The implementation currently consists of the following files:

* 'ParallelANN.scala': implements the ANN
* 'GeneralizedSteepestDescentAlgorithm.scala': provides an abstract class and model as basis for 'ParallelANN'.

In addition, there is a demo/test available:

* 'TestParallelANN.scala': tests parallel ANNs for various functions
* 'TestParallelANNgraphics.scala': graphical output for 'TestParallelANN.scala'

### Architecture and Notation

The file ParallelANN.scala implements a three-layer ANN with the following architecture:


    +-------+
    |       |
    |  X_0  |
    |       |
    +-------+         +-------+
                      |       |
    +-------+         |  H_0  |         +-------+
    |       |         |       |         |       |
    |  X_1  |-        +-------+       ->|  O_0  |
    |       | \ Vij               /     |       |
    +-------+  -      +-------+  -      +-------+
                \     |       | / Wjk
        :           ->|  H_1  |-        +-------+
        :             |       |         |       |
        :             +-------+         |  O_1  |
        :                               |       |
        :                 :             +-------+
        :                 :
        :                 :                 :
        :                 :
        :                 :             +-------+
        :                 :             |       |
        :                 :             | O_K-1 |
        :                               |       |
        :             +-------+         +-------+
        :             |       |
        :             | H_J-1 |
                      |       |
    +-------+         +-------+
    |       |
    | X_I-1 |
    |       |
    +-------+

    +-------+         +--------+
    |       |         |        |
    |  -1   |         |   -1   |
    |       |         |        |
    +-------+         +--------+

    INPUT LAYER     HIDDEN LAYER      OUTPUT LAYER


The nodes '$X_0$' to '$X_{I-1}$' are the '$I$' input nodes. The nodes '$H_0$' to '$H_{J-1}$' are the '$J$' hidden nodes and the nodes '$O_0$' to '$O_{K-1}$' are the '$K$' output nodes. Between each input node '$X_i$' and hidden node '$H_j$' there is a weight '$V_{ij}$'. Likewise, between each hidden node '$H_j$' and each output node '$O_k$' is a weight '$W_{jk}$'.

The ANN also implements two bias units. These are nodes that always output the value -1. The bias units are in the input and in the hidden layer. They act as normal nodes, except that the bias unit in the hidden layer has no input. The bias units can also be denoted by '$X_I$' and '$H_J$'.

The value of a hidden node '$H_j$' is calculated as follows:

'$H_j = g ( \sum_{i=0}^{I} X_i*V_{i,j} )$'

Likewise, the value of the output node '$O_k$' is calculated as follows:

'$O_k = g( \sum_{j=0}^{J} H_j*W_{j,k} )$'

where '$g$' is the sigmoid function

'$g(t) = \frac{e^{\beta t} }{1+e^{\beta t}}$'

and '$\beta$' is a parameter that controls the steepness of the sigmoid.
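The two formulas above can be sketched in plain Scala, without Spark. This is a minimal illustration, not the code in 'ParallelANN.scala': the object name 'ForwardPassSketch', the helpers 'hiddenLayer' and 'outputLayer', and the nested-array weight layout are assumptions made for this example. The sigmoid is written as '$1/(1+e^{-\beta t})$', which is algebraically the same as the form given above.

```scala
object ForwardPassSketch {
  // Sigmoid g(t) = e^{beta t} / (1 + e^{beta t}), in the equivalent form 1 / (1 + e^{-beta t}).
  def g(t: Double, beta: Double): Double = 1.0 / (1.0 + math.exp(-beta * t))

  // v(i)(j) holds V_ij for i = 0..I (row I is the input bias unit) and j = 0..J-1.
  // Returns H_0..H_{J-1} followed by the hidden bias unit H_J = -1.
  def hiddenLayer(x: Array[Double], v: Array[Array[Double]], beta: Double): Array[Double] = {
    val xb = x :+ (-1.0) // append the input bias unit X_I = -1
    val numHidden = v(0).length
    val h = Array.tabulate(numHidden) { j =>
      g(xb.indices.map(i => xb(i) * v(i)(j)).sum, beta)
    }
    h :+ (-1.0)
  }

  // w(j)(k) holds W_jk for j = 0..J (row J is the hidden bias unit) and k = 0..K-1.
  // hb is the output of hiddenLayer, i.e. it already contains the hidden bias unit.
  def outputLayer(hb: Array[Double], w: Array[Array[Double]], beta: Double): Array[Double] = {
    val numOutput = w(0).length
    Array.tabulate(numOutput) { k =>
      g(hb.indices.map(j => hb(j) * w(j)(k)).sum, beta)
    }
  }
}
```

With all weights set to zero, every sum is zero, so each hidden and output node evaluates to '$g(0) = 0.5$' regardless of '$\beta$'.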

### Gradient descent

Currently, MLlib uses gradient descent for training. This means that the weights '$V_{ij}$' and '$W_{jk}$' are updated by subtracting a fraction of the gradient, with respect to '$V_{ij}$' and '$W_{jk}$', of the following error function:

'$E = \sum_{k=0}^{K-1} (O_k - Y_k )^2$'

where '$Y_k$' is the target output given inputs '$X_0$' ... '$X_{I-1}$'.

Applying the chain rule gives:

'$\frac{\partial E}{\partial W_{jk}} = 2 (O_k-Y_k) \cdot H_j \cdot g' \left( \sum_{m=0}^{J} W_{mk} H_m \right)$'

and

'$\frac{\partial E}{\partial V_{ij}} = 2 \sum_{k=0}^{K-1} \left( (O_k - Y_k) \cdot X_i \cdot W_{jk} \cdot g'\left( \sum_{n=0}^{J} W_{nk} H_n \right) g'\left( \sum_{m=0}^{I} V_{mj} X_m \right) \right)$'

The training step consists of the two operations

'$V_{ij} = V_{ij} - \epsilon \frac{\partial E}{\partial V_{ij}}$'

and

'$W_{jk} = W_{jk} - \epsilon \frac{\partial E}{\partial W_{jk}}$'

where '$\epsilon$' is the step size.
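The gradient formulas and the training step can be checked with a small, Spark-free sketch for a single training pair. All names here ('GradientStepSketch', 'Pass', 'gradients', 'step') and the nested-array weight layout are assumptions for illustration and are not taken from the implementation. The sketch uses the standard identity '$g'(t) = \beta \, g(t) (1 - g(t))$' for the sigmoid defined earlier.

```scala
object GradientStepSketch {
  type Matrix = Array[Array[Double]]

  def g(t: Double, beta: Double): Double = 1.0 / (1.0 + math.exp(-beta * t))

  // Derivative of the sigmoid: g'(t) = beta * g(t) * (1 - g(t)).
  def gPrime(t: Double, beta: Double): Double = {
    val s = g(t, beta)
    beta * s * (1.0 - s)
  }

  // Intermediate values of one forward pass (hypothetical helper type).
  final case class Pass(xb: Array[Double], netH: Array[Double], hb: Array[Double],
                        netO: Array[Double], o: Array[Double])

  // v(i)(j) = V_ij with i = 0..I, w(j)(k) = W_jk with j = 0..J; bias units output -1.
  def forward(x: Array[Double], v: Matrix, w: Matrix, beta: Double): Pass = {
    val xb = x :+ (-1.0)
    val netH = Array.tabulate(v(0).length)(j => xb.indices.map(i => xb(i) * v(i)(j)).sum)
    val hb = netH.map(g(_, beta)) :+ (-1.0)
    val netO = Array.tabulate(w(0).length)(k => hb.indices.map(j => hb(j) * w(j)(k)).sum)
    Pass(xb, netH, hb, netO, netO.map(g(_, beta)))
  }

  // E = sum_k (O_k - Y_k)^2
  def error(x: Array[Double], y: Array[Double], v: Matrix, w: Matrix, beta: Double): Double = {
    val o = forward(x, v, w, beta).o
    o.indices.map(k => (o(k) - y(k)) * (o(k) - y(k))).sum
  }

  // Returns (dE/dV, dE/dW) following the two formulas above.
  def gradients(x: Array[Double], y: Array[Double], v: Matrix, w: Matrix,
                beta: Double): (Matrix, Matrix) = {
    val p = forward(x, v, w, beta)
    // deltaO(k) = 2 (O_k - Y_k) g'(sum_m W_mk H_m)
    val deltaO = Array.tabulate(p.o.length)(k => 2.0 * (p.o(k) - y(k)) * gPrime(p.netO(k), beta))
    val gradW = Array.tabulate(p.hb.length, p.o.length)((j, k) => deltaO(k) * p.hb(j))
    val gradV = Array.tabulate(p.xb.length, p.netH.length) { (i, j) =>
      val back = deltaO.indices.map(k => deltaO(k) * w(j)(k)).sum
      back * gPrime(p.netH(j), beta) * p.xb(i)
    }
    (gradV, gradW)
  }

  // One training step: V <- V - epsilon dE/dV, W <- W - epsilon dE/dW.
  def step(x: Array[Double], y: Array[Double], v: Matrix, w: Matrix,
           beta: Double, epsilon: Double): (Matrix, Matrix) = {
    val (gradV, gradW) = gradients(x, y, v, w, beta)
    val newV = Array.tabulate(v.length, v(0).length)((i, j) => v(i)(j) - epsilon * gradV(i)(j))
    val newW = Array.tabulate(w.length, w(0).length)((j, k) => w(j)(k) - epsilon * gradW(j)(k))
    (newV, newW)
  }
}
```

A convenient sanity check for any such implementation is to compare one entry of the analytic gradient with a central finite difference of 'error'; the two should agree to several decimal places.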

### Implementation Details

## The 'ParallelANN' class

The 'ParallelANN' class is the main class of the ANN. This class uses a trait 'ANN', which includes functions for calculating the hidden layer ('computeHidden') and the output ('computeValues'). The output of 'computeHidden' includes the bias node in the hidden layer, so the hidden bias node does not need to be handled separately when the output is computed.

The 'ParallelANN' class has the following constructors:

* 'ParallelANN( stepSize, numIterations, miniBatchFraction, noInput, noHidden, noOutput, beta )'
* 'ParallelANN()': assumes 'stepSize'=1.0, 'numIterations'=100, 'miniBatchFraction'=1.0, 'noInput'=1, 'noHidden'=5, 'noOutput'=1, 'beta'=1.0.
* 'ParallelANN( noHidden )': as 'ParallelANN()', but allows specification of 'noHidden'
* 'ParallelANN( noInput, noHidden )': as 'ParallelANN()', but allows specification of 'noInput' and 'noHidden'
* 'ParallelANN( noInput, noHidden, noOutput )': as 'ParallelANN()', but allows specification of 'noInput', 'noHidden' and 'noOutput'

The number of input nodes '$I$' is stored in the variable 'noInput', the number of hidden nodes '$J$' is stored in 'noHidden' and the number of output nodes '$K$' is stored in 'noOutput'. 'beta' contains the value of '$\beta$' for the sigmoid function.

The parameters 'stepSize', 'numIterations' and 'miniBatchFraction' are used by the stochastic gradient descent optimizer.

In addition, the class has a single vector 'weights' corresponding to '$V_{ij}$' and '$W_{jk}$'. The mapping of '$V_{ij}$' and '$W_{jk}$' into 'weights' is as follows:

'$V_{ij}$' -> 'weights[ i + j*(noInput+1) ]'

'$W_{jk}$' -> 'weights[ (noInput+1)*noHidden + j + k*(noHidden+1) ]'
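The index mapping can be written as two small functions. The names 'vIndex', 'wIndex' and 'numWeights' are hypothetical and chosen for this sketch only; the arithmetic is the mapping stated above, with '$i$' running from 0 to 'noInput' and '$j$' (for '$W_{jk}$') from 0 to 'noHidden', so that the bias units are included.

```scala
object WeightIndexSketch {
  // Position of V_ij in the flat weights vector (i = 0..noInput, j = 0..noHidden-1).
  def vIndex(i: Int, j: Int, noInput: Int): Int =
    i + j * (noInput + 1)

  // Position of W_jk in the flat weights vector (j = 0..noHidden, k = 0..noOutput-1).
  // The W block starts right after the (noInput+1)*noHidden entries used by V.
  def wIndex(j: Int, k: Int, noInput: Int, noHidden: Int): Int =
    (noInput + 1) * noHidden + j + k * (noHidden + 1)

  // Total length of the weights vector implied by the two blocks.
  def numWeights(noInput: Int, noHidden: Int, noOutput: Int): Int =
    (noInput + 1) * noHidden + (noHidden + 1) * noOutput
}
```

For example, with 'noInput'=2, 'noHidden'=3 and 'noOutput'=2, the '$V$' block occupies indices 0 to 8 and the '$W$' block occupies indices 9 to 16, for 17 weights in total.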

The training function carries the name 'train'. It can take various inputs:

* 'def train( rdd: RDD[(Vector,Vector)] )': starts a completely new training session and generates a new ANN.
* 'def train( rdd: RDD[(Vector,Vector)], model: ParallelANNModel )': continues a training session with an existing ANN.
* 'def train( rdd: RDD[(Vector,Vector)], weights: Vector )': starts a training session using the initial weights given by 'weights'.

The input of the training function is an RDD of (input, output) training pairs, each input and output being stored as a 'Vector'. The training function returns an object of class 'ParallelANNModel', as described below.

## The 'ParallelANNModel' class

All information needed for the ANN is stored in the 'ParallelANNModel' class. The training function 'train' from 'ParallelANN' returns an object of the 'ParallelANNModel' class.

The information in 'ParallelANNModel' consists of the weights, the number of input, hidden and output nodes, as well as two functions 'predictPoint' and 'predictPointV'.

The 'predictPoint' function is used to calculate a single output value as a 'Double'. If the output of the ANN actually is a vector, it returns just the first element of the vector, that is '$O_{0}$'. The output of the 'predictPointV' is of type 'Vector', and returns all '$K$' output values.
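The relationship between the two prediction functions can be sketched as follows. This is a Spark-free illustration under stated assumptions: the case class 'Model' and the functions 'predictVector' and 'predictFirst' are hypothetical stand-ins for 'ParallelANNModel', 'predictPointV' and 'predictPoint', and plain arrays replace 'Vector'. The flat 'weights' array is read with the index mapping given earlier in this document.

```scala
object PredictSketch {
  // Hypothetical stand-in for the model: flat weights plus layer sizes and beta.
  final case class Model(weights: Array[Double], noInput: Int, noHidden: Int,
                         noOutput: Int, beta: Double)

  private def g(t: Double, beta: Double): Double = 1.0 / (1.0 + math.exp(-beta * t))

  // Analogue of 'predictPointV': returns all K output values.
  def predictVector(m: Model, x: Array[Double]): Array[Double] = {
    val xb = x :+ (-1.0) // input bias unit
    val hidden = Array.tabulate(m.noHidden) { j =>
      g((0 to m.noInput).map(i => xb(i) * m.weights(i + j * (m.noInput + 1))).sum, m.beta)
    }
    val hb = hidden :+ (-1.0) // hidden bias unit
    val offset = (m.noInput + 1) * m.noHidden // start of the W block
    Array.tabulate(m.noOutput) { k =>
      g((0 to m.noHidden).map(j => hb(j) * m.weights(offset + j + k * (m.noHidden + 1))).sum, m.beta)
    }
  }

  // Analogue of 'predictPoint': only the first output value, O_0.
  def predictFirst(m: Model, x: Array[Double]): Double = predictVector(m, x)(0)
}
```

Defining the scalar variant in terms of the vector variant keeps the two consistent by construction, which matches the behaviour described above.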

## The 'GeneralizedSteepestDescentAlgorithm' class

The 'GeneralizedSteepestDescentAlgorithm' class is based on the 'GeneralizedLinearAlgorithm' class. The main difference is that 'GeneralizedSteepestDescentAlgorithm' is based on output values of type 'Vector', whereas 'GeneralizedLinearAlgorithm' is based on output values of type 'Double'. The new class was needed because an ANN ideally outputs multiple values, hence a 'Vector'.

## Training

Many different strategies exist for training an ANN, so it is important that the optimising functions in MLlib's ANN are interchangeable. The 'ParallelANN' class has a variable 'optimizer', which is currently set to a 'GradientDescent' optimising class. The 'GradientDescent' optimising class implements a stochastic gradient descent method, and is also used by other algorithms in Spark. It is expected that other optimising functions will be defined for Spark, and these can be stored in the 'optimizer' variable.
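The idea of an interchangeable optimiser can be illustrated with a small, Spark-free sketch. Everything in it is hypothetical: the trait 'WeightOptimizer', the class 'PlainGradientDescent' and the class 'Trainer' are invented for this example and do not mirror Spark's optimiser interface. For simplicity the example uses plain (full-batch) gradient descent rather than the stochastic variant mentioned above.

```scala
object OptimizerSketch {
  // Hypothetical interface: anything that maps a gradient function and
  // initial weights to improved weights can be plugged in.
  trait WeightOptimizer {
    def optimize(gradient: Array[Double] => Array[Double], initial: Array[Double]): Array[Double]
  }

  // Plain gradient descent: repeat w <- w - stepSize * gradient(w).
  final class PlainGradientDescent(stepSize: Double, numIterations: Int) extends WeightOptimizer {
    def optimize(gradient: Array[Double] => Array[Double], initial: Array[Double]): Array[Double] =
      (0 until numIterations).foldLeft(initial) { (w, _) =>
        val grad = gradient(w)
        Array.tabulate(w.length)(i => w(i) - stepSize * grad(i))
      }
  }

  // The trainer only depends on the interface, so the optimiser can be replaced
  // by assigning a different implementation to 'optimizer'.
  final class Trainer(var optimizer: WeightOptimizer) {
    def fit(gradient: Array[Double] => Array[Double], initial: Array[Double]): Array[Double] =
      optimizer.optimize(gradient, initial)
  }
}
```

Because 'Trainer' never names a concrete optimiser, a different strategy could be substituted without touching the training code.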

### Demo/test

Usage of MLlib's ANN is demonstrated through the 'TestParallelANN' demo program. The program generates three functions:

* f2d: x -> y
* f3d: (x,y) -> z
* f4d: t -> (x,y,z)

When the program is passed the argument 'graph', it shows a graphical representation of the target function and the latest values.

### Conclusion

The 'ParallelANN' class implements an Artificial Neural Network (ANN), using the stochastic gradient descent method. It takes as input an RDD of input/output values of type 'Vector', and returns an object of type 'ParallelANNModel' containing the parameters of the trained ANN. The 'ParallelANNModel' object can also be used to calculate results after training.

The training of an ANN can be interrupted and later continued, allowing intermediate inspection of the results.

A demo program for ANN is provided.