1. Introduction to Neural Networks and this library
Experimental Neural Networks (eNN) is a C++ system that trains and runs Feed Forward, Back propagation Neural Networks. It is for experimenting with neural networks and is not “experimental” itself in any way. It was originally written for the Be Operating System but ported to Windows (MFC) when my BeOS partition got nuked by Bill Gates (ironic, I know!).
The current version is written using the C++ standard library to get away from the quirks of any one package. It is single-threaded and has so far only been tested on the Raspberry Pi.
Neural Networks are a simulation of the way the brain works. They take a set of input variables, pass them to a set of input “nodes” (collectively known as the “input layer”) and then through a set of links to another set of nodes (known as “hidden nodes” in a “hidden layer”) where a new set of intermediate values is calculated. These new values are then passed through another set of links to a final set of nodes, the “output nodes” in the “output layer”, where the final values are calculated.
Each node is linked to every node in the next layer. Each link multiplies the value of its source node by a predefined value (a “weight”) before passing it on. To calculate a new node value in the hidden and output layers, the incoming values from the links are summed and the sum is multiplied by another predefined value, a “bias”. This “biased sum” is then passed through a “transition function” to give the new node value. This is a “Feed Forward” neural network (also known as a “Multi-Layer Perceptron”).
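To make the layer calculation concrete, here is a minimal sketch of the computation just described: weighted sum, biased, then the Log Sigmoid transition function. The names `logSigmoid` and `feedForwardLayer` are illustrative only and not part of the eNN API; the multiplicative bias follows the description above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Log-sigmoid transition function: squashes any real value into (0, 1).
double logSigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Compute one layer's node values from the previous layer's values.
// weights[j][i] is the weight on the link from node i to node j;
// biases[j] is the bias multiplied into node j's summed input,
// as described in the text above.
std::vector<double> feedForwardLayer(const std::vector<double>& prev,
                                     const std::vector<std::vector<double>>& weights,
                                     const std::vector<double>& biases)
{
    std::vector<double> next(weights.size());
    for (std::size_t j = 0; j < weights.size(); ++j) {
        double sum = 0.0;
        for (std::size_t i = 0; i < prev.size(); ++i)
            sum += weights[j][i] * prev[i];      // weighted incoming values
        next[j] = logSigmoid(biases[j] * sum);   // biased sum -> transition function
    }
    return next;
}
```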
The weights and biases are determined (or “trained”) by a general training algorithm called “Back propagation”. The untrained network starts with a random set of weights and biases, and a set of known examples is passed to the training algorithm. Each example contains the input values and the desired output. The weights and biases are adjusted so that the generated output better matches the desired output: the network is “nudged” towards a better set of weights and biases in small steps. The size of these steps is influenced by a designated value called the “learning rate parameter”.
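As an illustration of a single “nudge”, here is the textbook Back propagation update for one output-layer weight under a log-sigmoid transition function and a squared-error measure. This is the standard rule from the literature, not a transcript of eNN's internals.

```cpp
// One training "nudge" of a single output-node weight.
// The learning rate parameter scales the size of the step.
double updateWeight(double weight, double input,
                    double output, double target, double learningRate)
{
    // The derivative of the log-sigmoid at this output is output * (1 - output);
    // delta measures how much (and in which direction) this node missed.
    double delta = (target - output) * output * (1.0 - output);
    return weight + learningRate * delta * input;   // small step towards the target
}
```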
For more information on all types of neural networks see: [Hagan, Demuth and Beale, “Neural Network Design” 1996] (if you can get it). For more information on learning and network design see: [Reed and Marks, “Neural Smithing” 1999]. Many other books cover this level of neural networks.

== This Version's Limitations ==
- Three layers (input, hidden and output)
- Log Sigmoid is the only available transition function
- No momentum term in training
- One constant learning rate for all layers
- Unary Bias Nodes are available for the input layer only
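A unary bias node is, in the usual sense of the term, an extra input node whose value is fixed at 1, so that every hidden node can learn a constant offset through the weight on its link from that node. A minimal sketch under that assumption; the function name is illustrative, not eNN API:

```cpp
#include <vector>

// Append a unary bias node to an input row: a node whose value is always 1.
// eNN offers this for the input layer only.
std::vector<double> addUnaryBias(std::vector<double> inputs)
{
    inputs.push_back(1.0);  // constant 1 reaches every hidden node through its own weight
    return inputs;
}
```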
The procedure for network training is as follows:
1. Decide on what you want to input to the network. The input variables must be orthogonal (i.e. they must vary independently[1]). The randomisation in this version expects the input to be binary (0, 1), but the training algorithm will accept any value that is not too far away from 0 (approximately -3 to 3). If the input values are too big the network will not move toward a solution.
2. Decide how you want the output to be presented. This is a combination of how many output variables there are (i.e. how many output nodes) and what the value of each node represents. This could be binary (0, 1), bipolar (-1, 1) or uniform (-a..a) (i.e. a value between -a and a).
3. Decide how many hidden nodes you think are needed. There is no hard and fast rule for the number of hidden nodes. Start by using the same number as the wider of the input layer or output layer. Try increasing the number if your network does not seem to be learning your data well.
4. Collect a large set of data that represents all situations that you want your network to classify.
5. For each row in this data set determine what output you want from your network.
6. Decide how close you want your network's output to be to your desired output (e.g. “I'll be happy when the network gives me a value less than 0.1 from the desired output”). You may need to review this decision when you see the performance of your network.
7. Massage the input and desired output into the Training File Format as described in the Wiki document of that name.
8. [[#definingANetwork|Define]] your network (i.e. how many input, hidden and output nodes).
9. Apply your training set(s) to the network.
10. Test the network against an unseen data set to see if the network is general enough. A network is general when it matches an unseen data set as well as its training set (one way to check this is sketched after this list). If the unseen data set tests significantly worse than the training set then the network is too specific. If it is too specific:
    - Was the training set too large? (If so, try training the network in smaller chunks.)
    - Did the training set represent the data set well? (Run some statistical analysis over the data set to check this and, if not, find more representative data.)
    - Regress to an earlier version of the network and see if it performs better.
    - Try a different network topology.
    - Try a different number of hidden nodes.
    - Add a unary bias node to the input layer.
    - Try a different set of input variables.
11. [[#randomiseNetwork|Randomise]] your network and, taking into account what you have learned so far, start again from step 9 above.
12. [[#loadSave|Save]] your network.
13. Decide if the network performs as per the criteria in step 6 above. If not, return to step 9 (having generated another training set if possible).
14. [[#runInputFile|Run]] your network with real, unseen data.
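For step 10, one way to quantify “matches an unseen data set as well as its training set” is to compare an average error over both sets. A sketch only: the `runNetwork` callback stands in for however you run your network and is not an eNN function, and the 10% tolerance is an arbitrary illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mean absolute error between generated outputs and desired outputs
// over a whole data set.
double meanAbsError(const std::vector<std::vector<double>>& inputs,
                    const std::vector<std::vector<double>>& desired,
                    std::vector<double> (*runNetwork)(const std::vector<double>&))
{
    double total = 0.0;
    std::size_t count = 0;
    for (std::size_t r = 0; r < inputs.size(); ++r) {
        std::vector<double> out = runNetwork(inputs[r]);
        for (std::size_t k = 0; k < out.size(); ++k) {
            total += std::abs(out[k] - desired[r][k]);
            ++count;
        }
    }
    return count ? total / count : 0.0;
}

// The network is general enough when the unseen set scores roughly as well
// as the training set, e.g.:
//   bool general = meanAbsError(unseenIn, unseenOut, run)
//                    <= 1.1 * meanAbsError(trainIn, trainOut, run);
```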
NOTE: This is a very brief description of a very complex process. There are numerous books that cover this topic (e.g. [Reed, Marks 99]).
- ^ If you have two factors that are important but are related, combine them into a single variable. E.g. if the number of hours of sunshine and the amount of rain are both needed, numerically combine them into a single “weather” variable, with 1 representing completely sunny and -1 representing soggy wet.
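As a hypothetical illustration of the footnote, one way to fold two related measurements into a single roughly bipolar “weather” input. The scaling constants (12 hours, 20 mm) are assumptions to be tuned to your own data, not values from eNN.

```cpp
#include <algorithm>

// Combine two related factors into one input variable:
// 1 = completely sunny, -1 = soggy wet.
double weatherVariable(double hoursOfSun, double mmOfRain)
{
    double sun  = std::min(hoursOfSun / 12.0, 1.0);  // 0..1, 12 h taken as fully sunny
    double rain = std::min(mmOfRain  / 20.0, 1.0);   // 0..1, 20 mm taken as soggy wet
    return sun - rain;                               // roughly -1 .. 1
}
```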