Neural Network Part 2: Perceptrons
July 22, 2014
I started working through the second chapter of McCaffrey’s book Neural Networks Using C# Succinctly to see if I could write the examples using F#.
McCaffrey’s code is tough to read, though, because of its emphasis on loops and global mutable variables. I read through his description and this is how (I think) the Perceptron should be constructed.
The inputs are a series of independent variables (in this case age and income) and the output is a single dependent variable (in this case party affiliation). The values have been encoded and normalized like in this post here.
An example of the input (from page 31 of his book) is:
Or in a more abstract manner:
In terms of data structures, each individual input (a row) is placed into an array of floats, and the output is a single float.
I call this single set of inputs an “observation” (my word, not McCaffrey’s).
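As a minimal sketch of that shape: the type name Observation and the numbers below are illustrative assumptions, not McCaffrey's data, and the sketch uses an F# list because the code later in this post does.

```fsharp
// One observation: the encoded, normalized x values plus the expected y.
// The values are made-up placeholders, not taken from the book.
type Observation = { xValues: float list; yExpected: float }

let sample = { xValues = [ 1.5; 2.0 ]; yExpected = -1.0 }

printfn "inputs = %A, expected = %.1f" sample.xValues sample.yExpected
```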
Looking at McCaffrey’s example for a perceptron Input-Output,
not all of the variables you need are included. Here is what you need:
Where A0 and B0 are the same as X0 and X1 respectively in his diagram. Also, McCaffrey uses the word “Perceptron” to mean two different concepts: the entire system as a whole and the individual calculation for a given list of X and Bias. I am a big believer in a ubiquitous domain language, so I am calling the individual calculation a neuron.
Once you run these values through the neuron for the 1st observation, you might have to alter the Weights and Bias based on the (Y)result. Therefore, the data structure coming out of the Neuron is
These values are fed into the adjustment function to alter the weights and bias with the output as
I am calling this process of taking a single observation, the xWeights, and the bias and turning them into a new set of weights and bias a “cycle” (my word, not McCaffrey’s).
The output of a cycle is then fed with the next observation and the cycle repeats for as many observations as there are fed into the system.
I am calling the process of running a cycle for each observation in the input dataset a “rotation” (my word, not McCaffrey’s). The perceptron runs rotations some number of times to train itself.
Finally, the Perceptron takes a new set of observations where the Y is not known and runs a Rotation once to predict what the Y will be.
So with that mental image in place, the coding became much easier. Basically, there was a 1 to 1 correspondence between the F# functions and the steps laid out above. I started with an individual cycle:
    // Input to one cycle: an observation plus the current weights and bias
    type cycleInput = {xValues:float list;yExpected:float;mutable weights:float list;mutable bias:float;alpha:float}

    // Weighted sum of the inputs plus the bias
    let runNeuron (input:cycleInput) =
        let valuesAndWeights = List.zip input.xValues input.weights
        let output =
            valuesAndWeights
            |> List.map(fun (xValue, xWeight) -> xValue*xWeight)
            |> List.sum
        output + input.bias

    // Step activation: -1.0 for a negative sum, 1.0 otherwise
    let runActivation input =
        if input < 0.0 then -1.0 else 1.0
I used record types all over the place in this code just so I could keep things straight in my head. McCaffrey uses ambiguously-named arrays and global variables. Although this makes my code a bit more wordy (esp for functional people), I think the increased readability is worth the trade-off.
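To make the neuron and activation steps concrete, here is a small worked sketch. The function names weightedSum and activate and all of the numbers are assumptions chosen so the arithmetic is easy to follow by hand; they are not from the book.

```fsharp
// Weighted sum of the inputs plus the bias (the neuron step).
let weightedSum (xValues: float list) (weights: float list) (bias: float) =
    List.zip xValues weights
    |> List.sumBy (fun (x, w) -> x * w)
    |> (+) bias

// Step activation: negative sums map to -1.0, everything else to 1.0.
let activate (total: float) = if total < 0.0 then -1.0 else 1.0

// Made-up values: 1.5 * 0.5 + 2.0 * (-1.0) + 0.25 = -1.0
let total = weightedSum [ 1.5; 2.0 ] [ 0.5; -1.0 ] 0.25
let y = activate total

printfn "total = %f, y = %.1f" total y
```

Because the sum is negative, the activation step reports -1.0 for this observation.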
In any event, with the Neuron and Activation calc out of the way, I created the functions that adjust the weights and bias:
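    // delta = yActual - yExpected. Subtracting alpha * delta * xValue moves the neuron's
    // output toward yExpected; the cases spell out which way the weight goes.
    let calculateWeightAdjustment(xValue, xWeight, alpha, delta) =
        match delta > 0.0, xValue >= 0.0 with
        | true,true -> xWeight - (alpha * delta * xValue)    // output too high, input >= 0: weight goes down
        | true,false -> xWeight - (alpha * delta * xValue)   // output too high, input < 0: weight goes up
        | false,true -> xWeight - (alpha * delta * xValue)   // output too low, input >= 0: weight goes up
        | false,false -> xWeight - (alpha * delta * xValue)  // output too low, input < 0: weight goes down

    let calculateBiasAdjustment(bias, alpha, delta) =
        match delta > 0.0 with
        | true -> bias - (alpha * delta)    // output too high: bias goes down
        | false -> bias - (alpha * delta)   // output too low: delta is negative, so bias goes up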
This code is significantly different from the for loop with nested ifs that McCaffrey uses.
I maintain that using this kind of pattern matching makes the intention much easier to comprehend. I also split out the adjustment of the weights and the adjustment of the bias into individual functions.
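A worked example of one adjustment may help here. This sketch assumes the textbook perceptron update, where delta = yActual - yExpected and each weight moves by alpha * delta * x in the direction that shrinks the error. The names adjustWeight and adjustBias and the numbers are hypothetical.

```fsharp
// Textbook perceptron update, with delta = yActual - yExpected.
let adjustWeight (alpha: float) (delta: float) (x: float) (w: float) =
    w - alpha * delta * x

let adjustBias (alpha: float) (delta: float) (bias: float) =
    bias - alpha * delta

// Made-up case: the neuron answered -1.0 but 1.0 was expected, so delta = -2.0.
let alpha = 0.25
let delta = -1.0 - 1.0
let newWeights = List.map2 (adjustWeight alpha delta) [ 1.0; -2.0 ] [ 0.5; 0.5 ]
let newBias = adjustBias alpha delta 0.0

printfn "weights = %A, bias = %f" newWeights newBias
```

With the old weights the weighted sum was 1.0 * 0.5 + (-2.0) * 0.5 + 0.0 = -0.5, which activates to -1.0. After the adjustment the weight on the positive input has gone up and the weight on the negative input has gone down, so the sum for the same observation is positive.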
With these functions ready, I created an input and output record type and implemented the adjustment function
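    // Record types reconstructed from how the fields are used in this post;
    // the name adjustmentOutput is an assumption.
    type adjustmentInput = {xValues:float list;weights:float list;yExpected:float;bias:float;alpha:float;yActual:float}
    type adjustmentOutput = {weights:float list;bias:float;yActual:float}

    let runAdjustment (input:adjustmentInput) =
        match input.yExpected = input.yActual with
        | true -> {weights=input.weights;bias=input.bias;yActual=input.yActual}
        | false ->
            let delta = input.yActual - input.yExpected
            // Pair each x value with its weight, in the order the lambda below expects
            let valuesAndWeights = List.zip input.xValues input.weights
            let weights' = valuesAndWeights |> List.map(fun (xValue, xWeight) -> calculateWeightAdjustment(xValue,xWeight,input.alpha,delta))
            let bias' = calculateBiasAdjustment(input.bias,input.alpha,delta)
            {weights=weights';bias=bias';yActual=input.yActual}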
There is no corresponding method in McCaffrey’s code. Instead, he does some Array.copy calls and mutates the global variables in the Update method. I am not a fan of side-effect programming, so I created a function that explicitly does the modification.
And to wrap up the individual cycle:
    let runCycle (cycleInput:cycleInput) =
        let neuronResult = runNeuron(cycleInput)
        let activationResult = runActivation(neuronResult)
        // The annotation pins the record type, since several records share these field names
        let adjustmentInput : adjustmentInput =
            {xValues=cycleInput.xValues;weights=cycleInput.weights;yExpected=cycleInput.yExpected;
             bias=cycleInput.bias;alpha=cycleInput.alpha;yActual=activationResult}
        runAdjustment(adjustmentInput)
Up next is to run the cycle for each of the observations (called a rotation)
    type observation = {xValues:float list;yExpected:float}
    type rotationInput = {observations: observation list;mutable weights:float list;mutable bias:float;alpha:float}
    type trainingRotationOutput = {weights:float list; bias:float}
    type predictionRotationOutput = {observation: observation;yActual:float}

    let runTrainingRotation(rotationInput: rotationInput)=
        // Length - 1 because an F# for loop includes its upper bound
        for i=0 to rotationInput.observations.Length - 1 do
            let observation = rotationInput.observations.[i]
            let neuronInput =
                {cycleInput.xValues=observation.xValues;cycleInput.yExpected=observation.yExpected;cycleInput.weights=rotationInput.weights;
                 cycleInput.bias=rotationInput.bias;cycleInput.alpha=rotationInput.alpha}
            let cycleOutput = runCycle(neuronInput)
            // Carry the adjusted weights and bias into the next cycle
            rotationInput.weights <- cycleOutput.weights
            rotationInput.bias <- cycleOutput.bias
        {weights=rotationInput.weights; bias=rotationInput.bias}
Again, note the liberal use of records to keep the inputs and outputs clear. I also created a prediction rotation that is designed to be run only once that does not alter the weights and bias.
    let runPredictionRotation(rotationInput: rotationInput)=
        let output = new System.Collections.Generic.List<predictionRotationOutput>()
        // Length - 1 because an F# for loop includes its upper bound
        for i=0 to rotationInput.observations.Length - 1 do
            let observation = rotationInput.observations.[i]
            let neuronInput =
                {cycleInput.xValues=observation.xValues;cycleInput.yExpected=observation.yExpected;cycleInput.weights=rotationInput.weights;
                 cycleInput.bias=rotationInput.bias;cycleInput.alpha=rotationInput.alpha}
            let cycleOutput = runCycle(neuronInput)
            // Only yActual is kept; the adjusted weights and bias are discarded
            let predictionRotationOutput = {observation=observation;yActual=cycleOutput.yActual}
            output.Add(predictionRotationOutput)
        output
With the rotations done, the last step was to create the Perceptron to train and then predict:
    type perceptronInput = {observations: observation list;weights:float list;bias:float}
    type perceptronOutput = {weights:float list; bias:float}

    // Every weight starts at the same small value between -0.01 and 0.01
    let initializeWeights(xValues, randomSeedValue) =
        let lo = -0.01
        let hi = 0.01
        let xWeight = (hi-lo) * randomSeedValue + lo
        xValues |> List.map(fun _ -> xWeight)

    let initializeBias(randomSeedValue) =
        let lo = -0.01
        let hi = 0.01
        (hi-lo) * randomSeedValue + lo

    let runTraining(perceptronInput: perceptronInput, maxEpoches:int) =
        let random = System.Random()
        let alpha = 0.001
        let baseObservation = perceptronInput.observations.[0]
        let mutable weights = initializeWeights(baseObservation.xValues,random.NextDouble())
        let mutable bias = initializeBias(random.NextDouble())
        // 1 to maxEpoches runs exactly maxEpoches rotations
        for i=1 to maxEpoches do
            // The annotation pins the record type, since perceptronInput shares these field names
            let rotationInput : rotationInput = {observations=perceptronInput.observations;weights=weights;bias=bias;alpha=alpha}
            let rotationOutput = runTrainingRotation(rotationInput)
            weights <- rotationOutput.weights
            bias <- rotationOutput.bias
        {weights=weights;bias=bias}

    let runPrediction(perceptronInput: perceptronInput, weights: float list, bias: float) =
        // alpha is required by the record but has no effect on the predictions
        let alpha = 0.001
        let rotationInput : rotationInput = {observations=perceptronInput.observations;weights=weights;bias=bias;alpha=alpha}
        runPredictionRotation(rotationInput)
Before I go too much further, I have a big code smell: I am iterating and using the mutable keyword. I am not sure how to take the result of a function that is applied to the 1st element in a sequence and then feed it into the call for the second element. I need to do that with the weights and bias data structures: each time they are used in an expression, they need to change and feed into the next expression. I think the answer is List.reduce, so I am going to pick this up after looking at that in more detail. I also need to implement the shuffle method so that the cycles are not called in the same order across rotations…
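One possible way to remove the mutable state, offered as a sketch rather than a settled design, is List.fold. List.reduce needs the accumulator to have the same type as the list elements, whereas List.fold threads a separate state value (here the weights and bias) through the list, which is the shape described above. The names Model, predict, cycle, rotation, shuffle, and train are hypothetical, the update is the textbook perceptron rule, and the shuffle is a Fisher-Yates shuffle on a copy of the list.

```fsharp
type Observation = { xValues: float list; yExpected: float }
type Model = { weights: float list; bias: float }

// Neuron plus step activation for one set of x values.
let predict (model: Model) (xValues: float list) =
    let total = List.sumBy (fun (x, w) -> x * w) (List.zip xValues model.weights) + model.bias
    if total < 0.0 then -1.0 else 1.0

// One cycle: takes the current model and one observation, returns the next model.
// When the prediction is already right, delta is 0.0 and the model is unchanged.
let cycle (alpha: float) (model: Model) (obs: Observation) =
    let delta = predict model obs.xValues - obs.yExpected
    { weights = List.map2 (fun x w -> w - alpha * delta * x) obs.xValues model.weights
      bias = model.bias - alpha * delta }

// One rotation: List.fold feeds each cycle's output into the next cycle.
let rotation alpha model observations = List.fold (cycle alpha) model observations

// Fisher-Yates shuffle on a copy, so each rotation can visit observations in a new order.
let shuffle (rng: System.Random) (items: 'a list) =
    let arr = List.toArray items
    for i in arr.Length - 1 .. -1 .. 1 do
        let j = rng.Next(i + 1)
        let tmp = arr.[i]
        arr.[i] <- arr.[j]
        arr.[j] <- tmp
    List.ofArray arr

// Training: fold over the epoch numbers, shuffling before each rotation.
let train (rng: System.Random) alpha epochs initial observations =
    [ 1 .. epochs ]
    |> List.fold (fun model _ -> rotation alpha model (shuffle rng observations)) initial

// Usage with two made-up, linearly separable observations.
let data =
    [ { xValues = [ 1.0; 1.0 ]; yExpected = 1.0 }
      { xValues = [ -1.0; -1.0 ]; yExpected = -1.0 } ]

let trained = train (System.Random(0)) 0.1 10 { weights = [ 0.0; 0.0 ]; bias = 0.0 } data
printfn "trained model = %A" trained
```

Nothing here is marked mutable except the local array inside shuffle, and that array never escapes the function.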