CS 5541: Artificial Intelligence, Fall 2003
One of the main problems when using neural networks (a.k.a. connectionist models) is determining the weights for the network. In this programming assignment, you will use the delta rule (as covered in class) to train a neural network to solve various problems. You must write your solution in Scheme.
Your networks will have a single input layer, a single weight layer, and a single output layer, and will be fully connected with weights. Connections will only be feedforward (no recurrent connections).
You will write a function create-neural-net that accepts two parameters: the number of input units and the number of output units of the neural network. The create-neural-net function will be written in an object-oriented style that is described below. The neural net returned by this function will actually be a function (procedure) object to which various methods can be applied. You will need to implement at least two methods for neural net objects: train and test. Below are two examples of using my neural network:
Training example 1
    (define N (create-neural-net 2 1))
    (define train-data
      '( (#(-1 -1) #(0))
         (#(-1 1) #(1))
         (#(1 -1) #(1))
         (#(1 1) #(1)) ) )
    (N 'test train-data)
    (N 'train train-data)
    (N 'test train-data)

Results in the following output:
    Network outputs before training:
    #(0.11920292202211755)
    #(1/2)
    #(1/2)
    #(0.8807970779778823)
    Average sum squared error before training: 0.13210466830930553
    Trained network for 1000 epochs.
    Network outputs after training:
    #(0.01612309163838964)
    #(0.4838939964137099)
    #(0.5161060035862901)
    #(0.9838769083616103)
    Average sum squared error after training: 0.12525967871775076
Training example 2
    (define N (create-neural-net 4 2))
    (define train-data
      '( (#(1 0 0 0) #(0 1))
         (#(0 1 0 0) #(0 1))
         (#(0 0 1 0) #(1 0))
         (#(0 0 0 1) #(1 0)) ) )
    (N 'test train-data)
    (N 'train train-data)
    (N 'test train-data)

Results in the following output:
    Network outputs before training:
    #(0.7310585786300049 0.7310585786300049)
    #(0.7310585786300049 0.7310585786300049)
    #(0.7310585786300049 0.7310585786300049)
    #(0.7310585786300049 0.7310585786300049)
    Average sum squared error before training: 0.6067761335170363
    Trained network for 1000 epochs.
    Network outputs after training:
    #(0.034080451841329365 0.966593407668862)
    #(0.034080451841329365 0.966593407668862)
    #(0.966593407668862 0.034080451841329365)
    #(0.966593407668862 0.034080451841329365)
    Average sum squared error after training: 0.002277477608888016
Your training data should be structured as a list of pairs of Scheme vectors. For example, in the first training example above, the training data was defined as:
    (define train-data
      '( (#(-1 -1) #(0))
         (#(-1 1) #(1))
         (#(1 -1) #(1))
         (#(1 1) #(1)) ) )
This training data set comprises four examples (the train-data list has four elements). Each example comprises a pair: a vector with two elements (the network inputs), and a vector with a single element (the desired network outputs). This number of inputs and outputs was used because the network in example 1 was created with 2 inputs and 1 output using the function call: (create-neural-net 2 1).
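As an illustration of this layout, here is a minimal sketch of pulling the input vector and the desired-output vector out of each example. The helper names example-inputs and example-targets are hypothetical and not part of the assignment; any accessor style that suits your code will do.

```scheme
;; Hypothetical accessors (not required by the assignment):
;; each example is a two-element list (inputs targets).
(define example-inputs  (lambda (example) (car example)))
(define example-targets (lambda (example) (cadr example)))

(define train-data
  '( (#(-1 -1) #(0))
     (#(-1 1) #(1))
     (#(1 -1) #(1))
     (#(1 1) #(1)) ) )

;; Walk the list and print each input vector next to its target vector.
(for-each
 (lambda (example)
   (display (example-inputs example))
   (display " -> ")
   (display (example-targets example))
   (newline))
 train-data)
```

Because the data is an ordinary list, for-each and map can be used to visit every example once per epoch.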
Object-Oriented Scheme
You should not need to use global variables in your programs other than the neural-net variable, N. All the data for the program can be accessed from this variable, N. The following technique can be used to allow N to be called as a function:
    (define create-object
      (lambda (parameter1 parameter2 etc)
        ;; the trick here is to define some local data members for the object:
        (define data-member1 'a)
        (define data-member2 'b)
        ;; etc
        ;; Then, define a function within this function.
        ;; The following function is the result of calling create-object.
        (lambda (method data)
          ;; method is the method to invoke & data is the data to pass
          (cond
            [(equal? method 'method1)
             (set! data-member1 data) ;; change the data member, data-member1
            ]
            [(equal? method 'method2)
             (display data-member1)
             (newline)
            ]
            [else
             (newline)
             (display "ERROR: Bad operation on object: ")
             (display method) ;; report the unrecognized method name
             (newline)
            ]
          )
        )
      )
    )

    (define Obj (create-object 1 2 3))
    (Obj 'method2 'dummy) ;; outputs a
    (Obj 'method1 100)
    (Obj 'method2 'dummy) ;; outputs 100

You should define your neural network, and any parameters for the neural net (e.g., learning rate), as data members within the create-neural-net function. You will also need to write a variety of auxiliary (helper) functions, which are called when you invoke methods of a neural net object.
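To make the data-member idea concrete for a network, the following is a rough sketch only, not the required design. The name create-neural-net-sketch, the methods 'weights and 'learning-rate, and the choice of one weight vector per output unit with every weight starting at 1 are all assumptions made for illustration; your object must provide train and test.

```scheme
;; Sketch only: a closure holding a weight table and a learning rate.
(define create-neural-net-sketch
  (lambda (num-inputs num-outputs)
    ;; Assumed layout: one weight vector per output unit,
    ;; every weight initialized to 1.
    (define weights
      (let ((rows (make-vector num-outputs 0)))
        (do ((j 0 (+ j 1)))
            ((= j num-outputs) rows)
          (vector-set! rows j (make-vector num-inputs 1)))))
    (define learning-rate 0.5)
    ;; The returned procedure dispatches on the method name.
    (lambda (method data)
      (cond ((equal? method 'weights) weights)
            ((equal? method 'learning-rate) learning-rate)
            (else
             (display "ERROR: Bad operation on object: ")
             (display method)
             (newline))))))

(define N (create-neural-net-sketch 2 1))
(display (N 'weights 'dummy))
(newline)
```

Each call to create-neural-net-sketch builds fresh vectors, so two networks created this way never share weights.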
Vectors in Scheme
You should use vectors in Scheme to represent the inputs and outputs (e.g., training data) and weights of your network. Vectors are one-dimensional arrays. Vector constants are written like lists prefixed with a '#' character. For example, '#(1 2) is a length 2 vector with components 1 and 2. There are various useful functions associated with vectors. For example:
(make-vector Length)
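Creates a vector with the specified length. An optional second argument gives the initial value of every element; for example, (make-vector 3 0) creates a vector of three zeros.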
(vector-ref V K)
Returns element K of vector V, and is like V[K] in C or C++. Vectors are indexed starting at element 0.
For example: (vector-ref '#(a b) 1) returns b.
(vector-set! V K Obj)
Replaces element K in vector V with Obj. This is like V[K] = Obj in C or C++.
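The three functions above can be combined as in the short sketch below. The helper vector-sum is a hypothetical name introduced here for illustration; it uses a do loop together with vector-length to visit every element.

```scheme
;; Build a length-3 vector of zeros, then overwrite each element.
(define v (make-vector 3 0))
(vector-set! v 0 10)
(vector-set! v 1 20)
(vector-set! v 2 30)

;; Hypothetical helper: add up all the elements of vec.
(define vector-sum
  (lambda (vec)
    (do ((k 0 (+ k 1))
         (total 0 (+ total (vector-ref vec k))))
        ((= k (vector-length vec)) total))))

(display (vector-sum v)) ;; displays 60
(newline)
```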
Computing Outputs
In order to compute the outputs of a neural network, you need a vector of inputs, and a vector of weights. Of course, you also need to know the architecture of the network. In our case, the architecture will involve a single layer of weights (one layer of inputs, one layer of outputs), and so you just need to know the number of inputs and the number of outputs to specify the architecture.
Outputs will be computed, as discussed in class, using summed weighted inputs and the sigmoidal activation function. Use sigma = 1 for your sigmoidal activation function.
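A minimal sketch of the forward computation follows, under some assumptions: with sigma = 1 the sigmoidal is taken to be s(x) = 1 / (1 + e^(-x)), the weights are stored as one vector per output unit, and there is no bias unit. The sample outputs in training example 1 are consistent with these choices and with initial weights of 1, but that is an inference rather than something stated in the assignment. The names sigmoid, weighted-sum, and compute-outputs are hypothetical.

```scheme
;; Sigmoidal with sigma = 1 (assumed form): s(x) = 1 / (1 + e^(-x)).
(define sigmoid
  (lambda (x) (/ 1 (+ 1 (exp (- x))))))

;; Summed weighted input for one output unit.
(define weighted-sum
  (lambda (weights inputs)
    (do ((i 0 (+ i 1))
         (total 0 (+ total (* (vector-ref weights i)
                              (vector-ref inputs i)))))
        ((= i (vector-length inputs)) total))))

;; weight-rows holds one weight vector per output unit (assumed layout).
(define compute-outputs
  (lambda (weight-rows inputs)
    (let ((outputs (make-vector (vector-length weight-rows) 0)))
      (do ((j 0 (+ j 1)))
          ((= j (vector-length weight-rows)) outputs)
        (vector-set! outputs j
                     (sigmoid (weighted-sum (vector-ref weight-rows j)
                                            inputs)))))))

;; One output unit with both weights equal to 1, inputs (-1, 1):
;; the summed input is 0, so the output is one half.
(display (compute-outputs (vector (vector 1 1)) (vector -1 1)))
(newline)
```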
Sum Squared Error
You should compute the average sum squared error of your network at least before training and after training. I suggest training your network for 1000 epochs (cycles through all of the data), though you may find the average sum squared error stops changing quite a few epochs prior to 1000.
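To compute the average sum squared error (SSE), compute the SSE through your network for each sample, and average the SSE over all training samples. In the following equation, the network has p output units and there are n training samples; t_{s,k} is the desired value and o_{s,k} the actual value of output unit k for sample s.

$$\text{average SSE} = \frac{1}{n} \sum_{s=1}^{n} \sum_{k=1}^{p} \left( t_{s,k} - o_{s,k} \right)^2$$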
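The averaging step can be sketched as below. The names sample-sse and average-sse are hypothetical, and the sketch assumes no 1/2 factor in the per-sample error, which is what the error values reported in the training examples suggest. The output vectors are passed in as a list here to keep the example self-contained; in your program they would come from the network.

```scheme
;; SSE for one sample: sum over output units of (target - output)^2.
(define sample-sse
  (lambda (targets outputs)
    (do ((k 0 (+ k 1))
         (total 0 (+ total
                     (let ((diff (- (vector-ref targets k)
                                    (vector-ref outputs k))))
                       (* diff diff)))))
        ((= k (vector-length targets)) total))))

;; Average the per-sample SSE over two parallel lists of vectors.
(define average-sse
  (lambda (target-list output-list)
    (/ (apply + (map sample-sse target-list output-list))
       (length target-list))))

;; Targets and (rounded) pre-training outputs from training example 1.
(display (average-sse
          (list (vector 0) (vector 1) (vector 1) (vector 1))
          (list (vector 0.1192) (vector 0.5) (vector 0.5) (vector 0.8808))))
(newline)
```

With these rounded outputs the result is approximately 0.1321, in line with the pre-training error reported for training example 1.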
Delta Rule
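Train your networks using the delta rule to update weights. A learning rate of 0.5 should be reasonable. The delta rule is given by:

$$\Delta w_{ik} = \eta \, (t_k - o_k) \, f'(net_k) \, x_i$$

where \eta is the learning rate, x_i is input i, t_k and o_k are the desired and actual values of output unit k, net_k is the summed weighted input to output unit k, and f'(x) is the first derivative of the sigmoidal: f'(x) = s(x)(1 - s(x)), where s(x) is the sigmoidal of x.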
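A hedged sketch of a single delta-rule step for one output unit is shown below. It assumes the weight change for each weight is the learning rate times (desired output minus actual output) times f' of the summed input times the corresponding input, and that weights are updated after each sample; whether to update per sample or per epoch should follow what was covered in class. The name update-unit-weights! is hypothetical.

```scheme
;; Sigmoidal with sigma = 1 (assumed form).
(define sigmoid
  (lambda (x) (/ 1 (+ 1 (exp (- x))))))

;; Update, in place, the weights feeding one output unit:
;;   w_i <- w_i + rate * (target - output) * output * (1 - output) * x_i
;; where output * (1 - output) is f'(net) for the sigmoidal.
(define update-unit-weights!
  (lambda (weights inputs target rate)
    (let* ((net (do ((i 0 (+ i 1))
                     (total 0 (+ total (* (vector-ref weights i)
                                          (vector-ref inputs i)))))
                    ((= i (vector-length inputs)) total)))
           (output (sigmoid net))
           (delta (* rate (- target output) output (- 1 output))))
      (do ((i 0 (+ i 1)))
          ((= i (vector-length weights)) weights)
        (vector-set! weights i
                     (+ (vector-ref weights i)
                        (* delta (vector-ref inputs i))))))))

;; One step on the first sample of training example 1,
;; starting from weights of 1 (an assumed starting point).
(define w (vector 1 1))
(update-unit-weights! w (vector -1 -1) 0 0.5)
(display w)
(newline)
```

Here the target is 0 and both inputs are -1, so both weights grow slightly; that makes the summed input more negative and moves the output toward 0.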
Submission: By the due date, turn in a hard copy (paper) of your program code and a hard copy of your testing, which should include at least the two training examples above. You should also email your program file(s) to the course TA (Prashant Jain).