The Task

Let's try multilayer neural networks.

First, get version 3.3W of the matrix library. I have added a couple of things to make your life even easier.

Build a neural network in C/C++ with a single hidden layer, as we saw in class. It should solve a problem that consists of n inputs and m outputs. You will build two similar programs called nn and nnoneof. The code for the two is very similar, differing only slightly in the output section. Please use the matrix library to make your life easier. The batch two-layer neural network math in the two-layer neural network handout is geared toward a matrix solution, and it is much easier for me to grade. WARNING: this assignment is harder than it looks.
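
If you want a picture of what one batch weight update looks like before you read the handout, here is a rough sketch for a single-hidden-layer network. It is not the handout's notation and it does not use the matrix library: the names (W1, W2, batchStep, eta), the plain-vector types, and the omission of bias terms are my assumptions, so follow the handout wherever they disagree.

#include <cmath>
#include <vector>
using Mat = std::vector<std::vector<double>>;

// classic sigmoid from the assignment
double f(double x) { return 1.0 / (1.0 + std::exp(-4.0 * x)); }

// One batch gradient step.  X is N x nIn, T is N x nOut (targets),
// W1 is nIn x nHid, W2 is nHid x nOut.  Biases are omitted here.
void batchStep(const Mat &X, const Mat &T, Mat &W1, Mat &W2, double eta)
{
    int N = X.size(), nIn = W1.size(), nHid = W2.size(), nOut = W2[0].size();
    Mat H(N, std::vector<double>(nHid)), O(N, std::vector<double>(nOut));
    Mat dH(N, std::vector<double>(nHid)), dO(N, std::vector<double>(nOut));

    for (int r = 0; r < N; r++) {
        for (int j = 0; j < nHid; j++) {            // hidden layer: H = f(X W1)
            double s = 0;
            for (int i = 0; i < nIn; i++) s += X[r][i] * W1[i][j];
            H[r][j] = f(s);
        }
        for (int k = 0; k < nOut; k++) {            // output layer: O = f(H W2)
            double s = 0;
            for (int j = 0; j < nHid; j++) s += H[r][j] * W2[j][k];
            O[r][k] = f(s);
        }
        for (int k = 0; k < nOut; k++)              // output deltas (df/dx = 4 f (1-f))
            dO[r][k] = (T[r][k] - O[r][k]) * 4.0 * O[r][k] * (1.0 - O[r][k]);
        for (int j = 0; j < nHid; j++) {            // hidden deltas, back-propagated through W2
            double s = 0;
            for (int k = 0; k < nOut; k++) s += dO[r][k] * W2[j][k];
            dH[r][j] = s * 4.0 * H[r][j] * (1.0 - H[r][j]);
        }
    }
    for (int j = 0; j < nHid; j++)                  // accumulate W2 updates over the whole batch
        for (int k = 0; k < nOut; k++) {
            double g = 0;
            for (int r = 0; r < N; r++) g += H[r][j] * dO[r][k];
            W2[j][k] += eta * g;
        }
    for (int i = 0; i < nIn; i++)                   // accumulate W1 updates over the whole batch
        for (int j = 0; j < nHid; j++) {
            double g = 0;
            for (int r = 0; r < N; r++) g += X[r][i] * dH[r][j];
            W1[i][j] += eta * g;
        }
}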

The Training

Your programs will read in training data from standard input. You should then scale your input to be between 0 and 1. Your program can then train as much as you believe is needed. It will then read in test data. For this assignment, train on all the training data and then run the trained network on the test data. Do not train on the test data! We will deal with cross validation later. The input file looks like:

#inputs #hiddenNodes #ofclasses
#rows #cols 
row1
row2
 ...
lastrow
#rows #cols 
row1
row2
 ...
lastrow
The training and test data each look like a matrix specification for the matrix library. The first number is the number of features or inputs, then the number of hidden nodes you should use, then the number of classes (for binary it is 2; for the iris problem there are 3 classes). All elements in a row after the first #inputs elements are the expected outputs in the training data, so the #cols in the training matrix is greater than #inputs. The first #inputs columns of the test matrix are its inputs; the remaining columns hold the expected values used for the output stats. Your program will read in the training data and then train. (I used an eta of 0.1 and 10000 iterations.) Your network is a two layer network with the classic sigmoid transfer function: 1.0/(1.0 + exp(-4.0 * x)). Then it will read in the test data and use the trained network on the test data to derive the outputs.
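
Here is a rough sketch of the I/O flow with plain vectors, in case it helps you see the shape of the program before you wire in the matrix library (which has its own readers, so prefer those). The per-column min/max scaling of the inputs to [0,1] using the training data's range is my assumption about how to scale, not a requirement from the handout.

#include <cstdio>
#include <vector>
#include <algorithm>
using Mat = std::vector<std::vector<double>>;

Mat readMat()                       // reads "#rows #cols" then the elements
{
    int rows, cols;
    scanf("%d %d", &rows, &cols);
    Mat m(rows, std::vector<double>(cols));
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++) scanf("%lf", &m[r][c]);
    return m;
}

int main()
{
    int nIn, nHid, nClasses;
    scanf("%d %d %d", &nIn, &nHid, &nClasses);
    Mat train = readMat();          // first #inputs columns are inputs, the rest are targets
    Mat test  = readMat();

    // scale each input column to [0,1] using the training data's min and max
    for (int c = 0; c < nIn; c++) {
        double lo = train[0][c], hi = train[0][c];
        for (size_t r = 0; r < train.size(); r++) {
            lo = std::min(lo, train[r][c]);
            hi = std::max(hi, train[r][c]);
        }
        double span = (hi > lo) ? hi - lo : 1.0;
        for (size_t r = 0; r < train.size(); r++) train[r][c] = (train[r][c] - lo) / span;
        for (size_t r = 0; r < test.size();  r++) test[r][c]  = (test[r][c]  - lo) / span;
    }
    // ... train for about 10000 iterations with eta = 0.1, then run the test rows ...
    return 0;
}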

Here is some example input that tests the two-input logical functions with three outputs (and, xor, and implication). You can see there are two inputs, three hidden nodes, a matrix for training, and a matrix for testing.

2 3 2
4 5
0 0 0 0 1
0 1 0 1 1
1 0 0 1 0
1 1 1 0 1
4 5
0 0 0 0 1
0 1 0 1 1
1 0 0 1 0
1 1 1 0 1
In the case of the iris data sets you'll see the training is conveniently in "one of 3" form, while the testing is by species number. The test data testDataA2.tar shows more examples of the input format.

output stats

After training and printing out the results for the test data, show the confusion matrix for the test run. Columns represent actual values and rows represent predicted values. There is an inc() function in the matrix library. Be sure to print out one confusion matrix for each output column in the test matrix. See the results returned from the test machine.
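
In case it helps, here is one way to tally and print a single confusion matrix with a plain 2-D array. The inc() function in the matrix library presumably does the cell-bumping for you; I have not shown its exact call here, and the function name confusion() is just mine.

#include <cstdio>
#include <vector>

// Tally and print one K x K confusion matrix.  Rows are predicted values,
// columns are actual values, matching the convention above.
void confusion(const std::vector<int> &predicted, const std::vector<int> &actual, int K)
{
    std::vector<std::vector<int>> cm(K, std::vector<int>(K, 0));
    for (size_t i = 0; i < predicted.size(); i++)
        cm[predicted[i]][actual[i]]++;      // this is the step inc() would handle
    for (int r = 0; r < K; r++) {
        for (int c = 0; c < K; c++) printf("%6d", cm[r][c]);
        printf("\n");
    }
}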

Here are a few things unique to each of the two programs. These required differences are found only in the final stage of processing for testing.

nn program

To compute the output matrix for the test data, run the step function:

if (x>.5) return 1.0; else return 0.0;
on each element of the output. Most of the test data for nn are binary logic functions. Then print out the resulting matrix. By popular demand, I have added a function to the matrix library called printFmt that gives you a formatted print for each matrix. I used that.
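
If you are not using the library's helpers, thresholding and printing can be as simple as the loop below (plain nested vectors, my own function name; printFmt in the matrix library is the tidier route).

#include <cstdio>
#include <vector>

// Apply the step function to every element of the test output and print it.
void printStepped(const std::vector<std::vector<double>> &out)
{
    for (size_t r = 0; r < out.size(); r++) {
        for (size_t c = 0; c < out[r].size(); c++)
            printf("%g ", out[r][c] > 0.5 ? 1.0 : 0.0);
        printf("\n");
    }
}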

Then print the stats which consist of a confusion matrix for each column of output.

nnoneof program

The nnoneof program assumes that the K output channels are mutually exclusive and that the desired value is the index of the channel with the maximum element. This is the argMax of every row. So the program is exactly the same as the nn program except that the output is the index of the maximum element of the output vector. For example, if the outputs for a given row of input are 1.414 2.718 3.14 1.618, then the answer is 2, which is the index of the 3rd element (zero based array). Therefore, for the iris data it will print out a single column of numbers that is the species number: 0, 1, or 2. Then the program will print the K x K confusion matrix.
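
A per-row argMax is really all nnoneof adds. A minimal version over a plain vector might look like the sketch below (the matrix library may well provide something equivalent; check it first).

#include <vector>

// Index of the largest element in one output row (zero based),
// e.g. {1.414, 2.718, 3.14, 1.618} -> 2.
int argMax(const std::vector<double> &row)
{
    int best = 0;
    for (int i = 1; i < (int)row.size(); i++)
        if (row[i] > row[best]) best = i;
    return best;
}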

This program will be used for the iris data in the test data, in which a single class is determined by picking the largest of the outputs, with each output representing whether a class has been selected. One could have used softmax, which is also common for this kind of choice, but we will not.

I encourage you to use my random number generator and matrix objects. Don't forget to call initRand() before using anything with random numbers. If you do use them, include them in your tar file! Do not use any other prepackaged software without asking.

Your code must compile and run on the class unix machine. If it does not compile, fails to run (e.g. gets seg faults), or in other ways produces no output, that is a very serious fault and will result in a very poor score. In 4xx/5xx CS classes your code should at least run and produce reasonable output.

makefile

Your makefile must take two targets: nn and nnoneof, meaning I can run the command make nn or make nnoneof and it will make the corresponding executable. Here is a sample makefile:

BIN  = nn

CXX=g++
SHELL=/bin/sh

# optimization flags
#CXXFLAGS=-O3 -Wall
# debugging flags
CXXFLAGS=-g -Wall
LIBS=

SRCS=\
nn.cpp\
mat.cpp\
rand.cpp

HDRS=\
mat.h\
rand.h

OBJS=\
mat.o\
rand.o

$(BIN): $(OBJS) $(BIN).o
	$(CXX) $(CXXFLAGS) $(OBJS) $(BIN).o $(LIBS) -o $(BIN)

$(BIN)oneof: $(OBJS) $(BIN)oneof.o
	$(CXX) $(CXXFLAGS) $(OBJS) $(BIN)oneof.o $(LIBS) -o $(BIN)oneof

clean:
	/bin/rm -f *.o a.out

Grading will be based on matching my output. Since these programs are stochastic, I will check that the format of your output matches and that the values seem reasonable. The test suite will do a side by side comparison using the UNIX sdiff tool. See the information on the class page about testing.

Submission

Tar up all the code necessary, along with a makefile to build the programs nn and nnoneof, which read the sample data from stdin as described above. Homework should be submitted as an uncompressed tar file to the homework submission page linked from the class web page. You can submit as many times as you like. The LAST file you submit BEFORE the deadline will be the one graded. For all submissions you will receive email at your uidaho.edu mail address giving you some automated feedback on the unpacking, compiling, and running of your code, and possibly some other things that can be autotested.

Have fun.