
The goal of this assignment is to design and implement a program that recognizes people's handwriting.

We will implement this with neural networks in Matlab. Matlab has an extensive neural network toolbox that largely removes the need to build the nuts-and-bolts components of a neural network by hand.

This assignment is divided into three parts. Part one is an introduction to neural nets in Matlab. Part two uses neural nets to recognize computer-generated characters, with and without noise. Part three attempts to use neural nets to recognize handwritten characters.

The Downloads section also has a .zip file with all the source code.

Part 1

Source Code

p1 = [-1 2 -1 2; 2 2 5 5];
t1 = [-1 -1 1 1];

p2 = [2 0 -2 -4; 0 1 2 3];
t2 = [-1 -1 1 1];

p3 = [1 0;5 5];
t3 = [-1 1];

p4 = [1 0;0 0];
t4 = [-1 1];

net =newff(minmax(p1),[3,1],{'tansig','purelin'},'traingd');

net.trainParam.lr = .05;        %Learning rate
net.trainParam.epochs = 300;    %Max epochs
net.trainParam.goal = 1e-5;     %Training goal in mean squared error
net.trainParam.show = 50;       %# of epochs between displays

[net,tr1] = train(net,p1,t1);
o1 = sim(net,p1)

net = newff(minmax(p2),[3,1],{'tansig','purelin'},'traingd');
[net,tr2] = train(net,p2,t2);
o2 = sim(net,p2);

net = newff(minmax(p3),[3,1],{'tansig','purelin'},'traingd');
[net,tr3] = train(net,p3,t3);
o3 = sim(net,p3);

net = newff(minmax(p4),[3,1],{'tansig','purelin'},'traingd');
[net,tr4] = train(net,p4,t4);
o4 = sim(net,p4);
		

Output Images

Output 1

Output 2

Output 3

Output 4

Analysis

Switching the symmetry of the inputs also changes the signs of the outputs. For example, in the training vector the second and third numbers are -1 and 2, respectively, while in p1 they are 2 and -1. This reversal of signs causes the second output to be negative. Also, if you look at the o and o1 output vectors, you will see that the value -1.0080 repeats. This is because the third and fourth numbers are now -1 and 2. In the end, however, the outputs depend on how the neural network was trained.
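One way to see why flipping the sign of the inputs can flip the sign of the outputs is to write out what a 3-neuron tansig hidden layer followed by a purelin output computes. The sketch below is a hand-rolled forward pass in plain Matlab, not the toolbox's `sim`. The weights `W1` and `W2` and the helper `forward` are made-up values for illustration, and the biases are assumed to be zero. Under that assumption the network is an odd function, so negating the input negates the output exactly. A trained network generally has non-zero biases, so the relationship need not hold there.

```matlab
% Hypothetical weights for a 2-input, 3-hidden-neuron, 1-output network.
W1 = [0.5 -0.3; 0.1 0.8; -0.7 0.2];   % hidden layer weights (3x2)
b1 = zeros(3,1);                      % hidden biases, assumed zero
W2 = [1.0 -0.5 0.25];                 % output layer weights (1x3)
b2 = 0;                               % output bias, assumed zero

% tansig is mathematically equal to tanh; purelin is the identity.
% The bias column is replicated so it is added to every sample.
forward = @(p) W2 * tanh(W1 * p + b1 * ones(1, size(p, 2))) + b2;

p  = [-1 2 -1 2; 2 2 5 5];            % same layout as p1: one column per sample
y  = forward(p);                      % outputs for the original inputs
yn = forward(-p);                     % outputs for the sign-flipped inputs

disp([y; yn]);                        % second row is the negative of the first
```

Setting `b1` or `b2` to non-zero values in this sketch breaks the exact symmetry, which is one reason the trained outputs in Part 1 depend on the training run.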

Part 2

Source Code

[alphabet,targets] = prprob;
net = newff(alphabet,targets,25);
net1 = net;
net1.divideFcn = '';
[net1,tr] = train(net1,alphabet,targets);
numNoisy = 10;
alphabet2 = [alphabet repmat(alphabet,1,numNoisy)+randn(35,26*numNoisy)*0.2];
targets2 = [targets repmat(targets,1,numNoisy)];
net2 = train(net,alphabet2,targets2);
noise_range = 0:.05:.5;
max_test = 100;
network1 = [];
network2 = [];

for i = 0:4
    sumerr = 0;
    figure
    noisyR = alphabet(:,18)+randn(35,1) * 0.2*i;   % noise std is 0.2*i
    plotchar(noisyR);
    A2 = sim(net2,noisyR);
    answer = find(compet(A2) == 1);                % index of the winning letter
    figure
    plotchar(alphabet(:,answer));
    for j = 1:35
        % square each pixel difference so opposite errors cannot cancel
        sumerr = sumerr + (alphabet(j,18)-alphabet(j,answer))^2;
    end
    mserr(i+1) = sumerr/35;
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "R"')
xlabel('Noise Level')
ylabel('Mean Squared Error')

%The network usually stops recognizing the R at around the .6 noise level.
%It then recognizes the R as a P, a B, or a K, all of which are similar in
%structure to the R.  Based on the results of the original network tests, I
%would expect a higher percentage of error than has been shown.  The
%original noise-trained neural network experienced a 60% error rate at just
%the .4 noise level, yet the network that we're using seems to perform much
%more reliably at .4.

for i = 0:4
    sumerr = 0;                                    % reset for each noise level
    figure
    noisyA = alphabet(:,1)+randn(35,1) * 0.2*i;    % noise std is 0.2*i
    plotchar(noisyA);
    A2 = sim(net2,noisyA);
    answer = find(compet(A2) == 1);                % index of the winning letter
    figure
    plotchar(alphabet(:,answer));
    for j = 1:35
        % square each pixel difference so opposite errors cannot cancel
        sumerr = sumerr + (alphabet(j,1)-alphabet(j,answer))^2;
    end
    mserr(i+1) = sumerr/35;
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "A"')
xlabel('Noise Level')
ylabel('Mean Squared Error')

for i = 0:4
    sumerr = 0;                                    % reset for each noise level
    figure
    noisyO = alphabet(:,15)+randn(35,1) * 0.2*i;   % noise std is 0.2*i
    plotchar(noisyO);
    A2 = sim(net2,noisyO);
    answer = find(compet(A2) == 1);                % index of the winning letter
    figure
    plotchar(alphabet(:,answer));
    for j = 1:35
        % square each pixel difference so opposite errors cannot cancel
        sumerr = sumerr + (alphabet(j,15)-alphabet(j,answer))^2;
    end
    mserr(i+1) = sumerr/35;
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "O"')
xlabel('Noise Level')
ylabel('Mean Squared Error')

for i = 0:4
    sumerr = 0;                                    % reset for each noise level
    figure
    noisyZ = alphabet(:,26)+randn(35,1) * 0.2*i;   % noise std is 0.2*i
    plotchar(noisyZ);
    A2 = sim(net2,noisyZ);
    answer = find(compet(A2) == 1);                % index of the winning letter
    figure
    plotchar(alphabet(:,answer));
    for j = 1:35
        % square each pixel difference so opposite errors cannot cancel
        sumerr = sumerr + (alphabet(j,26)-alphabet(j,answer))^2;
    end
    mserr(i+1) = sumerr/35;
end
x = 0:.2:.8;
plot(x,mserr)
title('Mean Squared Error for Noisy "Z"')
xlabel('Noise Level')
ylabel('Mean Squared Error')

%The noise level definitely affects the more distinctive letters less.
%For example, A and Z are both easily detected by the computer even at noise
%level .6 because they have very distinctive shapes that are not shared by
%other letters.  However, the O, which is very similar to letters like C and
%G, is much harder to distinguish at high noise levels.

eight = [0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0];
for i = 0:4
    figure
    noisy8 = eight+randn(35,1) * 0.2*i;            % noise std is 0.2*i
    plotchar(noisy8);
    A2 = sim(net2,noisy8);
    answer = find(compet(A2) == 1);                % index of the closest letter
    figure
    plotchar(alphabet(:,answer));
end

%When an input that the NN wasn't trained on is loaded, the neural network
%will of course not be able to tell what it is.  However, it should be able
%to tell you which of the letters it is closest to.  This means that if you
%input an 8, you would expect to get back a B, since they have only four
%squares of difference between them.  In testing, though, the network
%tended to match the 8 to a P.
		

Output Images

Correlation

Letter R

The network usually stops recognizing the R at around the .6 noise level. It then recognizes the R as a P, a B, or a K, all of which are similar in structure to the R. Based on the results of the original network tests, I would expect a higher percentage of error than has been shown. The original noise-trained neural network experienced a 60% error rate at just the .4 noise level, yet the network that we're using seems to perform much more reliably at .4.
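The error plotted for each noise level compares the clean bitmap of the true letter with the bitmap of the letter the network picked. A minimal sketch of a mean squared error between two 35-element bitmaps follows. The vectors `truth` and `guess` are made-up stand-ins for columns of `alphabet`. Each difference is squared before averaging, so that pixels that differ in opposite directions cannot cancel each other out.

```matlab
truth = zeros(35,1); truth([1 2 3]) = 1;   % hypothetical clean bitmap
guess = zeros(35,1); guess([3 4 5]) = 1;   % hypothetical recognized bitmap

d = truth - guess;             % per-pixel difference: +1, -1 or 0
mse         = mean(d.^2);      % mean of the squared differences
squaredMean = mean(d)^2;       % square of the mean difference (not an MSE)

fprintf('MSE = %.4f, squared mean = %.4f\n', mse, squaredMean);
```

In this made-up case four pixels differ, so the MSE is 4/35, while the squared mean is 0 because the two +1 and two -1 differences cancel.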

Letter A

Letter O

Letter Z

The noise level definitely affects the more distinctive letters less. For example, A and Z are both easily detected by the computer even at noise level .6 because they have very distinctive shapes that are not shared by other letters. However, the O, which is very similar to letters like C and G, is much harder to distinguish at high noise levels.
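One rough way to put a number on "distinctive" is to count, for each letter, how many pixels separate it from its closest other letter. The sketch below does this for three hand-drawn 7x5 bitmaps. These bitmaps are hypothetical and are not the `prprob` templates, so the numbers only illustrate the idea: a letter with a small margin needs less noise to be pushed toward a neighbor.

```matlab
% Hypothetical 7x5 bitmaps (NOT the prprob templates).
O = [0 1 1 1 0; 1 0 0 0 1; 1 0 0 0 1; 1 0 0 0 1; 1 0 0 0 1; 1 0 0 0 1; 0 1 1 1 0];
C = [0 1 1 1 0; 1 0 0 0 1; 1 0 0 0 0; 1 0 0 0 0; 1 0 0 0 0; 1 0 0 0 1; 0 1 1 1 0];
Z = [1 1 1 1 1; 0 0 0 0 1; 0 0 0 1 0; 0 0 1 0 0; 0 1 0 0 0; 1 0 0 0 0; 1 1 1 1 1];

T = [O(:) C(:) Z(:)];          % one 35-element column per letter
n = size(T, 2);

% Pairwise Hamming distance: number of pixels where two letters differ.
D = zeros(n);
for a = 1:n
    for b = 1:n
        D(a,b) = sum(T(:,a) ~= T(:,b));
    end
end

% Margin of each letter: distance to its closest *other* letter.
D(logical(eye(n))) = Inf;      % ignore the zero distance to itself
margin = min(D, [], 2)';       % order: O, C, Z

disp(margin);
```

With these made-up bitmaps the O and the C are only 3 pixels apart, while the Z is at least 12 pixels from either, which matches the intuition that the Z survives more noise.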

Number 8

When an input that the NN wasn't trained on is loaded, the neural network will of course not be able to tell what it is. However, it should be able to tell you which of the letters it is closest to. This means that if you input an 8, you would expect to get back a B, since they have only four squares of difference between them. In testing, though, the network tended to match the 8 to a P.
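To check how close the 8 is to a given letter, it helps to view the 35-element vector as a picture and to count differing squares directly. The sketch below reshapes the `eight` vector from the listing above into its 7-row by 5-column form and defines a small pixel-difference counter. The helper `countDiff` and the `variant` bitmap are illustrative names that do not appear in the original code.

```matlab
% The 35-element vector for "8" from the listing above.
eight = [0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0;1;0;0;0;1;1;0;0;0;1;0;1;1;1;0];

% The vector stores the character row by row, 5 pixels per row, so reshape
% to 5x7 and transpose to get the 7-row by 5-column picture.
picture = reshape(eight, 5, 7)';
disp(picture);

% Count differing pixels between two bitmaps after thresholding at 0.5.
countDiff = @(a, b) sum((a > 0.5) ~= (b > 0.5));

% Hypothetical variant of the "8" with the top-left corner pixel filled in.
variant = eight;
variant(1) = 1;
nDiff = countDiff(eight, variant);   % one pixel differs
```

Assuming `alphabet` from `prprob` is in the workspace, `countDiff(eight, alphabet(:,2))` and `countDiff(eight, alphabet(:,16))` would give the pixel counts against the stored B and P for comparison.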

Part 3

Source Code

clear all
MNIST = load('MNIST_data.mat');

train_samp = MNIST.train_samples';
temp = MNIST.train_samples_labels';
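% One target column per training sample: the row for the sample's digit
% is set to 10 and every other row stays 0.
train_labl = zeros(10,4000);
for ii=1:4000
    train_labl(temp(ii)+1,ii) = 10;
end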

net = newff(train_samp,train_labl,[784,196,49,24],...
            {'logsig','logsig','logsig','logsig'},'trainrp');
net.divideFcn = 'dividerand';
net.trainParam.lr = .25;
net.trainParam.epochs = 300;
net.trainParam.goal = 0;
net.trainParam.show = 50;
net = train(net,train_samp,train_labl);

confusionmat = zeros(10,10);
for ii=1:1000
    result = sim(net,MNIST.test_samples(ii,:)');
    label = MNIST.test_samples_labels(ii);

    [~,index] = max(result);    % strongest output; a tie yields a single index

    confusionmat(index,label+1) = confusionmat(index,label+1) + 1;
end
confusionmat

numCorrect = 0;    % named numCorrect to avoid shadowing the built-in sum
for ii = 1:10
    numCorrect = numCorrect + confusionmat(ii,ii);
end
errorpct = (1-(numCorrect/1000)) * 100

confusionmat =

    74     0     0     1     0     2     2     0     0     1
     0   120     2     0     0     1     0     6     0     1
     0     0    85     1     1     0     1     2     3     1
     0     1     2    93     2     3     0     0     7     1
     2     0     1     2    87     1     1     2     6    16
     7     0     1     5     4    74     3     1     3     2
     1     0     2     1     2     2    76     0     3     0
     0     0     2     4     0     0     0    74     0     4
     0     1    13     4     5     9     3     1    53     1
     2     0     5     4     7     0     1    13    11    65


errorpct =

   19.9000

Analysis

We implemented a four-layer net. Our error rate was always between 10% and 20%. We mostly used the 'logsig' transfer function for our neurons because it yielded better performance.

The training process was slow, taking about 20 minutes per run. We then set the divideFcn parameter to 'dividerand', which holds out a random part of the data for validation. This cut down the training time significantly by stopping training when the validation results started getting worse. We also tried representing the images as weighted sums, weighted averages, and weighted sums by row and column. All of these methods failed: none of them achieved an error rate below 90%.
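The error rate can also be read off the confusion matrix printed above, along with a per-digit breakdown. The sketch below copies that matrix (rows are the predicted digit, columns are the true digit, as in the loop that fills `confusionmat`) and computes the overall error and the fraction of each true digit that was recognized. The names `C`, `recall`, and `worstDigit` are illustrative.

```matlab
% Confusion matrix from the run above: rows = predicted, columns = true digit.
C = [ 74   0   0   1   0   2   2   0   0   1
       0 120   2   0   0   1   0   6   0   1
       0   0  85   1   1   0   1   2   3   1
       0   1   2  93   2   3   0   0   7   1
       2   0   1   2  87   1   1   2   6  16
       7   0   1   5   4  74   3   1   3   2
       1   0   2   1   2   2  76   0   3   0
       0   0   2   4   0   0   0  74   0   4
       0   1  13   4   5   9   3   1  53   1
       2   0   5   4   7   0   1  13  11  65 ];

numCorrect = trace(C);                          % correctly classified samples
errorpct   = (1 - numCorrect / sum(C(:))) * 100;

% Fraction of each true digit that was recognized (diagonal / column total).
recall = diag(C)' ./ sum(C, 1);
[~, worst] = min(recall);
worstDigit = worst - 1;                         % columns 1..10 are digits 0..9

fprintf('error = %.1f%%, hardest digit = %d\n', errorpct, worstDigit);
```

For this run the diagonal sums to 801 of 1000 test samples, which reproduces the 19.9% error, and the digit 8 has the lowest recognition rate.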
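The idea behind stopping when validation results get worse can be sketched without the toolbox. The code below walks through a made-up validation error curve and stops once the error has failed to improve for `maxFail` epochs in a row. Both the curve `valErr` and the patience value `maxFail` are assumptions chosen for illustration; the toolbox's own stopping rule and defaults may differ.

```matlab
% Hypothetical validation error per epoch (made-up numbers).
valErr  = [0.90 0.60 0.45 0.40 0.38 0.39 0.41 0.40 0.42 0.45 0.47 0.50];
maxFail = 3;       % assumed patience: epochs allowed without a new best

bestErr   = Inf;   % lowest validation error seen so far
bestEpoch = 0;     % epoch at which it occurred
fails     = 0;     % consecutive epochs without improvement
stopEpoch = numel(valErr);

for epoch = 1:numel(valErr)
    if valErr(epoch) < bestErr
        bestErr   = valErr(epoch);
        bestEpoch = epoch;
        fails     = 0;
    else
        fails = fails + 1;
        if fails >= maxFail
            stopEpoch = epoch;   % give up here and keep the best epoch
            break
        end
    end
end

fprintf('best epoch %d, stopped at epoch %d\n', bestEpoch, stopEpoch);
```

On this made-up curve training stops at epoch 8 instead of running all 12 epochs, and the weights from epoch 5 would be the ones to keep.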

Extra Credit

Scanned Image

Scanned Document

This is the scanned image of sample handwriting. The image was cropped into individual numbers and inverted. Here is an example number:

Sample

Code

clear all

% Load the 30 scanned digits samples/<s>-<d> (three sets of the digits 0-9).
for s = 0:2
    for d = 0:9
        test(s*10+d+1,:,:,:) = imread(sprintf('samples/%d-%d',s,d),'jpg');
    end
end

for ii=1:30
    imgs(ii,:,:) = test(ii,:,:,1);
    labels(ii) = mod((ii-1),10);
end

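for ii=1:30
    % Flatten each image to a 784-element vector and scale it to [0,1].
    vecs(:,ii) = reshape(imgs(ii,:,:),784,1);
    samples(:,ii) = double(vecs(:,ii))./255.0;
    % Contrast step: pixels above .15 keep their value, pixels below .08
    % are scaled by .05, and everything in between is scaled by .1.
    for jj=1:784
        if samples(jj,ii) > .15
            samples(jj,ii) = samples(jj,ii) * 10;
        elseif samples(jj,ii) < .08
            samples(jj,ii) = samples(jj,ii) * .5;
        end

        samples(jj,ii) = samples(jj,ii) / 10;
    end
end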


MNIST = load('MNIST_data.mat');

train_samp = MNIST.train_samples';
temp = MNIST.train_samples_labels';
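% One target column per training sample: the row for the sample's digit
% is set to 10 and every other row stays 0.
train_labl = zeros(10,4000);
for ii=1:4000
    train_labl(temp(ii)+1,ii) = 10;
end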

net = newff(train_samp,train_labl,[784,196,49,24],...
            {'logsig','logsig','logsig','logsig'},'trainrp');
net.divideFcn = 'dividerand';
net.trainParam.lr = .25;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-5;
net.trainParam.show = 50;
net = train(net,train_samp,train_labl);

confusionmat = zeros(10,10);
for ii=1:30
    result = sim(net,samples(:,ii));
    label = labels(ii);

    [~,index] = max(result);    % strongest output; a tie yields a single index

    confusionmat(index,label+1) = confusionmat(index,label+1) + 1;
end
confusionmat

numCorrect = 0;    % named numCorrect to avoid shadowing the built-in sum
for ii = 1:10
    numCorrect = numCorrect + confusionmat(ii,ii);
end
errorpct = (1-(numCorrect/30)) * 100

confusionmat =

     0     0     0     0     0     0     0     0     0     0
     1     2     1     2     1     0     0     0     1     0
     0     0     0     0     0     0     0     1     0     0
     0     0     0     0     0     0     2     1     0     1
     2     1     2     1     2     2     1     1     2     2
     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     1     0     0     0     0
     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0


errorpct =

   86.6667

Analysis

When we ran the simulation with the scanned numbers, our neural net got confused. We think this is because the test sample (30 images) is too small to give accurate data. Other factors that affected the accuracy of our neural net are the quality of the scan and the normalization procedure, which might not have been the same as the one used for the training images.
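To make the normalization easier to inspect and compare against the training images, the per-pixel contrast step from the code above can be written without the inner loop. The sketch below applies the same three cases to a small made-up vector of 8-bit pixel values. The variable `raw` is hypothetical test data, and `dark` and `mid` are illustrative names.

```matlab
% Hypothetical 8-bit pixel values standing in for one scanned digit.
raw = uint8([0 10 30 255]);

s = double(raw) ./ 255.0;      % scale to [0, 1], as in the listing above

out  = s;                      % s > .15: x10 then /10, i.e. left unchanged
dark = s < .08;                % background pixels
mid  = ~dark & s <= .15;       % pixels between the two thresholds
out(dark) = s(dark) * 0.05;    % x0.5 then /10
out(mid)  = s(mid)  * 0.1;     % only the final /10 applies

disp(out);
```

Comparing simple statistics of the result, such as `min(out)`, `max(out)`, and `mean(out)`, with the same statistics for a column of `train_samp` would be one quick way to see whether the scanned digits and the training images are on a similar scale.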

Downloads