Hi,
I am writing neural network code for Unicode optical character recognition, following the CodeProject article "Unicode Optical Character Recognition", and I cannot get it to run correctly.
I started writing the code from scratch and got stuck on the network code.
Network.java
package ocr;

// Unicode Optical character recognition using artificial neural network
import java.lang.Math;
import java.awt.*;
import java.awt.image.*;
import javax.swing.*;
import java.io.*;
import javax.imageio.ImageIO;

class Network {
    /*
      Learning rate = 150
      Sigmoid Slope = 0.014
      Weight bias = 30 (determined by trial and error)
      Number of Epochs = 300-600 (depending on the complexity of the font types)
      Mean error threshold value = 0.0002 (determined by trial and error)
    */
    public int[] input1 = new int[150];          // Input of layer 1 (from other module) (input layer)
    public double[] input2 = new double[250];    // Input of layer 2 (one hidden layer)
    public double[] output2 = new double[250];   // Output of layer 2
    public double[] input3 = new double[16];     // Input of layer 3
    public double[] output3 = new double[16];    // Output of layer 3 (output layer)
    public double[] required = new double[16];   // Required output
    public static double[][] edge1 = new double[150][250]; // Edges from input layer nodes to hidden layer
    public static double[][] edge2 = new double[250][16];  // Edges from hidden layer to output layer
    public double[] error2 = new double[16];     // Back propagation error at output layer
    public double[] error1 = new double[250];    // Back propagation error at middle (1 hidden) layer
    public static double error = 0.0;            // Total error to compare with threshold error
    java.util.List<CharMatrixs> charMatrix;

    // weight initialize
    Network(java.util.List<CharMatrixs> charMatrix) {
        for (int i = 0; i < 150; i++)
            for (int j = 0; j < 250; j++)
                edge1[i][j] = Math.random() * 500; // initialize weights of edge1
        for (int i = 0; i < 250; i++)
            for (int j = 0; j < 16; j++)
                edge2[i][j] = Math.random() * 500; // initialize weights of edge2
        this.charMatrix = charMatrix;
    }

    public void test() { // Start training or testing of network
        do {
            for (int charIndex = 0; charIndex < Char.charNum; charIndex++) { // Runs once for each character read from the image
                int index = 0;
                for (int row = 0; row < 15; row++)
                    for (int column = 0; column < 10; column++)
                        input1[index++] = charMatrix.get(charIndex).matrix[row][column]; // linearize the 2-D matrix into a single array for the input layer

                // required
                Unicode reqCharOP = new Unicode();
                java.util.ArrayList<Required> req_OP = reqCharOP.reqOP;
                required = req_OP.get(charIndex).required_op; // required = 16 bits of the character read at charIndex

                for (int j = 0; j < 250; j++) {
                    input2[j] = 0;
                    for (int i = 0; i < 150; i++) {
                        input2[j] = input2[j] + input1[i] * edge1[i][j] + 30; // Calculating input for layer 2 for node j
                    }
                    output2[j] = 2 / (1 + (Math.pow(Math.E, (-0.014 * input2[j])))) - 1; // calculating output of hidden layer node j (layer 2)
                }
                for (int j = 0; j < 16; j++) {
                    input3[j] = 0;
                    for (int i = 0; i < 250; i++) {
                        input3[j] = input3[j] + output2[i] * edge2[i][j] + 30; // Calculating input for layer 3 for node j
                    }
                    output3[j] = 2 / (1 + (Math.pow(Math.E, (-0.014 * input3[j])))) - 1; // calculating output of layer 3 (output layer) node j
                }

                // error computation
                for (int i = 0; i < 16; i++) {
                    error2[i] = output3[i] * (1 - output3[i]) * (required[i] - output3[i]); // error at output layer
                }
                /*
                  Here is where I'm getting the problem: the output of the output layer nodes
                  always comes out as 1 or a value very close to 1, so error2 for all nodes comes out as 0.
                  So the weights are also not updating, as error2 is 0, and thus the network is not working.
                  Help !!
                */
                double temp = 0.0;
                for (int i = 0; i < 250; i++) {
                    for (int j = 0; j < 16; j++)
                        temp = temp + edge2[i][j] * error2[j];
                    error1[i] = output2[i] * (1 - output2[i]) * temp; // error at middle (1 hidden) layer
                }

                // weight update
                for (int j = 0; j < 250; j++)
                    for (int i = 0; i < 150; i++) {
                        edge1[i][j] = edge1[i][j] + 150 * error1[j] * input1[i]; // weight update for edges input to hidden layer
                    }
                for (int j = 0; j < 16; j++)
                    for (int i = 0; i < 250; i++)
                        edge2[i][j] = edge2[i][j] + 150 * error2[j] * output2[i]; // weight update for edges hidden to output layer

                for (int i = 0; i < 16; i++) {
                    if (error2[i] > 0)
                        error = error + error2[i];
                    else
                        error = error - error2[i];
                }
            }
            if (error < 0)
                error *= -1;
        } while (error > 0.0002); // Repeat while the total error is above the minimum threshold value
    }
}
All of the following network parameters are taken from the CodeProject article "Unicode Optical Character Recognition":
Learning rate = 150
Sigmoid slope = 0.014
Weight bias = 30 (determined by trial and error)
Number of epochs = 300-600 (depending on the complexity of the font types)
Mean error threshold value = 0.0002 (determined by trial and error)
What is happening is that the output of every output-layer node comes out as 1, or a value very close to 1, so the error term always ends up at 0.
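To make the saturation concrete, here is a minimal sketch that evaluates the same activation, 2/(1+e^(-0.014*x))-1, for a net input of roughly the size my code produces. The class name `SaturationDemo`, its helper methods, and the guess that about half of the 150 binary inputs are set are illustrative assumptions only; the weight range [0, 500) and the bias of 30 added inside the inner loop come from Network.java.

```java
// Illustrative sketch (not part of the project): why the outputs pin to 1.
class SaturationDemo {
    // Bipolar sigmoid as used in Network.java: 2 / (1 + e^(-slope * x)) - 1
    static double bipolarSigmoid(double x, double slope) {
        return 2.0 / (1.0 + Math.exp(-slope * x)) - 1.0;
    }

    // Error term as written in Network.java: out * (1 - out) * (required - out)
    static double errorTerm(double out, double required) {
        return out * (1.0 - out) * (required - out);
    }

    // Rough net input of one hidden node: the bias is added once per input
    // (as in the inner loop), and each active input contributes a mean weight.
    static double typicalNetInput(int inputs, int activeInputs, double meanWeight, double bias) {
        return activeInputs * meanWeight + inputs * bias;
    }

    public static void main(String[] args) {
        // Assumption: 75 of 150 pixels are set; the mean of Math.random()*500 is 250.
        double net = typicalNetInput(150, 75, 250.0, 30.0);
        double out = bipolarSigmoid(net, 0.014);
        System.out.println("net input  = " + net);
        System.out.println("output     = " + out);
        System.out.println("error term = " + errorTerm(out, 0.0));
    }
}
```

Even with no pixel set, the bias alone contributes 150 * 30 = 4500 to the net input, and 0.014 * 4500 = 63 is already far enough into the flat part of the curve that the output rounds to exactly 1.0 in double precision, which makes the factor (1 - out) zero.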
I cannot figure out how to choose the initial weight values so that the network actually trains.
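One option I am considering is small, zero-centred random weights instead of Math.random()*500. The sketch below is only that, a sketch: the class `WeightInit`, the method `uniform`, and the range of 0.5 are my own assumptions (a commonly used heuristic), not values from the CodeProject article.

```java
import java.util.Random;

// Illustrative sketch: small symmetric initial weights for edge1 and edge2.
class WeightInit {
    // Fills a rows x cols matrix with values drawn uniformly from [-range, range).
    static double[][] uniform(int rows, int cols, double range, Random rng) {
        double[][] w = new double[rows][cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                w[i][j] = (rng.nextDouble() * 2.0 - 1.0) * range;
        return w;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so runs are repeatable
        double[][] edge1 = uniform(150, 250, 0.5, rng); // input -> hidden
        double[][] edge2 = uniform(250, 16, 0.5, rng);  // hidden -> output
        System.out.println("edge1[0][0] = " + edge1[0][0]);
        System.out.println("edge2[0][0] = " + edge2[0][0]);
    }
}
```

With weights on this scale the weighted sums start near zero, where the sigmoid still has a usable slope, rather than deep in its flat region.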
Maybe the network parameters also have to be changed.
This is also the first time I am writing neural network code, so I have no idea whether this is the right way to implement a network.
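One detail I am unsure about is the derivative factor in the error terms. My code uses out*(1-out), which is the derivative of the logistic sigmoid 1/(1+e^(-x)), while the activation I actually compute is the bipolar form f(x) = 2/(1+e^(-k*x))-1, whose derivative works out to (k/2)*(1+f)*(1-f). The sketch below checks that expression against a finite difference; the class and method names are hypothetical and exist only for this check.

```java
// Illustrative sketch: derivative of the bipolar sigmoid, verified numerically.
class DerivativeCheck {
    static double bipolarSigmoid(double x, double k) {
        return 2.0 / (1.0 + Math.exp(-k * x)) - 1.0;
    }

    // Analytic derivative written in terms of the output f = bipolarSigmoid(x, k).
    static double derivativeFromOutput(double f, double k) {
        return 0.5 * k * (1.0 + f) * (1.0 - f);
    }

    // Central finite difference, used only to cross-check the analytic form.
    static double numericDerivative(double x, double k, double h) {
        return (bipolarSigmoid(x + h, k) - bipolarSigmoid(x - h, k)) / (2.0 * h);
    }

    public static void main(String[] args) {
        double k = 0.014; // sigmoid slope from the parameter list
        double x = 20.0;  // arbitrary test point in the non-saturated region
        double f = bipolarSigmoid(x, k);
        System.out.println("analytic      = " + derivativeFromOutput(f, k));
        System.out.println("numeric       = " + numericDerivative(x, k, 1e-3));
        System.out.println("f * (1 - f)   = " + f * (1.0 - f)); // form used in Network.java
    }
}
```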
I am basically using the theory from the CodeProject article and writing the code in Java, and up to this point it has worked.
Any suggestions and help will be appreciated.
Other Classes:
CharMatrixs.java
package ocr;

import java.awt.*;
import java.awt.image.*;
import javax.swing.*;
import java.io.*;
import javax.imageio.ImageIO;

class CharMatrixs {
    public int[][] matrix = new int[15][10]; // Input of layer 1, 150 nodes
}
Unicode.java
package ocr;

// Training image has 89 characters
// To get unicode of each character in that sequence
class Unicode {
    java.util.ArrayList<Required> reqOP = new java.util.ArrayList<Required>(); // ArrayList for each character's 16 bit unicode
    int[] arr = {65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,
                 97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,
                 48,49,50,51,52,53,54,55,56,57,
                 127,33,35,36,37,94,38,42,40,41,45,43,91,93,123,125,92,59,39,58,44,46,47,60,62,63,64};

    Unicode() {
        for (int i = 0; i < arr.length; i++) {
            Required req = new Required();
            int j;
            for (j = 15; j >= 0; j--) {
                if (arr[i] % 2 == 0)
                    req.required_op[j] = 0.0;
                else
                    req.required_op[j] = 1.1;
                arr[i] /= 2;
                if (arr[i] == 0)
                    break;
            }
            for (--j; j >= 0; j--)
                req.required_op[j] = 0.0;
            reqOP.add(req);
        }
    }
}
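For reference, the 16-bit target pattern that Unicode.java builds can also be sketched with bit operations. The class `UnicodeBits` and the method `toBits` are hypothetical names; the sketch stores 1.0 for a set bit, which is an assumption on my part, since Unicode.java above currently writes 1.1.

```java
// Illustrative sketch: code point -> 16 target values, most significant bit first.
class UnicodeBits {
    static double[] toBits(int codePoint) {
        double[] bits = new double[16];
        // Fill from the least significant bit (index 15) up to the most significant (index 0).
        for (int j = 15; j >= 0; j--) {
            bits[j] = (codePoint & 1) == 1 ? 1.0 : 0.0;
            codePoint >>= 1; // only the local copy is shifted; the caller's value is untouched
        }
        return bits;
    }

    public static void main(String[] args) {
        double[] a = toBits('A'); // 'A' is code point 65
        StringBuilder sb = new StringBuilder();
        for (double b : a) sb.append(b == 1.0 ? '1' : '0');
        System.out.println(sb);
    }
}
```

Because the parameter is a local copy, this version does not modify the source array the way arr[i] /= 2 does in Unicode.java.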
Required.java
package ocr;

class Required {
    double[] required_op = new double[16];
}