Convolutional network, annealing and epochs #135
I have an array of doubles that comes from processed features of subjects. Right now I'm trying to run this code, but the training accuracy is off.

SgdTrainer Tr = new SgdTrainer(net)
{
LearningRate = 0.01,
BatchSize = 500,
L2Decay = 0.001,
Momentum = 0.9
};
net.AddLayer(new InputLayer(28, 28, 1));
net.AddLayer(new ConvLayer(5, 5, 8) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(2, 2) { Stride = 2 });
net.AddLayer(new ConvLayer(5, 5, 16) { Stride = 1, Pad = 2 });
net.AddLayer(new ReluLayer());
net.AddLayer(new PoolLayer(3, 3) { Stride = 3 });
net.AddLayer(new FullyConnLayer(10));
net.AddLayer(new SoftmaxLayer(10));
double[] d = new double[12 * 63];
for (int k = 0; k < 10; k++)
{
int count = 0;
for (int i = 0; i < 12; i++)
{
for (int j = 0; j < UserDATA[k].GetLength(1); j++)
{
d[count] = UserDATA[k][i, j];
count++;
}
}
var x = BuilderInstance.Volume.From(d, new Shape(12, 63, 1));
double[] z = new double[10];
for (int t = 0; t < z.Length; t++)
{
z[t] = 0.0;
}
z[k] = 1.0;
var zx = BuilderInstance.Volume.From(z, new Shape(1, 1, 10, 1));
for (int g = 0; g < Convert.ToInt32(NumberOfTrainingSteps.Text); g++)
{
Tr.Train(x, zx); // train the network, specifying that x is class zero
}
}
double[] ts = new double[12 * 63];
double[] testd = new double[12 * 63];
for (int k = 0; k < 10; k++)
{
int count = 0;
for (int i = 0; i < 12; i++)
{
for (int j = 0; j < UserDATA[k].GetLength(1); j++)
{
testd[count] = UserDATA[k][i, j];
if (k == 0)
ts[count] = UserDATA[k][i, j];
count++;
}
}
var x = BuilderInstance.Volume.From(testd, new Shape(12, 63, 1));
var prob = net.Forward(x);
TestCON.Text += "\r\n" + " " + k + " " + prob.Get(k);
TestCON.Text += "\r\n" + k + " cl 0 prob " + prob.Get(0);
}

It seems that NumberOfTrainingSteps does not give me any increase in accuracy, but that can be expected because I'm not feeding any new data to the network. Thing is, even if I do train it on other examples, nothing changes. Also, what is the BatchSize in the trainer responsible for? Also, as I understand it, the input layer size should correspond with the amount of data points I feed to the network, i.e. 28x28x1 should take no more than 784 data points?
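Regarding the last question: a volume's element count is width * height * depth, so an InputLayer(28, 28, 1) expects exactly 784 values per sample, while the array filled above holds 12 * 63 = 756. A quick sanity check (plain C# arithmetic, not ConvNetSharp API):

```csharp
using System;

class ShapeCheck
{
    static void Main()
    {
        int inputLayerSize = 28 * 28 * 1;  // what InputLayer(28, 28, 1) expects per sample
        int sampleSize = 12 * 63;          // what the code above actually provides

        Console.WriteLine(inputLayerSize); // 784
        Console.WriteLine(sampleSize);     // 756
        Console.WriteLine(inputLayerSize == sampleSize ? "shapes match" : "shape mismatch");
    }
}
```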
Can you tell me what LearningRate, L2Decay, and Momentum represent? Also, is it possible to use the same data samples to train the network? Do you have functions that mutate the weights (simulated annealing, freezing, evolutionary multidimensional optimisation) or functions that separate epochs in training of the network? Also, if I use different trainers to train the network on the same data samples, will it change anything performance-wise?
The learning rate determines the size of the steps we take to reach a (local) minimum. Basically the gradients are multiplied by the learning rate before being used to update the parameters to optimize. (see here in the code)

L1Decay and L2Decay are supposed to be used for regularization. You've made me realize that I still haven't implemented them, so these parameters are currently useless. I will get rid of them in the meantime.

Momentum is a method that helps accelerate SGD. You can look at section 4.1 of https://arxiv.org/pdf/1609.04747.pdf. (see here in the code)

The functions that mutate weights are called Trainers in ConvNetSharp: SgdTrainer / AdamTrainer for ConvNetSharp.Core and SgdTrainer / AdamTrainer for ConvNetSharp.Flow. Using different trainers will impact the performance of the network: some training algorithms are better adapted to some kinds of tasks.

I am not sure I understand "functions that separate epochs in training of the network". If you mean a function to split a data set into training / testing / validation sets, there is no such function in this library.
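For reference, the SGD-with-momentum update described above (section 4.1 of the linked paper) can be written as:

```latex
v_t = \gamma \, v_{t-1} + \eta \, \nabla_\theta J(\theta)
\theta = \theta - v_t
```

where $\gamma$ is the Momentum coefficient (0.9 in the code above), $\eta$ is the LearningRate, and $\nabla_\theta J(\theta)$ is the gradient of the loss with respect to the parameters.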
Thank you for the information! I have another question: I'm testing the network after every training step and having problems with the probability output. 100 different test samples output the same probability. As I understand it, the output should be different with every new test sample. What may cause that? It seems to me that the network forgets previous training data, or I simply can't see the errors in my code. Here is the code:

Net net = new Net();
Does the loss decrease?
The input shape you use for testing seems odd: new Shape(999 * 705, 1, 1) instead of new Shape(999, 705, 1). I'm not sure that's the source of the problem, but it would be interesting to fix that.
Also, could you try decreasing the learning rate and post a new plot of the loss? Maybe divide it by 10.
What is the value of LR? Any chance to have the full code so I can run it? I think I just need FalseSamp, TrueSamp, FalseSampTest, TrueSampTest.
Right now LR is 0.001. Here is the main code. I'm reading data from files like the one I attached. Each line in the file is a set of coordinates with a corresponding value. One file is one training sample. I've changed the architecture of the layers a bit as it gives slightly better results.

Net net = new Net();
for (int k = 0; k < 200; k++)
I had similar problems, and the net started working when I normalized the input to the (0 - 1) range.
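A minimal sketch of that kind of min-max normalization, assuming the features sit in a flat double array before being copied into a Volume (rawData is an illustrative placeholder, not a name from the thread):

```csharp
using System;
using System.Linq;

class Normalize
{
    static void Main()
    {
        // Hypothetical raw feature values as read from a data file.
        double[] rawData = { 12.5, -3.0, 47.1, 0.0 };

        double min = rawData.Min();
        double max = rawData.Max();
        double range = max - min;

        // Rescale every value into [0, 1]; guard against a constant input (range == 0).
        double[] normalized = new double[rawData.Length];
        for (int i = 0; i < rawData.Length; i++)
        {
            normalized[i] = range > 0 ? (rawData[i] - min) / range : 0.0;
        }

        Console.WriteLine(string.Join(", ", normalized));
    }
}
```

The normalized array would then be passed to BuilderInstance.Volume.From in place of the raw values.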
I'm trying to create a convolutional network. What am I doing wrong? It seems that there is no difference between training the net with a larger or smaller number of examples. Also, can you tell me what kind of training methods are used for every type of network? I'm using your framework for research purposes, and if you can give me references to papers or algorithms that you used, that would be great.