I've noticed that NaNs frequently appear during training. I think they are introduced when the weights in fully-connected/inner-product or convolution layers blow up. So what is the reason behind the NaNs: is it gradient computations blowing up, the nature of the input data, or weight initialization (and if it is initialization, why does it have this much effect)? What is the most probable reason for NaNs occurring during training? And what are some methods to fight this, and when should each of them be used?

1 Answer

Best answer
There can be many causes of NaNs during training. Below are the ones I know of:

Gradient blow-up
It occurs when large gradients throw the learning process off track.
To resolve: Decrease base_lr by at least an order of magnitude. If you have several loss layers, inspect the log to find which layer is causing the gradient to blow up, then decrease the loss_weight for that specific layer.

Bad learning rate policy and params
It occurs when Caffe fails to compute a valid learning rate and gets 'inf' or 'NaN' instead.
To resolve: Fix all the parameters that affect the learning rate in the solver.prototxt file.

Faulty loss function
The computation of the loss in the loss layers may itself produce NaNs.
To resolve: Add printouts to the loss layer and debug the error.

Faulty input
It may also happen if one of your inputs contains a NaN. When the learning process hits the faulty input, the output becomes NaN.
To resolve: Rebuild the input datasets and make sure that your validation set does not contain bad image files. You can also build a simple net that only reads the input layer and runs through all the inputs; if any of them is faulty, it will produce a NaN, and you can then remove the inputs that cause it.

Hope this helps!
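For the faulty-input case, the same check can also be done outside the net. Below is a minimal sketch in Python with NumPy, assuming the inputs have already been loaded as arrays. The function name find_nonfinite_samples and the toy data are made up for illustration and are not part of Caffe.

```python
import numpy as np

def find_nonfinite_samples(samples):
    """Return the indices of samples that contain NaN or Inf values."""
    bad = []
    for i, x in enumerate(samples):
        arr = np.asarray(x, dtype=np.float64)
        # np.isfinite is False for NaN, +Inf and -Inf
        if not np.isfinite(arr).all():
            bad.append(i)
    return bad

# Hypothetical toy data: three small "images"; the last two are corrupted.
samples = [
    np.ones((2, 2)),
    np.array([[0.5, np.nan], [1.0, 2.0]]),
    np.array([[np.inf, 0.0], [0.0, 0.0]]),
]

bad_indices = find_nonfinite_samples(samples)
print(bad_indices)  # [1, 2]
```

The returned indices tell you which inputs to remove or regenerate before training again.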
