
WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss. #15716

@Gandalf401

Description


System information.
Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-74-generic x86_64)
TensorFlow 2.0.0, installed via Anaconda
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
NVIDIA-SMI 460.80 Driver Version: 460.80 CUDA Version: 11.2

Describe the problem.
Sometimes when I use tf.keras.layers.PReLU() with a constant initializer, the gradients do not exist, as the warning says: 'WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss.'. I am not sure why this happens, since it occurs only some of the time. Furthermore, I have tested many other layers in a variety of unusual structures, and the warning appears only with PReLU, so I suspect the problem lies with PReLU rather than with my network structure, or that there is something special about PReLU. If I am wrong, please correct me, thanks!

Describe the current behavior.
Sometimes when I use tf.keras.layers.PReLU() with a constant initializer, the gradients do not exist, as indicated by the warning 'WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss.'.

Describe the expected behavior.
Gradients for tf.keras.layers.PReLU() should always exist.
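
For reference, here is a minimal sketch (my own illustration, not from my project) showing that PReLU's alpha does receive a gradient when the layer is used on its own with negative inputs; the initializer value and input shape are assumptions for the example:

import tensorflow as tf

# PReLU's alpha should receive a gradient as long as some inputs are
# negative and every op downstream of the layer is differentiable.
layer = tf.keras.layers.PReLU(
    alpha_initializer=tf.keras.initializers.Constant(0.25))
x = tf.constant([[-1.0, 2.0, -3.0]])  # includes negative values

with tf.GradientTape() as tape:
    y = layer(x)
    loss = tf.reduce_sum(y)

grads = tape.gradient(loss, layer.trainable_variables)
print(grads)  # expected: a non-None gradient for p_re_lu/alpha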

Standalone code to reproduce the issue.
Here is a network structure, in JSON, that reproduces the bug:
{
  "input_shape": [28, 28, 1],
  "network": [
    {"name": "Squeeze", "params": {"tensor_space": 4, "dim": 3}},
    {"name": "Threshold", "params": {"threshold": 0.9880963343359307}},
    {"name": "PReLU", "params": {"init": 0.4469039064898337, "share": true}},
    {"name": "BiasAdd", "params": {"bias": 0.021902653299097977}},
    {"name": "Sqrt", "params": {}},
    {"name": "Sqrt", "params": {}},
    {"name": "Sqrt", "params": {}},
    {"name": "Ceil", "params": {}},
    {"name": "ReduceSum", "params": {"keep_dims": false, "dim": 2, "tensor_space": 4}},
    {"name": "Softmax", "params": {"dim": -1}},
    {"name": "GaussianNoise", "params": {"stddev": 0.0052523236865178015}},
    {"name": "Dense", "params": {"in_features": 28, "out_features": 10}},
    {"name": "Softmax"}
  ]
}
Where tf.keras.layers has no layer class for one of the ops above, I wrap the op in a tf.keras.layers.Layer subclass (a sketch of such a wrapper follows the snippet below). The PReLU entry itself is interpreted as:
import tensorflow as tf

# alpha, share, and input_shape come from the JSON params above;
# input_shape excludes the batch dimension.
initializer = tf.keras.initializers.Constant(alpha)
if share:
    # one alpha shared across every non-batch axis
    shared = [i + 1 for i in range(len(input_shape))]
elif len(input_shape) == 1:
    shared = None
else:
    # share alpha across all non-batch axes except the last
    shared = [i + 1 for i in range(len(input_shape) - 1)]
p_relu = tf.keras.layers.PReLU(alpha_initializer=initializer, shared_axes=shared)
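
As an example of the Layer wrapper mentioned above, here is a minimal sketch for the Ceil op; the class name and body are my own illustration, not the exact code from my project:

import tensorflow as tf

class Ceil(tf.keras.layers.Layer):
    """Wraps tf.math.ceil as a Keras layer."""
    def call(self, inputs):
        # ceil is piecewise constant; TensorFlow registers its gradient
        # as None, so no gradient flows to variables upstream of it.
        return tf.math.ceil(inputs)

print(Ceil()(tf.constant([[0.3, 1.7]])))  # -> [[1. 2.]]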

The training process is implemented with the standard 'with tf.GradientTape() as tape:' pattern, along the lines of the sketch below.
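
For concreteness, a minimal sketch of that pattern; model, optimizer, loss_fn, images, and labels are placeholder names I am assuming, not my exact code:

# Standard custom training step with tf.GradientTape.
with tf.GradientTape() as tape:
    predictions = model(images, training=True)
    loss = loss_fn(labels, predictions)
grads = tape.gradient(loss, model.trainable_variables)
# Any variable whose gradient is None here triggers the
# "Gradients do not exist" warning inside apply_gradients.
optimizer.apply_gradients(zip(grads, model.trainable_variables))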

Source code / logs.
WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss.

Metadata

Labels: type:support (User is asking for help / asking an implementation question. Stack Overflow would be better suited.)
