
WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss. #15716

@Gandalf401

Description


System information.
Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-74-generic x86_64)
TensorFlow 2.0.0, installed via Anaconda
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
NVIDIA-SMI 460.80 Driver Version: 460.80 CUDA Version: 11.2

Describe the problem.
Sometimes when I use tf.keras.layers.PReLU() with a constant initializer, the gradients do not exist, as the warning says: 'WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss.'. I am not sure why this happens, since it occurs only some of the time. Furthermore, I have tested many other layers in a variety of unusual structures, and the warning appears only with PReLU, so I suspect the problem lies with PReLU rather than with my network structure, or that there is something special about PReLU. If I am wrong, please correct me, thanks!

Describe the current behavior.
Sometimes when I use tf.keras.layers.PReLU() with a constant initializer, the gradients do not exist, as indicated by the warning 'WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss.'.

Describe the expected behavior.
Gradients for tf.keras.layers.PReLU() should always exist.
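
For reference, here is a minimal sketch (my own illustration, not from my project) showing that PReLU's alpha does receive a gradient when the layer is used on its own with negative inputs; the initializer value and input shape are assumptions for the example:

import tensorflow as tf

# PReLU's alpha should receive a gradient as long as some inputs are
# negative and every op downstream of the layer is differentiable.
layer = tf.keras.layers.PReLU(
    alpha_initializer=tf.keras.initializers.Constant(0.25))
x = tf.constant([[-1.0, 2.0, -3.0]])  # includes negative values

with tf.GradientTape() as tape:
    y = layer(x)
    loss = tf.reduce_sum(y)

grads = tape.gradient(loss, layer.trainable_variables)
print(grads)  # expected: a non-None gradient for p_re_lu/alpha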

Standalone code to reproduce the issue.
Here is a network structure, in JSON, that reproduces the bug:
{
  "input_shape": [28, 28, 1],
  "network": [
    {"name": "Squeeze", "params": {"tensor_space": 4, "dim": 3}},
    {"name": "Threshold", "params": {"threshold": 0.9880963343359307}},
    {"name": "PReLU", "params": {"init": 0.4469039064898337, "share": true}},
    {"name": "BiasAdd", "params": {"bias": 0.021902653299097977}},
    {"name": "Sqrt", "params": {}},
    {"name": "Sqrt", "params": {}},
    {"name": "Sqrt", "params": {}},
    {"name": "Ceil", "params": {}},
    {"name": "ReduceSum", "params": {"keep_dims": false, "dim": 2, "tensor_space": 4}},
    {"name": "Softmax", "params": {"dim": -1}},
    {"name": "GaussianNoise", "params": {"stddev": 0.0052523236865178015}},
    {"name": "Dense", "params": {"in_features": 28, "out_features": 10}},
    {"name": "Softmax"}
  ]
}
Where tf.keras.layers has no layer class for one of the ops above, I wrap the op in a tf.keras.layers.Layer subclass (a sketch of such a wrapper follows the snippet below). The PReLU entry itself is interpreted as:
import tensorflow as tf

# alpha, share, and input_shape come from the JSON params above;
# input_shape excludes the batch dimension.
initializer = tf.keras.initializers.Constant(alpha)
if share:
    # one alpha shared across every non-batch axis
    shared = [i + 1 for i in range(len(input_shape))]
elif len(input_shape) == 1:
    shared = None
else:
    # share alpha across all non-batch axes except the last
    shared = [i + 1 for i in range(len(input_shape) - 1)]
p_relu = tf.keras.layers.PReLU(alpha_initializer=initializer, shared_axes=shared)
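
As an example of the Layer wrapper mentioned above, here is a minimal sketch for the Ceil op; the class name and body are my own illustration, not the exact code from my project:

import tensorflow as tf

class Ceil(tf.keras.layers.Layer):
    """Wraps tf.math.ceil as a Keras layer."""
    def call(self, inputs):
        # ceil is piecewise constant; TensorFlow registers its gradient
        # as None, so no gradient flows to variables upstream of it.
        return tf.math.ceil(inputs)

print(Ceil()(tf.constant([[0.3, 1.7]])))  # -> [[1. 2.]]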

The training process is implemented with the standard 'with tf.GradientTape() as tape:' pattern, along the lines of the sketch below.
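
For concreteness, a minimal sketch of that pattern; model, optimizer, loss_fn, images, and labels are placeholder names I am assuming, not my exact code:

# Standard custom training step with tf.GradientTape.
with tf.GradientTape() as tape:
    predictions = model(images, training=True)
    loss = loss_fn(labels, predictions)
grads = tape.gradient(loss, model.trainable_variables)
# Any variable whose gradient is None here triggers the
# "Gradients do not exist" warning inside apply_gradients.
optimizer.apply_gradients(zip(grads, model.trainable_variables))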

Source code / logs.
WARNING:tensorflow:Gradients do not exist for variables ['p_re_lu/alpha:0'] when minimizing the loss.

Metadata

Labels: type:support (User is asking for help / asking an implementation question. Stack Overflow would be better suited.)
