Error at train_network() #209

@wmau

Your Operating system and DeepLabCut version
Windows 7, conda virtual environment imported into PyCharm IDE, DeepLabCut v2.0.4.1.

Describe the problem
Running train_network(path_config_file) gives me the following error message after "Starting training..." prints:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
    return fn(*args)
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: resnet_v1_50/block4/unit_2/bottleneck_v1/conv1/weights/read/_1219 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_938_resnet_v1_50/block4/unit_2/bottleneck_v1/conv1/weights/read", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: pose/part_pred/block4/stack/_1347 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1430_pose/part_pred/block4/stack", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\IPython\core\interactiveshell.py", line 2847, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    dlc.train_network(path_config_file)
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\deeplabcut\pose_estimation_tensorflow\training.py", line 81, in train_network
    raise e
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\deeplabcut\pose_estimation_tensorflow\training.py", line 79, in train_network
    train(str(poseconfigfile),displayiters,saveiters,maxiters,max_to_keep=max_snapshots_to_keep) #pass on path and file name for pose_cfg.yaml!
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\deeplabcut\pose_estimation_tensorflow\train.py", line 142, in train
    feed_dict={learning_rate: current_lr})
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\tensorflow\python\client\session.py", line 900, in run
    run_metadata_ptr)
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\tensorflow\python\client\session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\tensorflow\python\client\session.py", line 1316, in _do_run
    run_metadata)
  File "C:\ProgramData\Anaconda3\envs\dlc-windowsGPU\lib\site-packages\tensorflow\python\client\session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: resnet_v1_50/block4/unit_2/bottleneck_v1/conv1/weights/read/_1219 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_938_resnet_v1_50/block4/unit_2/bottleneck_v1/conv1/weights/read", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: pose/part_pred/block4/stack/_1347 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1430_pose/part_pred/block4/stack", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

How to Reproduce the problem
Run train_network(path_config_file).
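
For reference, the call can be sketched as a standalone script. The `dlc` alias matches the traceback above. The config path is a placeholder and `run_training` is a hypothetical helper name. The `ImportError` guard and the file check are only there so the script reports a short status instead of crashing outside the DeepLabCut environment.

```python
from pathlib import Path

# Placeholder path (assumption): point this at the project's real config.yaml.
path_config_file = Path("C:/path/to/project/config.yaml")


def run_training(config_path):
    """Call train_network once and return a short status string."""
    try:
        import deeplabcut as dlc  # alias used in the traceback above
    except ImportError:
        return "deeplabcut is not installed in this environment"
    if not Path(config_path).is_file():
        return f"config file not found: {config_path}"
    # The InternalError in the traceback is raised inside this call.
    dlc.train_network(str(config_path))
    return "training finished"


if __name__ == "__main__":
    print(run_training(path_config_file))
```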

Screenshots
N/A

Additional context
GPU: NVIDIA Quadro 410, 4095 MB RAM
CUDA v9.0
cuDNN v7.3
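
"Dst tensor is not initialized" is often reported when the GPU cannot allocate memory for a tensor, although that is not confirmed as the cause here. Under that assumption, a minimal sketch for checking how much GPU memory is already in use before training could look like the following. It relies on `nvidia-smi` being on the PATH; `parse_gpu_memory` and `query_gpu_memory` are hypothetical helper names.

```python
import subprocess

# nvidia-smi query; with "nounits" the memory columns are plain MiB integers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse rows of 'name, total MiB, used MiB' into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, total, used = [field.strip() for field in line.rsplit(",", 2)]
        gpus.append({
            "name": name,
            "total_mib": int(total),
            "used_mib": int(used),
            "free_mib": int(total) - int(used),
        })
    return gpus


def query_gpu_memory():
    """Run nvidia-smi and return the parsed memory report."""
    result = subprocess.run(
        QUERY, stdout=subprocess.PIPE, universal_newlines=True, check=True
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    try:
        for gpu in query_gpu_memory():
            print(gpu)
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query nvidia-smi: {exc}")
```

If most of the memory is already taken by another process before training starts, that would be consistent with the allocation failing, but it does not prove it.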
