Diluk
October 11, 2021, 9:50am
1
Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc)
V100
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
Yolo_v3
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file(If have, please share here)
random_seed: 42
yolov3_config {
  big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
  mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
  small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  matching_neutral_box_iou: 0.7
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 0.8
  loss_neg_obj_weights: 100.0
  loss_class_weights: 1.0
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 16
  num_epochs: 500
  enable_qat: false
  checkpoint_interval: 10
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 1e-6
      max_learning_rate: 1e-4
      soft_start: 0.1
      annealing: 0.5
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
  pretrain_model_path: "EXPERIMENT_DIR/pretrained_resnet18/pretrained_object_detection_vresnet18/resnet_18.hdf5"
}
eval_config {
  average_precision_mode: SAMPLE
  batch_size: 8
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.001
  clustering_iou_threshold: 0.5
  top_k: 200
}
augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 1248
  output_height: 384
  output_channel: 3
  randomize_input_shape_period: 0
}
dataset_config {
  data_sources: {
    label_directory_path: "/workspace/tao-experiments/data/training/label_2"
    image_directory_path: "/workspace/tao-experiments/data/training/image_2"
  }
  include_difficult_in_training: true
  target_class_mapping {
    key: "ball"
    value: "ball"
  }
  target_class_mapping {
    key: "bottle"
    value: "bottle"
  }
  target_class_mapping {
    key: "grass"
    value: "grass"
  }
  target_class_mapping {
    key: "leaf"
    value: "leaf"
  }
  target_class_mapping {
    key: "milk-box"
    value: "milk-box"
  }
  target_class_mapping {
    key: "plastic-bag"
    value: "plastic-bag"
  }
  validation_data_sources: {
    label_directory_path: "/workspace/tao-experiments/data/val/label"
    image_directory_path: "/workspace/tao-experiments/data/val/image"
  }
}
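A later reply in this thread notes that the -d value should match the input used during training. As a rough sketch, assuming the spec keeps output_channel, output_height and output_width as one "key: value" pair per line (as in the spec above), the C,H,W triple could be pulled out like this. The function name dims_from_spec is a hypothetical helper, not part of TAO.

```shell
#!/bin/sh
# Sketch: derive a "C,H,W" string for tao-converter -d from a training spec.
# Assumption: each of the three keys appears on its own line as "key: value".
# dims_from_spec is a hypothetical helper name, not a TAO command.
dims_from_spec() {
  awk -F':' '
    /^[[:space:]]*output_channel[[:space:]]*:/ { gsub(/[[:space:]]/, "", $2); c = $2 }
    /^[[:space:]]*output_height[[:space:]]*:/  { gsub(/[[:space:]]/, "", $2); h = $2 }
    /^[[:space:]]*output_width[[:space:]]*:/   { gsub(/[[:space:]]/, "", $2); w = $2 }
    END {
      # Fail if any of the three keys was not found.
      if (c == "" || h == "" || w == "") exit 1
      printf "%s,%s,%s\n", c, h, w
    }
  ' "$1"
}

# Usage (path is a placeholder): dims_from_spec /path/to/yolo_v3_train_spec.txt
```

For the spec posted here this would print 3,384,1248, which is the value to pass as -d if the model was exported at the training resolution.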
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
Hi guys,
The following error occurred when using tao-converter to convert yolov3 weights:
I understood that the -d parameter is no longer necessary. Any ideas?
Can you check if the etlt file is available?
! tao yolo_v3 run ls $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt
Diluk
October 12, 2021, 2:44am
3
It works, thank you. But when I use tao-converter on Xavier, the following error appears:
The -d parameter should be the same as the input dimensions used during training.
This is an SSD model, whereas previously it was a yolo_v3 model.
To fix the error, please build the TRT OSS plugin. You can refer to YOLOv4 — TAO Toolkit 3.22.05 documentation.
Diluk
October 12, 2021, 6:54am
5
After building the TRT OSS plugin and replacing it as described in the documentation, I tried to convert the yolov3 model, and the following error occurred:
Sorry, I am afraid you are using a V100 instead of a Jetson device. Did you follow YOLOv4 — TAO Toolkit 3.22.05 documentation?
Diluk
October 12, 2021, 7:26am
7
There is no doubt that I am using a Jetson. I replaced TRT OSS according to that document:
According to the description above, it is a V100 machine, right?
Diluk
October 12, 2021, 7:34am
9
Yes, the V100 was used to train the model. Now I want to deploy to DeepStream on a Jetson device.
Morganh
October 12, 2021, 7:53am
10
OK, can you share the output of the command below?
$ ll /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so*
Morganh
October 12, 2021, 8:29am
12
I am afraid you did not replace the plugin correctly.
The expected output is as below.
$ ll /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so*
lrwxrwxrwx 1 root root 26 Jun 6 2020 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so -> libnvinfer_plugin.so.7.1.3*
lrwxrwxrwx 1 root root 26 Oct 12 15:12 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7 -> libnvinfer_plugin.so.7.1.3*
lrwxrwxrwx 1 root root 26 Oct 12 15:12 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.0.0 -> libnvinfer_plugin.so.7.1.3*
-rwxr-xr-x 1 root root 10009144 Oct 12 15:06 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.1.3*
Please follow step 4 of YOLOv4 — TAO Toolkit 3.0 documentation
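In the expected listing above, every libnvinfer_plugin.so* entry ends up pointing at the same real file. A minimal sketch of an automated version of that check follows, assuming GNU readlink (for -f) is available, as it normally is on the Ubuntu-based Jetson images. check_plugin_links is a hypothetical helper name, and the default directory is just the one shown in this thread.

```shell
#!/bin/sh
# Sketch: verify that all libnvinfer_plugin.so* entries in a directory
# resolve to a single real file.
# check_plugin_links is a hypothetical helper; assumes GNU `readlink -f`.
check_plugin_links() {
  dir="${1:-/usr/lib/aarch64-linux-gnu}"
  # Resolve every matching entry (including dangling links) and deduplicate.
  targets="$(for f in "$dir"/libnvinfer_plugin.so*; do
    if [ -e "$f" ] || [ -L "$f" ]; then
      readlink -f "$f"
    fi
  done | sort -u)"
  # Count the non-empty lines, i.e. the distinct resolved targets.
  count="$(printf '%s\n' "$targets" | grep -c .)"
  if [ "$count" -eq 1 ]; then
    echo "OK: all entries resolve to $targets"
  else
    echo "MISMATCH: entries resolve to $count distinct files" >&2
    return 1
  fi
}

# Usage: check_plugin_links            (checks the default directory)
#        check_plugin_links /some/dir  (checks another directory)
```

This only confirms that the links are consistent. It does not tell you whether the file they point to is the rebuilt TRT OSS plugin or the stock one, so comparing size and timestamp as in the listing is still worthwhile.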
Diluk
October 12, 2021, 8:54am
13
I followed step 4 and replaced the plugin correctly, but the error still occurs:
Morganh
October 12, 2021, 9:29am
14
Can you try the official demo etlt model file?
wget https://nvidia.box.com/shared/static/i1cer4s3ox4v8svbfkuj5js8yqm3yazo.zip -O models.zip
Then run:
./tao-converter -k nvidia_tlt -d 3,544,960 -p image_input,1x3x544x960,1x3x544x960,1x3x544x960 -o BatchedNMS -e /export/trt.fp16.engine -t fp16 -i nchw -m 8 yolov4_resnet18.etlt
Diluk
October 13, 2021, 2:34am
15
I tried the official demo etlt model file and still get the following error:
However, I successfully converted the SSD model file under the same conditions and deployed it to DeepStream.
Morganh
October 13, 2021, 2:37am
16
Please modify
-p image_input,
to
-p Input,
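The -p value used in this thread has the form input name, then minimum, optimum and maximum shapes, each written as NxCxHxW, and the input name has to match the model's input tensor (here Input rather than image_input). As a sketch only, the string could be assembled as below. build_profile_arg is a hypothetical helper, and the batch sizes default to 1 as in the commands posted here.

```shell
#!/bin/sh
# Sketch: assemble a tao-converter -p value as <name>,<min>,<opt>,<max>,
# where each shape is NxCxHxW.
# build_profile_arg is a hypothetical helper, not part of tao-converter.
# Arguments: name C H W [min_batch] [opt_batch] [max_batch]
build_profile_arg() {
  name="$1"; c="$2"; h="$3"; w="$4"
  min_n="${5:-1}"; opt_n="${6:-$min_n}"; max_n="${7:-$opt_n}"
  printf '%s,%sx%sx%sx%s,%sx%sx%sx%s,%sx%sx%sx%s\n' \
    "$name" \
    "$min_n" "$c" "$h" "$w" \
    "$opt_n" "$c" "$h" "$w" \
    "$max_n" "$c" "$h" "$w"
}

# Reproduces the -p value from the demo command in this thread.
build_profile_arg Input 3 544 960
```

Generating the string from one set of dimensions keeps -d and -p from drifting apart when the resolution changes.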
Morganh
October 13, 2021, 2:43am
18
Can you run
$ md5sum yolov4_resnet18.etlt
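To compare the downloaded file against a copy that is known to convert correctly, the checksum comparison can be scripted. This is a small sketch; verify_md5 is a hypothetical helper, and the expected checksum has to come from whoever holds the known-good copy, since no reference value is given in this thread.

```shell
#!/bin/sh
# Sketch: compare a file's MD5 checksum with an expected value.
# verify_md5 is a hypothetical helper; the expected checksum must be
# supplied by the caller (none is published in this thread).
verify_md5() {
  actual="$(md5sum "$1" | awk '{print $1}')"
  if [ "$actual" = "$2" ]; then
    echo "match: $actual"
  else
    echo "differs: got $actual, expected $2" >&2
    return 1
  fi
}

# Usage: verify_md5 yolov4_resnet18.etlt "<checksum from a known-good copy>"
```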
Morganh
October 13, 2021, 2:49am
20
Can you double check?
On my side, the engine is generated successfully on an NX.
$ ./tao-converter -k nvidia_tlt -d 3,544,960 -p Input,1x3x544x960,1x3x544x960,1x3x544x960 -o BatchedNMS -e /export/trt.fp16.engine -t fp16 -i nchw -m 8 yolov4_resnet18.etlt
Yesterday, another forum user also ran it successfully with this yolov4_resnet18.etlt.
See Error in Yolov4 engine conversion, - #41 by Morganh