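Hi Dane,
We are seeing a garbage-output issue: after simply connecting the monitor's HDMI cable, the display shows garbage. After that, EGL cannot be initialized.
I am not sure whether this issue is related to the desktop crash issue shown below.
We are trying to reproduce the issue with the performance governor and with power/control set to on.
In addition, we are using the X11-forwarding feature and initializing EGL with display:0, in order to get zero-copy between camera processes.
At the moment we are not sure whether the garbled desktop appears first and then causes EGL initialization on display:0 to fail, or whether initializing EGL with display:0 is what causes the garbled desktop. However, the machine on which I keep trying the inter-process camera zero-copy approach does show the garbled desktop more often. In the picture, the desktop icons are garbled; X11 forwarding was not in use at that time.
Please help analyze this.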
Just a comment on forwarding. This probably is part of the issue.
Normally, when X11 uses direct rendering, it is using the local monitor’s EDID to determine specifications and set up rendering through the local GPU. As soon as you use X11 forwarding it is the GPU at the remote end (where the monitor is, remote from the Jetson, local to your work station) performing all of the rendering work. The EDID for monitor specs is no longer from the Jetson.
One consequence of this is that if the GPU doing the rendering is not the one you expect, then the libraries and support for direct rendering are also those of the local machine; the Jetson itself is the remote end, and the Jetson might not even need all of its EGL drivers. It is true that the direct rendering functions which do not need a display still run on the Jetson, but most of what would have occurred on the Jetson now occurs on the local machine.
I highly doubt that X11 forwarding in any way allows zero copy the way you are hoping it will work. The non-rendering code could zero copy, but then you’re copying everything again over the network. Does it work if it is all local and without forwarding?
We use nvbuffer + NvBufSurfaceImport to implement zero-copy. We do not use any rendering functions; we only get the CUDA address through the EGL mapping, so I think nothing is copied over the network.
Yes, it works when everything is local and without forwarding. We no longer use X11; it works with the initialization below.
It also works over plain SSH without X11 forwarding.
Previously, when we used X11 forwarding and the desktop garbage-output issue occurred, EGL could not be initialized.
However, I am not sure whether the desktop garbage-output issue can also affect the EGL initialization logic below, which does not use X11 forwarding.
static bool
display_initialize(context_t * ctx)
{
    /* Create EGL renderer */
    ctx->renderer = NvEglRenderer::createEglRenderer("renderer0",
            ctx->cam_w, ctx->cam_h, 0, 0);
    if (!ctx->renderer)
        ERROR_RETURN("Failed to create EGL renderer");
    ctx->renderer->setFPS(ctx->fps);

    if (ctx->enable_cuda)
    {
        /* Get default EGL display */
        ctx->egl_display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
        if (ctx->egl_display == EGL_NO_DISPLAY)
            ERROR_RETURN("Failed to get EGL display connection");

        /* Init EGL display connection */
        if (!eglInitialize(ctx->egl_display, NULL, NULL))
            ERROR_RETURN("Failed to initialize EGL display connection");
    }

    return true;
}
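For the initialization shown above, the error messages do not say why EGL failed. One option is to call eglGetError() immediately after a failing eglGetDisplay() or eglInitialize() and log the symbolic error name. The following is only a minimal sketch: egl_error_name and describe_egl_failure are hypothetical helper names, not part of the NVIDIA samples, and the numeric values are the standard EGL error codes written as literals so that the helper compiles without <EGL/egl.h>.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Map a standard EGL error code (0x3000..0x300E) to its symbolic name.
// In the real application the code would come from eglGetError().
static const char *egl_error_name(int code)
{
    switch (code) {
    case 0x3000: return "EGL_SUCCESS";
    case 0x3001: return "EGL_NOT_INITIALIZED";
    case 0x3002: return "EGL_BAD_ACCESS";
    case 0x3003: return "EGL_BAD_ALLOC";
    case 0x3004: return "EGL_BAD_ATTRIBUTE";
    case 0x3005: return "EGL_BAD_CONFIG";
    case 0x3006: return "EGL_BAD_CONTEXT";
    case 0x3007: return "EGL_BAD_CURRENT_SURFACE";
    case 0x3008: return "EGL_BAD_DISPLAY";
    case 0x3009: return "EGL_BAD_MATCH";
    case 0x300A: return "EGL_BAD_NATIVE_PIXMAP";
    case 0x300B: return "EGL_BAD_NATIVE_WINDOW";
    case 0x300C: return "EGL_BAD_PARAMETER";
    case 0x300D: return "EGL_BAD_SURFACE";
    case 0x300E: return "EGL_CONTEXT_LOST";
    default:     return "UNKNOWN_EGL_ERROR";
    }
}

// Build a one-line log message for a failed EGL call (hypothetical helper).
// 'step' is the name of the call that failed, e.g. "eglInitialize".
static std::string describe_egl_failure(const char *step, int code)
{
    assert(step != nullptr);
    char buf[160];
    std::snprintf(buf, sizeof(buf), "%s failed: %s (0x%04X)",
                  step, egl_error_name(code), static_cast<unsigned>(code));
    return std::string(buf);
}
```

In display_initialize(), the result of describe_egl_failure("eglInitialize", eglGetError()) could be printed before returning, so that logs taken when the desktop is garbled and logs taken when it is healthy can be compared.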
Hi,
Please share the steps to replicate it on the developer kit so that we can set up and check. So far we have not seen the issue on the developer kit, so we would need to replicate it first.
Hi,
Either of the two ways below can reproduce this issue:
- power the device off, then on
- unplug, then replug the HDMI cable
Can you monitor “dmesg --follow”, and then show on this forum what the output is from those logs which result from the unplug and replug of HDMI?
Hi,
We tried HDMI hotplug on the AGX Thor developer kit and did not hit the issue. Do you observe it on the developer kit, or only on the custom board?
dmesg --follow shows no output when the HDMI cable is unplugged and replugged.
We have reproduced this issue about 5 times on our custom board. Because the devkit cannot use our camera, we have not done much testing on the devkit.
When the issue occurs, logging out of the system and logging back in makes it disappear.
Attached is the dmesg output from when the issue occurred.
dmesg.txt (130.7 KB)
When the issue occurs, syslog contains output like the following:
2025-10-27T15:34:04.095529+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: disconnected
2025-10-27T15:34:04.095714+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
2025-10-27T15:34:04.095754+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
2025-10-27T15:34:04.095784+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0):
2025-10-27T15:34:04.101088+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-0: disconnected
2025-10-27T15:34:04.101140+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-0: Internal DisplayPort
2025-10-27T15:34:04.101180+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-0: 2380.0 MHz maximum pixel clock
2025-10-27T15:34:04.101209+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0):
2025-10-27T15:34:04.101566+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-1: disconnected
2025-10-27T15:34:04.101609+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
2025-10-27T15:34:04.101631+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-1: 165.0 MHz maximum pixel clock
2025-10-27T15:34:04.101652+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0):
2025-10-27T15:34:04.113879+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: disconnected
2025-10-27T15:34:04.113965+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
2025-10-27T15:34:04.113993+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
2025-10-27T15:34:04.114011+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0):
2025-10-27T15:34:04.114895+08:00 tegra-ubuntu-pos mirobot-sshcli: send_command:mlk cat /sys/bus/iio/devices/iio:device0/in_temp0_raw
2025-10-27T15:34:04.123242+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (II) NVIDIA(0): Setting mode "NULL"
2025-10-27T15:34:04.126584+08:00 tegra-ubuntu-pos mirobot-sshcli: receive_result:2683
2025-10-27T15:34:04.147662+08:00 tegra-ubuntu-pos gsd-media-keys[3007]: Unable to get default sink
2025-10-27T15:34:04.205473+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-0: disconnected
2025-10-27T15:34:04.205561+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-0: Internal DisplayPort
2025-10-27T15:34:04.205646+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-0: 2380.0 MHz maximum pixel clock
2025-10-27T15:34:04.205677+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0):
2025-10-27T15:34:04.205992+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-1: disconnected
2025-10-27T15:34:04.206035+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
2025-10-27T15:34:04.206064+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-1: 165.0 MHz maximum pixel clock
2025-10-27T15:34:04.206091+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0):
2025-10-27T15:34:04.218562+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: disconnected
2025-10-27T15:34:04.218607+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
2025-10-27T15:34:04.218638+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
2025-10-27T15:34:04.218665+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0):
It seems that when the garbage-output issue occurs, gdm-x-session reports the outputs as disconnected, even though the monitor is working.
Please check the attached syslog.
syslog.txt (2.2 MB)
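To line up the connector state reported in a syslog like the one above with the moment the desktop turns to garbage, the relevant lines can be extracted programmatically. The sketch below (C++17) assumes only the line format visible in the excerpt, that is, a timestamp as the first field followed later by "NVIDIA(GPU-0): <connector>: <state>"; the "connected" state is assumed by analogy and may be formatted differently by the driver. ConnectorEvent, parse_connector_line and last_connector_states are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <sstream>
#include <string>

struct ConnectorEvent {
    std::string timestamp;  // first space-delimited field of the log line
    std::string connector;  // e.g. "DFP-2"
    std::string state;      // "connected" or "disconnected"
};

// Parse one syslog line; returns nothing for lines that do not report a
// connector state (e.g. "Internal TMDS" or pixel-clock lines).
std::optional<ConnectorEvent> parse_connector_line(
    const std::string &line, const std::string &tag = "NVIDIA(GPU-0): ")
{
    assert(!tag.empty());
    const std::size_t tag_pos = line.find(tag);
    if (tag_pos == std::string::npos)
        return std::nullopt;
    const std::size_t name_pos = tag_pos + tag.size();
    const std::size_t colon = line.find(": ", name_pos);
    if (colon == std::string::npos)
        return std::nullopt;

    std::string state = line.substr(colon + 2);
    // Drop trailing whitespace such as '\r' from files with CRLF endings.
    while (!state.empty() &&
           (state.back() == ' ' || state.back() == '\t' || state.back() == '\r'))
        state.pop_back();
    if (state != "connected" && state != "disconnected")
        return std::nullopt;

    ConnectorEvent ev;
    ev.timestamp = line.substr(0, line.find(' '));
    ev.connector = line.substr(name_pos, colon - name_pos);
    ev.state = state;
    return ev;
}

// Return the most recent state event per connector found in the log text.
std::map<std::string, ConnectorEvent> last_connector_states(const std::string &log_text)
{
    std::map<std::string, ConnectorEvent> latest;
    std::istringstream in(log_text);
    std::string line;
    while (std::getline(in, line)) {
        if (auto ev = parse_connector_line(line))
            latest[ev->connector] = *ev;
    }
    return latest;
}
```

Running last_connector_states over the full syslog and comparing the timestamps with the time the garbage appeared would show whether every DFP output was already reported as disconnected before the failure, or only afterwards.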
No dmesg implies you lack hot plug detect. This is either a physical wire failing, or else a device tree being incorrect for this carrier board (not correctly routing all of the HDMI). Is this a developer’s kit using the default firmware/device-tree? There is no way to handle the HDMI correctly during a hot plug event without detecting the event.
Is there a specific action triggering the log with content such as this?
2025-10-27T15:34:04.095529+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: disconnected
2025-10-27T15:34:04.095714+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: Internal TMDS
2025-10-27T15:34:04.095754+08:00 tegra-ubuntu-pos /usr/libexec/gdm-x-session[2448]: (--) NVIDIA(GPU-0): DFP-2: 165.0 MHz maximum pixel clock
I ask because drivers and software can set up conditions when constructed or loading, as well as when failing. The point here is to differentiate between a hardware event (unplug/replug) and a software event (normal driver messages due to state change). Because of the missing dmesg I am leaning towards a hardware or firmware issue.
Incidentally, is this an ordinary HDMI monitor without any adapters, e.g., without a VGA-to-HDMI adapter?
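On the hot plug detect question, one way to see whether the kernel notices the cable at all, independent of dmesg, is to read the DRM connector status files before and after unplugging. This is a sketch (C++17) under a stated assumption: that the display driver on this board exposes connectors in the usual DRM sysfs layout, /sys/class/drm/<connector>/status, which has not been confirmed for this platform. read_connector_status is a hypothetical helper name.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// For every entry in drm_dir that contains a readable "status" file, return
// entry name -> first line of that file (e.g. "connected", "disconnected").
// Entries without a status file (plain files, render nodes) are skipped.
std::map<std::string, std::string> read_connector_status(const fs::path &drm_dir)
{
    assert(!drm_dir.empty());
    std::map<std::string, std::string> result;
    std::error_code ec;
    // The error_code overloads keep a missing directory from throwing.
    for (fs::directory_iterator it(drm_dir, ec), end; !ec && it != end; it.increment(ec)) {
        std::ifstream in(it->path() / "status");
        std::string state;
        if (in && std::getline(in, state))
            result[it->path().filename().string()] = state;
    }
    return result;
}
```

Calling read_connector_status("/sys/class/drm") once with the cable attached and once with it removed, and comparing the two maps, would indicate whether the connector state changes. If it never changes, that would be consistent with hot plug detect not reaching the driver.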
Hi, thanks for the help.
1. Is this a developer’s kit using the default firmware/device-tree?
Yes, we use the default firmware/device tree. The devkit also shows no dmesg output when the HDMI cable is plugged in or out.
2. Is there a specific action triggering the log with content such as this?
There was no specific action. I am also confused by this log; maybe the HDMI cable was not seated tightly.
3. Is this an ordinary HDMI monitor without any adapters, e.g., without a VGA-to-HDMI adapter?
We use an ordinary HDMI monitor without any adapters.
Sorry, I attached the wrong syslog; please ignore the syslog above.
I will reproduce the issue and provide the correct log, with the exact time at which the issue occurs.
If possible have a serial console on another computer running. Then save that log when there is a failure.
Hi,
We would need a way to replicate it on the developer kit; otherwise it is difficult for us to debug further. The issue is about video output and should not be related to camera input. It would be great if you could try to reproduce it on the developer kit.
Which serial console do you want me to connect? I tried the devkit serial console; when the HDMI cable is plugged in or out, nothing is output on the serial console.
I cannot reproduce this issue reliably. If we do reproduce it, which logs do you want us to capture?
Serial console is no different than regular console when monitoring logs. One of several reasons people might use this would include the fact that it can run and show logs while the desktop shows garbage. So any logs occurring as that happens are also relevant, it isn’t just during unplug/replug (which does indicate hot plug detect is completely missing; this is a show stopper issue).
The other reason people often use serial console, other than working when there is no GUI or monitor, is that it retains logs after the full system has more or less crashed and burned. Any time the serial console can capture boot logs or any logs into some condition which makes the system otherwise inaccessible you’ve gained a huge advantage in figuring out what is going wrong. If lucky, then you can also run commands like “lsmod” or “dmesg” when the system is otherwise hung up, and these too go into the serial console (which can have logging turned on).
