Description
The xcvrd daemon, which is part of the pmon container, is crashing.
Tested on the SONiC Jenkins 327 image.
Steps to reproduce the issue:
- Load the latest SONiC image from Jenkins.
- Open a bash shell in the pmon Docker container by running "docker exec -it pmon bash".
- Run "ps ax" inside the pmon container.
- Observe that xcvrd is not running.
Describe the results you received:
root@sonic:/home/admin# docker exec -it pmon bash
root@sonic:/# ps ax
PID TTY STAT TIME COMMAND
1 pts/0 Ss+ 0:00 /usr/bin/python /usr/bin/supervisord
13 pts/0 S 0:00 python /usr/bin/supervisor-proc-exit-listener --conta
18 pts/0 Sl 0:00 /usr/sbin/rsyslogd -n -iNONE
25 pts/0 S 0:00 /usr/bin/python /usr/bin/ledd
27 pts/0 S 0:00 /usr/bin/python /usr/bin/psud
28 pts/0 S 0:00 /usr/bin/python /usr/bin/syseepromd
93 pts/1 Ss 0:00 bash
98 pts/1 R+ 0:00 ps ax
Describe the results you expected:
root@sonic:/home/admin# docker exec -it pmon bash
root@sonic:/usr/bin# ps ax
PID TTY STAT TIME COMMAND
1 pts/0 Ss+ 0:00 /usr/bin/python /usr/bin/supervisord
13 pts/0 S 0:00 python /usr/bin/supervisor-proc-exit-listener --conta
17 pts/0 Sl 0:00 /usr/sbin/rsyslogd -n -iNONE
24 pts/0 S 0:00 /usr/bin/python /usr/bin/ledd
26 pts/0 S 0:00 /usr/bin/python /usr/bin/psud
27 pts/0 S 0:00 /usr/bin/python /usr/bin/syseepromd
96 pts/1 Ss 0:00 bash
109 pts/0 Sl 0:00 /usr/bin/python /usr/bin/xcvrd
145 pts/0 S 0:00 /usr/bin/python /usr/bin/xcvrd
146 pts/1 R+ 0:00 ps ax
Additional information you deem important (e.g. issue happens only occasionally):
We have narrowed the issue down to a particular commit:
$ git show ced0f7b
commit ced0f7b
Author: Joe LeVeque [email protected]
Date: Sat Jun 27 22:57:26 2020 -0700
This commit renamed some of the transceiver info keys used in the xcvrd script.
Some places in the xcvrd script still use the old key names, so xcvrd crashes with a KeyError.
We also ran xcvrd manually. The result is below:
root@sonic:/# xcvrd
Traceback (most recent call last):
File "/usr/bin/xcvrd", line 1171, in <module>
main()
File "/usr/bin/xcvrd", line 1168, in main
xcvrd.run()
File "/usr/bin/xcvrd", line 1132, in run
self.init()
File "/usr/bin/xcvrd", line 1111, in init
post_port_sfp_dom_info_to_db(is_warm_start, self.stop_event)
File "/usr/bin/xcvrd", line 404, in post_port_sfp_dom_info_to_db
notify_media_setting(logical_port_name, transceiver_dict, app_port_tbl)
File "/usr/bin/xcvrd", line 630, in notify_media_setting
key = get_media_settings_key(physical_port, transceiver_dict)
File "/usr/bin/xcvrd", line 526, in get_media_settings_key
vendor_name_str = transceiver_dict[physical_port]['manufacturename']
KeyError: 'manufacturename'
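
To illustrate the failure mode in the traceback above, here is a minimal sketch, not the actual xcvrd code. It assumes the renamed key is `manufacturer` (we have not confirmed the new name against the commit), and both the helper `get_first_present` and the sample data are hypothetical. Indexing the dict with the old name raises the same KeyError, while a lookup that tries the new name first and falls back to the old one does not:

```python
def get_first_present(info, candidate_keys, default=None):
    """Return the value of the first key from candidate_keys found in info.

    Hypothetical helper for illustration only; it is not part of xcvrd.
    """
    for key in candidate_keys:
        if key in info:
            return info[key]
    return default


# Hypothetical sample data shaped like transceiver_dict in the traceback.
# The new key name "manufacturer" is an assumption.
transceiver_dict = {1: {"manufacturer": "ACME", "model": "QSFP-X"}}
physical_port = 1

# Indexing with the old key name reproduces the crash from the traceback.
try:
    transceiver_dict[physical_port]["manufacturename"]
    error = None
except KeyError as exc:
    error = str(exc)

# A tolerant lookup: try the assumed new name first, then the old name.
vendor_name_str = get_first_present(
    transceiver_dict[physical_port],
    ("manufacturer", "manufacturename"),
    default="",
)

print("KeyError raised for:", error)
print("vendor name:", vendor_name_str)
```

The simpler fix is probably to update the remaining old key names in get_media_settings_key to match the commit. The fallback lookup is only one option if both naming schemes need to be supported for a while.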
**Output of `show version`:**
root@sonic:/home/admin# show version
SONiC Software Version: SONiC.master.327-dd4cf912
Distribution: Debian 10.4
Kernel: 4.19.0-6-2-amd64
Build commit: dd4cf91
Build date: Mon Jun 29 16:35:50 UTC 2020
Built by: johnar@jenkins-worker-4
**Attach debug file `sudo generate_dump`:**
```
(paste your output here)
```