[Service] Allow monit system tool to monitor the critical processes status running in various SONiC containers. #3878
yozhao101 wants to merge 4 commits into sonic-net:master from yozhao101:proc_stats
…to monitor all the critical processes running in various containers. Specifically, this configuration file will let monit detect whether a process is in the dead or live state. Signed-off-by: Yong Zhao <[email protected]>
###############################################################################

# For syncd container.
check process syncd matching "/usr/bin/syncd --diag"
This file will reside in the host OS, but all of the processes you're monitoring reside in containers. To align with the containerized nature of SONiC, it makes more sense for container-specific monit configuration to reside with the container-specific files, so that the host OS configuration can remain ignorant of any and all containers.
I suggest adding a file in each container that specifies the processes to monitor in that container, then using our existing mechanism to copy that file to the base image during image compilation. If you copy all the files to /etc/monit/conf.d/, then as long as monitrc is configured to load from this directory (it is by default), monit will load all of those config files when it starts up.
Good suggestion! I got your point! Will do it. Thanks.
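A minimal sketch of the layout suggested above, assuming a hypothetical per-container file named monit_syncd that the build copies into /etc/monit/conf.d/. The file name is illustrative and not taken from this PR; the check itself is the one from the diff.

```
# /etc/monit/monitrc on the host: load every per-container file from conf.d.
include /etc/monit/conf.d/*

# /etc/monit/conf.d/monit_syncd (hypothetical file name), kept alongside the
# syncd container files and copied to the base image at build time.
check process syncd matching "/usr/bin/syncd --diag"
    if does not exist then alert
```

With this split, adding or removing a container only touches that container's own file, and the host monitrc stays unchanged.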
#if does not exist then alert

# For DHCP_Relay container.
check process dhcprelay matching "/usr/sbin/dhcrelay"
SONiC can spawn multiple instances of dhcrelay. From our discussion offline, this will only generate a log message if all of the instances stop running. It will be far more beneficial to get notified if any of these processes stop running.
if does not exist then alert

# For Teamd container.
check process teamd matching "/usr/bin/teamd -r -t"
SONiC can spawn multiple instances of teamd. From our discussion offline, this will only generate a log message if all of the instances stop running. It will be far more beneficial to get notified if any of these processes stop running.
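One possible way to address the multiple-instance concern is to give each instance its own check with a more specific pattern, so that a missing instance is reported even while the others keep running. The sketch below assumes that the device name follows `-t` on the teamd command line and that the instance names are known when the file is generated. PortChannel0001 and PortChannel0002 are hypothetical names, and this is not the approach taken in this PR.

```
# Sketch only: one check per teamd instance instead of one shared pattern.
# PortChannel0001 and PortChannel0002 are hypothetical device names.
check process teamd_PortChannel0001 matching "/usr/bin/teamd -r -t PortChannel0001"
    if does not exist then alert

check process teamd_PortChannel0002 matching "/usr/bin/teamd -r -t PortChannel0002"
    if does not exist then alert
```

The pattern is a regular expression matched against the process command line, so names that share a prefix would need a stricter pattern. The same idea would apply to the dhcrelay instances, provided their command lines differ in a way a pattern can distinguish.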
- What I did
I added a new configuration file called proc_stat that enables monit to
detect the status of all critical processes running in the various docker containers. Specifically,
monit detects whether a process is in the dead or live state. If a monitored process
dies or fails to start, monit writes an
alert to syslog. Currently monit checks all of the monitored processes once every
60 seconds. This period can be set to a different value.
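For reference, monit's polling period is set with the `set daemon` statement in its control file. A sketch of the relevant line, assuming the 60-second value described above:

```
# Poll all monitored services every 60 seconds; change the number to use a
# different period.
set daemon 60
```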
- How I did it
I created this configuration file following the format of the file named monitrc.
- How to verify it
On the host, we can list the critical processes running in each docker container using
the command docker top container_id. After that, we deliberately kill a process
using the command kill -9 proc_id. We can then review the syslog file, which should contain an
alert message showing that process_name is not running.