BGP graceful restart timeout err on peer if perform warm-restart on SONiC device #2958
Description
Warm-reboot testing fails on the latest master image, SONiC.HEAD.978-6d62249. While warm-reboot is performed on the SONiC device, the peer device logs the BGP_GRACEFUL_RESTART_TIMEOUT error shown below.
Below is the log observed on peer device (Arista VM):
May 30 17:58:45 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer 10.0.0.56 (AS 65100) old state Established event Closed new state Idle
May 30 17:58:45 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer fc00::71 (AS 65100) old state Established event Closed new state Idle
May 30 18:00:45 ARISTA01T1 Rib: %BGP-5-BGP_GRACEFUL_RESTART_TIMEOUT: Deleting stale routes from peer 10.0.0.56 (AS 65100)
May 30 18:00:45 ARISTA01T1 Rib: %BGP-5-BGP_GRACEFUL_RESTART_TIMEOUT: Deleting stale routes from peer fc00::71 (AS 65100)
May 30 18:01:13 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer fc00::71 (AS 65100) old state OpenConfirm event RecvKeepAlive new state Established
May 30 18:01:13 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer 10.0.0.56 (AS 65100) old state OpenConfirm event RecvKeepAlive new state Established
May 30 18:01:16 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer fc00::71 (AS 65100) old state Established event Closed new state Idle
May 30 18:01:17 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer 10.0.0.56 (AS 65100) old state Established event Closed new state Idle
May 30 18:01:26 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer fc00::71 (AS 65100) old state OpenConfirm event RecvKeepAlive new state Established
May 30 18:01:28 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer 10.0.0.56 (AS 65100) old state OpenConfirm event RecvKeepAlive new state Established
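Before Quagga was replaced with FRR, the graceful restart time was set to 240 seconds in #2754. That setting is missing from templates/frr.conf.j2, so FRR uses its default graceful restart time of 120 seconds. FRR takes longer than that to come back: in the log above, the peer deletes the stale routes 120 seconds after the sessions close (17:58:45 to 18:00:45), but the sessions are not re-established until 18:01:13, which is 148 seconds after the close. This is what causes the warm-reboot test to fail.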
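The timing can be checked directly against the peer log. The following is a minimal sketch, not part of any SONiC tooling: the function names and the `LOG_YEAR` constant are hypothetical, the year is an assumption taken from the build date (the syslog lines carry none), and the three log lines are copied from the Arista output above for peer 10.0.0.56.

```python
from datetime import datetime

# Assumption: syslog timestamps have no year; 2019 is taken from the build date.
LOG_YEAR = 2019

# Lines copied from the peer (Arista VM) log in this report.
LOG = """\
May 30 17:58:45 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer 10.0.0.56 (AS 65100) old state Established event Closed new state Idle
May 30 18:00:45 ARISTA01T1 Rib: %BGP-5-BGP_GRACEFUL_RESTART_TIMEOUT: Deleting stale routes from peer 10.0.0.56 (AS 65100)
May 30 18:01:13 ARISTA01T1 Rib: %BGP-5-ADJCHANGE: peer 10.0.0.56 (AS 65100) old state OpenConfirm event RecvKeepAlive new state Established
"""


def timestamp(line: str) -> datetime:
    """Parse the leading 'May 30 17:58:45' syslog timestamp (15 characters)."""
    return datetime.strptime(f"{LOG_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")


def first_match(lines, needle: str) -> datetime:
    """Timestamp of the first line containing `needle`."""
    return next(timestamp(line) for line in lines if needle in line)


def restart_timings(log: str, peer: str) -> dict:
    """Seconds from session close to stale-route deletion and to re-establishment."""
    lines = [line for line in log.splitlines() if f"peer {peer} " in line]
    closed = first_match(lines, "event Closed")
    stale_deleted = first_match(lines, "BGP_GRACEFUL_RESTART_TIMEOUT")
    reestablished = first_match(lines, "new state Established")
    return {
        "stale_routes_deleted_after_s": int((stale_deleted - closed).total_seconds()),
        "session_reestablished_after_s": int((reestablished - closed).total_seconds()),
    }


if __name__ == "__main__":
    print(restart_timings(LOG, "10.0.0.56"))
```

If the second value is larger than the first, the peer's restart timer expired before the session came back, which matches the route removal described in this report.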
Steps to reproduce the issue:
- Configure BGP between the SONiC device and a peer device (an Arista VM, for example).
- Perform warm-reboot on the SONiC device.
- Monitor the IP routes on the peer device.
Describe the results you received:
When the BGP_GRACEFUL_RESTART_TIMEOUT error was logged on the Arista VM, the IP routes learned from the SONiC device were removed.
Describe the results you expected:
The routes learned from the SONiC device should not be removed on the peer during warm-reboot.
Additional information you deem important (e.g. issue happens only occasionally):
Output of show version:
SONiC Software Version: SONiC.HEAD.978-6d62249
Distribution: Debian 9.9
Kernel: 4.9.0-8-2-amd64
Build commit: 6d62249
Build date: Sun May 26 13:48:25 UTC 2019
Built by: johnar@jenkins-worker-4
Attach debug file sudo generate_dump:
```
(paste your output here)
```