[Bug][T2?]: REBOOT_TYPE_POWEROFF reboot causes test failures on T2 as NTP slew doesn't recover for a while #16289

@Javier-Tan

Issue Description

When the chassis is rebooted with REBOOT_TYPE_POWEROFF (power-cycled through the PDUs), the internal clock appears to get stuck at the time of the reboot. For example, if the chassis is restarted at 00:00 and takes 5 minutes to boot, the clock shows 00:00 (or slightly before) when it comes back up at 00:05. This could be caused by the RTC, but that is not confirmed.

The time slowly corrects itself through NTP slew, as designed. However, because a chassis takes a long time to reboot, the offset to correct is large, and recovery takes much longer than the 120-second wait_until timeout that the reboot function allows for this case.

The snippet in the reboot function of reboot.py that handles this:

    # some device does not have onchip clock and requires obtaining system time a little later from ntp
    # or SUP to obtain the correct time so if the uptime is less than original device time, it means it
    # is most likely due to this issue which we can wait a little more until the correct time is set in place.
    if float(dut_uptime.strftime("%s")) < float(dut_datetime.strftime("%s")):
        logger.info('DUT {} timestamp went backwards'.format(hostname))
        wait_until(120, 5, 0, positive_uptime, duthost, dut_datetime)

    dut_uptime = duthost.get_up_time()

    assert float(dut_uptime.strftime("%s")) > float(dut_datetime.strftime("%s")), "Device {} did not reboot". \
        format(hostname)

Here dut_uptime is the DUT's most recently reported startup time, and dut_datetime is the time the DUT reported before the reboot.

Any test using this reboot type should pass, unless this check is intended as a safeguard because DUTs should not behave this way. In that case the image needs to be fixed.

On the test side, we could do one of the following:
- increase the wait_until time (it would have to be very long)
- sync the datetime manually in the test after the reboot, using sudo ntpdate <NTP_SERVER_IP>
- skip this check for T2 devices
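The second option could be sketched roughly as follows. This is only an illustration under stated assumptions, not the sonic-mgmt implementation: `wait_for_clock_recovery`, `get_boot_time`, and `force_sync` are hypothetical names, and the callables are injected so the sketch runs without a DUT. In a real test, `get_boot_time` would presumably wrap `duthost.get_up_time()` and `force_sync` would run the `sudo ntpdate <NTP_SERVER_IP>` command on the DUT.

```python
import time


def wait_for_clock_recovery(get_boot_time, reference_time, force_sync,
                            timeout=120, interval=5, sleep=time.sleep):
    """Return True once the reported boot time is later than reference_time.

    get_boot_time  -- hypothetical callable returning the DUT's boot time
    reference_time -- DUT time collected before the reboot
    force_sync     -- hypothetical callable that steps the DUT clock once
    """
    # Clock is already sane: nothing to do.
    if get_boot_time() > reference_time:
        return True

    # Timestamp went backwards: step the clock instead of waiting for slew.
    force_sync()

    # Poll for up to `timeout` seconds, mirroring the existing 120 s / 5 s wait.
    waited = 0
    while waited < timeout:
        if get_boot_time() > reference_time:
            return True
        sleep(interval)
        waited += interval
    return get_boot_time() > reference_time
```

With this shape the existing assert in the reboot function could stay unchanged, since the forced sync only runs when the timestamp has already gone backwards.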

Results you see

Sample test_power_off_reboot.py failure:
Before the test fails (NTP was synced before this):

Actual time: 2025-01-02 03:51:10.744581

timedatectl:
               Local time: Thu 2025-01-02 03:51:11 UTC
           Universal time: Thu 2025-01-02 03:51:11 UTC
                 RTC time: Thu 2025-01-02 03:51:10
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: n/a
          RTC in local TZ: no

show ntp:
MGMT_VRF_CONFIG is not present.
synchronised to unspecified at stratum 4 
   time correct to within 150 ms
   polling server every 64 s
     remote           refid      st t when poll reach   delay   offset   jitter
===============================================================================
*10.150.22.222   10.221.236.34    3 u    3   64  377   0.2086  -0.3139   0.0731

After the test fails:

Actual time: 2025-01-02 03:57:48.520614

timedatectl:
               Local time: Thu 2025-01-02 03:56:44 UTC
           Universal time: Thu 2025-01-02 03:56:44 UTC
                 RTC time: Thu 2025-01-02 03:56:44
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: n/a
          RTC in local TZ: no

show ntp:
MGMT_VRF_CONFIG is not present.
synchronised to unspecified at stratum 4 
   time correct to within 65062 ms
   polling server every 64 s
     remote           refid      st t when poll reach   delay   offset   jitter
===============================================================================
*10.150.22.222   10.221.236.34    3 u   47   64   17   0.2465 64907.24  88.1042
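For scale, the offset in the peer line above (64907.24 ms) can be turned into a rough lower bound on recovery time. The sketch below assumes the DUT runs ntpd and that ntpd slews at no more than 500 ppm, which is its documented maximum slew rate; `slew_recovery_seconds` is a hypothetical helper written for this illustration.

```python
# Assumption: ntpd corrects by slewing at no more than 500 ppm,
# i.e. at most 0.5 ms of correction per second of wall-clock time.
MAX_SLEW_PPM = 500.0


def slew_recovery_seconds(offset_ms, max_slew_ppm=MAX_SLEW_PPM):
    """Return the minimum seconds needed to slew away an offset given in ms."""
    offset_s = abs(offset_ms) / 1000.0
    return offset_s / (max_slew_ppm * 1e-6)


observed_offset_ms = 64907.24  # offset column from the show ntp output
seconds = slew_recovery_seconds(observed_offset_ms)
print("{:.1f} hours".format(seconds / 3600))  # prints "36.1 hours"
```

Under that assumption, slewing alone would need about a day and a half, far beyond the 120-second wait in the reboot function.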

Results you expected to see

As stated before:

Any test using this reboot type should pass, unless this check is intended as a safeguard because DUTs should not behave this way. In that case the image needs to be fixed.

On the test side, we could do one of the following:
- increase the wait_until time (it would have to be very long)
- sync the datetime manually in the test after the reboot, using sudo ntpdate <NTP_SERVER_IP>
- skip this check for T2 devices

Is it platform specific

generic

Relevant log output

N/A

Output of show version

N/A

Attach files (if any)

N/A
