
[Bug]: TypeError: '>' not supported between instances of 'int' and 'NoneType' when using timeouts #1593

@karcaw


Bug Description

If I run some Slurm specs with:

buildtest build --limit 25 --timeout 3600 -t jobs

and I have not specified a maxpendtime anywhere in my configs, and the job is still pending at the first polling interval, I get this message and buildtest exits:

TypeError: '>' not supported between instances of 'int' and 'NoneType' 

If I remove --timeout 3600 or specify a maxpendtime, the code works correctly.

Steps to reproduce the error

Run a Slurm (or possibly another scheduler) spec with a timeout and no maxpendtime value set, on a system where the job will sit in the pending state for at least one polling interval.

Since maxpendtime defaults to None, the code fails when it tries to compare it to the pending time:

# if job state in PENDING check if we need to cancel job by checking internal timer
if builder.job.is_pending() or builder.job.is_suspended():
    self.logger.debug(f"Time Duration: {builder.duration}")
    self.logger.debug(f"Max Pend Time: {self.maxpendtime}")
    # if timer exceeds 'maxpendtime' then cancel job
    if int(builder.timer.duration()) > self.maxpendtime:
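
The failure can be reproduced outside buildtest with a minimal sketch. The helper `should_cancel_pending` below is hypothetical and not part of buildtest. Treating a maxpendtime of None as "no pending limit" is an assumption about the intended behavior, not the project's confirmed fix.

```python
from typing import Optional


def should_cancel_pending(elapsed: float, maxpendtime: Optional[int]) -> bool:
    """Return True when a pending job has waited longer than maxpendtime.

    Hypothetical helper: a maxpendtime of None is treated as "no limit",
    which is an assumption rather than confirmed buildtest behavior.
    """
    if maxpendtime is None:
        return False
    return int(elapsed) > maxpendtime


# Reproduce the reported error: ordering comparisons between int and None
# raise TypeError in Python 3.
try:
    int(0.0) > None
except TypeError as err:
    error_message = str(err)

print(error_message)
print(should_cancel_pending(0.0, None))   # unset limit: never cancel
print(should_cancel_pending(120.0, 60))   # limit exceeded: cancel
```

An alternative would be to give maxpendtime a numeric default when --timeout is used. Which of the two matches the intended semantics is a question for the maintainers.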

Version and HEAD commit

buildtest version 1.5
commit 4aa813c08dfef209e258adf24bdcfc68d2f029b8
Merge: dfd0416 18b8ef0
Author: Shahzeb Siddiqui <[email protected]>
Date:   Fri Aug 11 14:07:16 2023 -0400

    Merge pull request #1589 from buildtesters/fix_regtest_olcf

    Add pytest.skip to enforce NERSC tests dont run on OLCF

Relevant log output

2023-08-18 10:02:46,858 [slurm.py:132 -  poll() ] - [DEBUG] Querying JobID: '6843'  Job State by running: 'sacct -j 6843 -o State -n -X -P --clusters=sequim'
2023-08-18 10:02:46,858 [slurm.py:135 -  poll() ] - [DEBUG] JobID: '6843' job state:PENDING
2023-08-18 10:02:46,858 [slurm.py:128 -  poll() ] - [DEBUG] Time Duration: 0
2023-08-18 10:02:46,858 [slurm.py:129 -  poll() ] - [DEBUG] Max Pend Time: None

Post question in Slack

  • I agree that I posted my question in slack before creating this issue

Is there an existing issue

  • I confirm there is no existing issue for this issue
