Improve reliability of scheduler tests #304

edwintorok · 2020-02-17T15:25:10Z

We got a failure in Jenkins:

ASSERT one_shot_success
--------------------------------------------------------------------------------
[failure] Error one_shot_success: expecting
true, got
false.

The test schedules a job for 1 second in the future and then checks that
it took (0.99, 2.01) s to run it.
But neither the scheduler nor Thread.delay can make such a guarantee.
They can only ensure that the job runs at least 1 s in the future.
If the system is busy it can take 2 s, 30 s, 10 min, or longer: the kernel
provides no upper bound, because this is not a real-time operating system.

Replace the strict check with a timeout-based one instead: wait up to a
maximum allowed time, and then check how soon the callback actually got
invoked (i.e. that it didn't run too early).
We need the maximum allowed time because we don't want the test to be
stuck forever, but we also want the test to stop as soon as the callback
is executed, so that it doesn't needlessly wait a minute on each test if
the scheduler actually finished in 1 s.
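
The timeout-based check described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `schedule_after`, `wait_for_callback`, `one_shot_check`, the 10 ms polling interval, and the `tolerance` parameter are all hypothetical names and values chosen for illustration, and `schedule_after` is only a stand-in for the real scheduler.

```ocaml
(* Build with the threads.posix and unix libraries, e.g.
   ocamlfind ocamlopt -package threads.posix,unix -linkpkg sketch.ml *)

(* Hypothetical stand-in for the scheduler under test: runs [f] on a new
   thread no earlier than [delay] seconds from now. *)
let schedule_after delay f =
  ignore (Thread.create (fun () -> Thread.delay delay; f ()) ())

(* Poll until [fired] holds a timestamp or [max_wait] seconds have passed
   since [start]. Returns how long after [start] the callback ran, or None
   if the deadline passed first. *)
let wait_for_callback ~start ~max_wait ~lock fired =
  let deadline = start +. max_wait in
  let rec loop () =
    Mutex.lock lock;
    let v = !fired in
    Mutex.unlock lock;
    match v with
    | Some t -> Some (t -. start)
    | None when Unix.gettimeofday () >= deadline -> None
    | None -> Thread.delay 0.01; loop ()
  in
  loop ()

(* Only a lower bound is checked: the callback must not run before [delay]
   (minus a small [tolerance] for clock granularity, an assumption of this
   sketch). [max_wait] only stops the test from hanging forever. *)
let one_shot_check ?(tolerance = 0.01) ~delay ~max_wait () =
  let lock = Mutex.create () in
  let fired = ref None in
  let start = Unix.gettimeofday () in
  schedule_after delay (fun () ->
      let now = Unix.gettimeofday () in
      Mutex.lock lock;
      fired := Some now;
      Mutex.unlock lock);
  match wait_for_callback ~start ~max_wait ~lock fired with
  | None -> Error "callback did not run within the maximum allowed time"
  | Some elapsed when elapsed < delay -. tolerance ->
      Error "callback ran too early"
  | Some elapsed -> Ok elapsed
```

With `one_shot_check ~delay:1.0 ~max_wait:60.0 ()`, the check returns within about one polling interval of the callback running, so a healthy system finishes in roughly 1 s, while a busy system gets up to a minute before the test is declared failed.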

Signed-off-by: Edwin Török [email protected]

lindig

Less code and more reliable.

lib_test/scheduler_test.ml


edwintorok requested a review from lindig February 17, 2020 15:25

lindig approved these changes Feb 17, 2020


lippirk reviewed Feb 17, 2020


lib_test/scheduler_test.ml (outdated, resolved)

psafont approved these changes Feb 20, 2020


edwintorok force-pushed the private/edvint/scheduler branch from 3eaa923 to 8b97c21 Compare February 20, 2020 11:43

psafont merged commit eec30b6 into xapi-project:master Feb 26, 2020
