-
Notifications
You must be signed in to change notification settings - Fork 974
fix: fix flake in TestExecutorAutostartSkipsWhenNoProvisionersAvailable #19478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
related to usage of time.Now() in MustWaitForProvisionerAvailable and the fact that UpdateProvisionerLastSeenAt can not use a time that is further in the past than the current LastSeenAt time Signed-off-by: Callum Styan <callumstyan@gmail.com>
require.NoError(t, err, "Error getting provisioner for workspace") | ||
require.Equal(t, p.LastSeenAt.Time.UnixNano(), staleTime.UnixNano()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be mindful of the fact you can't call require
methods in a goroutine other than the main test goroutine. They all call FailNow
, and from the documentation:
FailNow must be called from the goroutine running the test or benchmark function, not from other goroutines created during the test. Calling FailNow does not stop those other goroutines.
https://pkg.go.dev/testing#F.FailNow
You can call assert
functions, however!
Signed-off-by: Callum Styan <callumstyan@gmail.com>
provisioner Signed-off-by: Callum Styan <callumstyan@gmail.com>
transaction does not go through immediately Signed-off-by: Callum Styan <callumstyan@gmail.com>
require.Eventually(t, func() bool { | ||
// Use the same logic as hasValidProvisioner but expect false | ||
provisionerDaemons, err := db.GetProvisionerDaemonsByOrganization(ctx, database.GetProvisionerDaemonsByOrganizationParams{ | ||
OrganizationID: workspace.OrganizationID, | ||
WantTags: tags, | ||
}) | ||
if err != nil { | ||
return false | ||
} | ||
|
||
// Check if NO provisioners are active (all are stale or gone) | ||
for _, pd := range provisionerDaemons { | ||
if pd.LastSeenAt.Valid { | ||
age := checkTime.Sub(pd.LastSeenAt.Time) | ||
if age <= provisionerdserver.StaleInterval { | ||
return false // Found an active provisioner, keep waiting | ||
} | ||
} | ||
} | ||
return true // No active provisioners found | ||
}, testutil.WaitMedium, testutil.IntervalFast) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest using methods in testutil/chan.go
and passing in a parent context in the method signature
The flake here had two causes:
Previously the test here was calling
coderdtest.MustWaitForProvisionersAvailable
which was usingtime.Now
rather than the next tick time like the realhasProvisionersAvailable
function does. Additionally, when usingUpdateProvisionerLastSeenAt
the underlying db query enforces that the time we're trying to setLastSeenAt
to cannot be older than the current value.I was able to reliably reproduce the flake by executing both the
UpdateProvisionerLastSeenAt
call andtickCh <- next
in their own goroutines, the former with a small sleep to reliably ensure we'd trigger the autobuild before we set theLastSeenAt
time. That's when I also noticed thatcoderdtest.MustWaitForProvisionersAvailable
was usingtime.Now
instead of the tick time. When I updated that function to take in a tick time + added a 2nd call toUpdateProvisionerLastSeenAt
to set an original non-stale time, we could then never get the test to pass because the later call to set the stale time would not actually modifyLastSeenAt
. On top of that, calling the provisioner daemons closer in the middle of the function doesn't really do anything of value in this test.The fix for the flake is to keep the go routines, ensuring there would be a flake if there was not a relevant fix, but to include the fix which is to ensure that we explicitly wait for the provisioner to be stale before passing the time to
tickCh
.