Skip to content

Conversation

xal-0
Copy link
Member

@xal-0 xal-0 commented Jul 30, 2025

Attempt to fix flakiness (https://buildkite.com/julialang/julia-master/builds/49532#01985bab-e15a-4efb-81a4-d53fdae669d4) by blocking on REPLCompletions.cache_PATH() in the tests that rely on it.

Just block on REPLCompletions.cache_PATH() instead, which takes the appropriate
lock to safely update the cache already.

Co-authored-by: Gabriel Baraldi <[email protected]>
@xal-0 xal-0 added the test This change adds or pertains to unit tests label Jul 30, 2025
@xal-0 xal-0 requested a review from gbaraldi July 30, 2025 21:37
withenv("PATH" => string(tempdir(), ":", dir)) do
s = string("repl-completio")
test_only_arm_cache_refresh()
c,r = test_scomplete(s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This defeats the purpose of this test.

Keno added a commit that referenced this pull request Jul 31, 2025
This is an alternative to #59161, attempting to fix the same observed
CI behavior. I don't think #59161 is the best way to fix it, as the
point of these tests is to make sure that REPL completions looks
up the PATH internally. Calling the path update function explicitly
defeats that somewhat. The extra synchronization here to make this
deterministic is messy, but I do think it makes the test closer to
real-world usage.

The core attempted fix here is to move the read of the PATH_ locals
inside `maybe_spawn_cache_PATH` into the locked region. If they
are not under the lock, they could be unconditionally overwritten
by a second call to this function, causing issues in the state machine.
I do not know whether this is the cause of the observed CI hangs, but
it's worth fixing anyway.
Keno added a commit that referenced this pull request Jul 31, 2025
This is an alternative to #59161, attempting to fix the same observed
CI behavior. I don't think #59161 is the best way to fix it, as the
point of these tests is to make sure that REPL completions looks
up the PATH internally. Calling the path update function explicitly
defeats that somewhat. The extra synchronization here to make this
deterministic is messy, but I do think it makes the test closer to
real-world usage.

The core attempted fix here is to move the read of the PATH_ locals
inside `maybe_spawn_cache_PATH` into the locked region. If they
are not under the lock, they could be unconditionally overwritten
by a second call to this function, causing issues in the state machine.
I do not know whether this is the cause of the observed CI hangs, but
it's worth fixing anyway.
@Keno
Copy link
Member

Keno commented Jul 31, 2025

#59166 for my preferred alternative

Keno added a commit that referenced this pull request Jul 31, 2025
This is an alternative to #59161, attempting to fix the same observed CI
behavior. I don't think #59161 is the best way to fix it, as the point
of these tests is to make sure that REPL completions looks up the PATH
internally. Calling the path update function explicitly defeats that
somewhat. The extra synchronization here to make this deterministic is
messy, but I do think it makes the test closer to real-world usage.

The core attempted fix here is to move the read of the PATH_ locals
inside `maybe_spawn_cache_PATH` into the locked region. If they are not
under the lock, they could be unconditionally overwritten by a second
call to this function, causing issues in the state machine. I do not
know whether this is the cause of the observed CI hangs, but it's worth
fixing anyway.
@xal-0 xal-0 closed this Jul 31, 2025
KristofferC pushed a commit that referenced this pull request Sep 5, 2025
This is an alternative to #59161, attempting to fix the same observed CI
behavior. I don't think #59161 is the best way to fix it, as the point
of these tests is to make sure that REPL completions looks up the PATH
internally. Calling the path update function explicitly defeats that
somewhat. The extra synchronization here to make this deterministic is
messy, but I do think it makes the test closer to real-world usage.

The core attempted fix here is to move the read of the PATH_ locals
inside `maybe_spawn_cache_PATH` into the locked region. If they are not
under the lock, they could be unconditionally overwritten by a second
call to this function, causing issues in the state machine. I do not
know whether this is the cause of the observed CI hangs, but it's worth
fixing anyway.

(cherry picked from commit 9a16119)
KristofferC pushed a commit that referenced this pull request Sep 15, 2025
This is an alternative to #59161, attempting to fix the same observed CI
behavior. I don't think #59161 is the best way to fix it, as the point
of these tests is to make sure that REPL completions looks up the PATH
internally. Calling the path update function explicitly defeats that
somewhat. The extra synchronization here to make this deterministic is
messy, but I do think it makes the test closer to real-world usage.

The core attempted fix here is to move the read of the PATH_ locals
inside `maybe_spawn_cache_PATH` into the locked region. If they are not
under the lock, they could be unconditionally overwritten by a second
call to this function, causing issues in the state machine. I do not
know whether this is the cause of the observed CI hangs, but it's worth
fixing anyway.

(cherry picked from commit 9a16119)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test This change adds or pertains to unit tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants