Skip to content

Conversation

@zeripath
Copy link
Contributor

@zeripath zeripath commented May 7, 2021

Fix #15776

Signed-off-by: Andrew Thornton [email protected]

@zeripath zeripath added this to the 1.15.0 milestone May 7, 2021
@GiteaBot GiteaBot added the lgtm/need 1 This PR needs approval from one additional maintainer to be merged. label May 7, 2021
@silverwind
Copy link
Member

silverwind commented May 7, 2021

Did not resolve it for me, here is another failure with this PR applied:

$ while true; do go test -count=1 -run TestLevelQueue ./modules/queue; done
ok    code.gitea.io/gitea/modules/queue 0.450s
ok    code.gitea.io/gitea/modules/queue 0.455s
ok    code.gitea.io/gitea/modules/queue 0.458s
ok    code.gitea.io/gitea/modules/queue 0.467s
ok    code.gitea.io/gitea/modules/queue 0.489s
2021/05/07 20:43:52 ...les/queue/manager.go:134:Add() [T] Queue Manager registered: queue-1 (QID: 1)
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:96:Run() [D] level:  Starting
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:104:Run() [T] level:  Waiting til closed
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
2021/05/07 20:43:52 .../queue/workerpool.go:251:commonRegisterWorkers() [T] WorkerPool: 1 (for queue-1) adding 1 workers with group id: 1
2021/05/07 20:43:52 .../queue/workerpool.go:93:zeroBoost() [W] WorkerPool: 1 (for queue-1) has zero workers - adding 5 temporary workers for 5m0s
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
2021/05/07 20:43:52 .../queue/workerpool.go:420:doWork() [T] Handling: 1 data, [0xc000c7a1b0]
2021/05/07 20:43:52 .../queue/workerpool.go:420:doWork() [T] Handling: 1 data, [0xc00067c180]
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:106:Run() [T] level:  Waiting til done
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 .../queue/workerpool.go:396:doWork() [T] Worker shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:109:Run() [T] level:  Waiting til cleaned
2021/05/07 20:43:52 .../queue/workerpool.go:309:CleanUp() [T] WorkerPool: 1 CleanUp
2021/05/07 20:43:52 .../queue/workerpool.go:321:CleanUp() [T] WorkerPool: 1 CleanUp Done
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:174:Terminate() [T] level:  Terminating
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:186:Terminate() [D] level:  Closing with 2 tasks left in queue
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:191:Terminate() [D] level:  Terminated
2021/05/07 20:43:52 ...les/queue/manager.go:134:Add() [T] Queue Manager registered: queue-2 (QID: 2)
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:96:Run() [D] level:  Starting
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:104:Run() [T] level:  Waiting til closed
2021/05/07 20:43:52 .../queue/workerpool.go:251:commonRegisterWorkers() [T] WorkerPool: 2 (for queue-2) adding 1 workers with group id: 1
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:147:readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
2021/05/07 20:43:52 .../queue/workerpool.go:412:doWork() [T] Handling: 2 data, [0xc00058e378 0xc00058e5b8]
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:106:Run() [T] level:  Waiting til done
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:174:Terminate() [T] level:  Terminating
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:174:Terminate() [T] level:  Terminating
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:156:Shutdown() [T] level:  Shutting down
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:164:Shutdown() [D] level:  Shutdown
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:186:Terminate() [D] level:  Closing with 0 tasks left in queue
2021/05/07 20:43:52 ...ue/queue_bytefifo.go:191:Terminate() [D] level:  Terminated
--- FAIL: TestLevelQueue (0.70s)
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
FAIL
FAIL  code.gitea.io/gitea/modules/queue 0.742s
FAIL
ok    code.gitea.io/gitea/modules/queue 0.438s
ok    code.gitea.io/gitea/modules/queue 0.447s
ok    code.gitea.io/gitea/modules/queue 0.438s
ok    code.gitea.io/gitea/modules/queue 0.447s
ok    code.gitea.io/gitea/modules/queue 0.496s

@zeripath
Copy link
Contributor Author

zeripath commented May 7, 2021

Can you try on the wait-on-empty PR because I would hate to spend an evening fixing this if I have already fixed and prevented it there,

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath zeripath force-pushed the fix-15776-prevent-intermittent-testlevelqueue-failure branch from 169a702 to a2d3420 Compare May 7, 2021 18:54
@zeripath
Copy link
Contributor Author

zeripath commented May 7, 2021

Ah it's a boost situation. I think there's a race with zeroWorkers and addWorkers - so you may get a boost before the initial workers have started. I will look at fixing this in my wait-on-empty branch.

@silverwind
Copy link
Member

silverwind commented May 7, 2021

Fails with similar error on that branch as well, just with different logging:

failure log on main
Add() [T] Queue Manager registered: queue-1 (QID: 1)
Run() [D] level:  Starting
Run() [T] level:  Waiting til closed
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
commonRegisterWorkers() [T] WorkerPool: 1 (for queue-1) adding 1 workers with group id: 1
eroBoost() [W] WorkerPool: 1 (for queue-1) has zero workers - adding 5 temporary workers for 5m0s
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
doWork() [T] Handling: 1 data, [0xc000c7a1b0]
doWork() [T] Handling: 1 data, [0xc00067c180]
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
Run() [T] level:  Waiting til cleaned
CleanUp() [T] WorkerPool: 1 CleanUp
CleanUp() [T] WorkerPool: 1 CleanUp Done
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Terminate() [D] level:  Closing with 2 tasks left in queue
Terminate() [D] level:  Terminated
Add() [T] Queue Manager registered: queue-2 (QID: 2)
un() [D] level:  Starting
Run() [T] level:  Waiting til closed
commonRegisterWorkers() [T] WorkerPool: 2 (for queue-2) adding 1 workers with group id: 1
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
doWork() [T] Handling: 2 data, [0xc00058e378 0xc00058e5b8]
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Terminate() [D] level:  Closing with 0 tasks left in queue
Terminate() [D] level:  Terminated
--- FAIL: TestLevelQueue (0.70s)
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
    queue_disk_test.go:21:
          Error Trace:  queue_disk_test.go:21
                              workerpool.go:421
                              workerpool.go:276
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
FAIL
FAIL  code.gitea.io/gitea/modules/queue 0.742s
FAIL
failure log on wait-on-empty
Add() [T] Queue Manager registered: queue-1 (QID: 1)
Run() [D] level:  Starting
Run() [T] level:  Now running
commonRegisterWorkers() [T] WorkerPool: 1 (for queue-1) adding 1 workers with group id: 1
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
readToChan() [T] level:  Waiting on Empty
readToChan() [T] level:  Waiting on Empty
doWork() [T] Handling: 2 data, [0xc00068c0d8 0xc00068c210]
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
doWork() [T] Worker shutting down
Run() [T] level:  Waiting til cleaned
CleanUp() [T] WorkerPool: 1 CleanUp
CleanUp() [T] WorkerPool: 1 CleanUp Done
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Add() [T] Queue Manager registered: queue-2 (QID: 2)
Run() [D] level:  Starting
Run() [T] level:  Now running
readToChan() [T] level : Task found: &queue.testData{TestString:"A", TestInt:1}
commonRegisterWorkers() [T] WorkerPool: 2 (for queue-2) adding 1 workers with group id: 1
eroBoost() [W] WorkerPool: 2 (for queue-2) has zero workers - adding 5 temporary workers for 5m0s
readToChan() [T] level : Task found: &queue.testData{TestString:"B", TestInt:2}
readToChan() [T] level:  Waiting on Empty
doWork() [T] Handling: 1 data, [0xc0005ac468]
doWork() [T] Handling: 1 data, [0xc000667e90]
Shutdown() [T] level:  Shutting down
Shutdown() [T] level:  Shutting down
Shutdown() [D] level:  Shutdown
Run() [T] level:  Waiting til done
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
doWork() [T] Worker shutting down
Terminate() [T] level:  Terminating
doWork() [T] Worker shutting down
Shutdown() [T] level:  Shutting down
Terminate() [T] level:  Terminating
Shutdown() [T] level:  Shutting down
Run() [T] level:  Waiting til cleaned
Terminate() [D] level:  Closing with 0 tasks left in queue
CleanUp() [T] WorkerPool: 2 CleanUp
CleanUp() [T] WorkerPool: 2 CleanUp Done
Terminate() [D] level:  Terminated
--- FAIL: TestLevelQueue (0.59s)
    queue_disk_test.go:20:
          Error Trace:  queue_disk_test.go:20
                              workerpool.go:408
                              workerpool.go:261
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
    queue_disk_test.go:20:
          Error Trace:  queue_disk_test.go:20
                              workerpool.go:408
                              workerpool.go:261
                              asm_amd64.s:1371
          Error:        Should be true
          Test:         TestLevelQueue
FAIL
FAIL  code.gitea.io/gitea/modules/queue 0.639s
FAIL

@zeripath
Copy link
Contributor Author

zeripath commented May 7, 2021

I've just pushed up a fix to wait-on-empty

@silverwind
Copy link
Member

That last change seems to have done it, 150 runs without failure 👍

@silverwind silverwind closed this May 7, 2021
@silverwind
Copy link
Member

To clear any misunderstandings, this PR does not fix it, but #15693 does, closing this one.

@zeripath zeripath deleted the fix-15776-prevent-intermittent-testlevelqueue-failure branch May 12, 2021 19:26
@go-gitea go-gitea locked and limited conversation to collaborators Oct 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

lgtm/need 1 This PR needs approval from one additional maintainer to be merged. type/testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

queue: Flaky test TestLevelQueue

4 participants