rebalance never starts #7103

@senscarlos

With Citus 11.3, I added a node and triggered a rebalance. The rebalance was scheduled correctly but never starts running, even though the job has 1 runnable task (and 10 blocked ones).

I'm using the Docker image citusdata/citus:11.3 on all nodes. The connection between the nodes works (the primary is at 10.132.0.2):

SELECT * FROM citus_get_active_worker_nodes();
 node_name  | node_port
------------+-----------
 10.132.0.4 |      5432
 10.132.0.5 |      5432
(2 rows)
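
Note that `citus_get_active_worker_nodes()` reads node metadata and does not by itself open connections between the nodes. A minimal sketch of a more direct connectivity check, assuming the `citus_check_cluster_node_health()` function is available in this Citus version:

```sql
-- Assumption: citus_check_cluster_node_health() exists in this Citus version.
-- It attempts a connection between each pair of nodes and reports the outcome.
SELECT * FROM citus_check_cluster_node_health();
```

Each returned row describes one node-to-node connection attempt; a row whose result is not true would point at a connectivity problem rather than a scheduling one.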

Command history:

staging=# SELECT * from citus_add_node('10.132.0.5', 5432);
 citus_add_node
----------------
             10
(1 row)

Time: 623.522 ms
staging=# SELECT citus_rebalance_start();
NOTICE:  Scheduled 10 moves as job 1
DETAIL:  Rebalance scheduled as background job
HINT:  To monitor progress, run: SELECT * FROM citus_rebalance_status();
 citus_rebalance_start
-----------------------
                     1
(1 row)

Time: 26.101 ms
staging=# SELECT * FROM citus_rebalance_status();
 job_id |   state   | job_type  |           description           | started_at | finished_at |                              details
--------+-----------+-----------+---------------------------------+------------+-------------+--------------------------------------------------------------------
      1 | scheduled | rebalance | Rebalance all colocation groups |            |             | {"tasks": [], "task_state_counts": {"blocked": 10, "runnable": 1}}
(1 row)

Time: 3.200 ms
staging=# SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'staging'
  AND pid <> pg_backend_pid();
 pg_terminate_backend
----------------------
 t
 t
 t
 t
 t
 t
 t
 t
(8 rows)
staging=# SELECT get_rebalance_table_shards_plan();
               get_rebalance_table_shards_plan
-------------------------------------------------------------
 (sensor_datapoint,102183,0,10.132.0.4,5432,10.132.0.5,5432)
 (sensor_datapoint,102182,0,10.132.0.2,5432,10.132.0.5,5432)
 (sensor_datapoint,102185,0,10.132.0.4,5432,10.132.0.5,5432)
 (sensor_datapoint,102184,0,10.132.0.2,5432,10.132.0.5,5432)
 (sensor_datapoint,102187,0,10.132.0.4,5432,10.132.0.5,5432)
 (sensor_datapoint,102186,0,10.132.0.2,5432,10.132.0.5,5432)
 (sensor_datapoint,102189,0,10.132.0.4,5432,10.132.0.5,5432)
 (sensor_datapoint,102188,0,10.132.0.2,5432,10.132.0.5,5432)
 (sensor_datapoint,102191,0,10.132.0.4,5432,10.132.0.5,5432)
 (sensor_datapoint,102190,0,10.132.0.2,5432,10.132.0.5,5432)
(10 rows)

Time: 4.475 ms
staging=# SELECT * from pg_dist_node;
 nodeid | groupid |  nodename  | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards
--------+---------+------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
      1 |       0 | 10.132.0.2 |     5432 | default  | t           | t        | primary  | default     | t              | t
      6 |       5 | 10.132.0.4 |     5432 | default  | t           | t        | primary  | default     | t              | t
     10 |       9 | 10.132.0.5 |     5432 | default  | t           | t        | primary  | default     | t              | t
staging=# ALTER SYSTEM SET citus.max_background_task_executors_per_node = 2;
ALTER SYSTEM
Time: 9.613 ms
staging=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

Time: 1.585 ms
staging=# SELECT * FROM citus_rebalance_status() \gx
-[ RECORD 1 ]-------------------------------------------------------------------
job_id      | 1
state       | scheduled
job_type    | rebalance
description | Rebalance all colocation groups
started_at  |
finished_at |
details     | {"tasks": [], "task_state_counts": {"blocked": 10, "runnable": 1}}

Time: 3.033 ms

I've been waiting for a long time and the job state has not changed.
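
To see which task is runnable and what the blocked ones are waiting on, the background job catalogs could be inspected directly. This is only a sketch: it assumes the `pg_dist_background_task` and `pg_dist_background_task_depend` catalog tables and the column names used here match the schema shipped with Citus 11.3. Job id 1 is the value returned by `citus_rebalance_start()` above.

```sql
-- Assumption: catalog and column names follow the Citus background job schema.
-- List every task of job 1 with its status, executor pid, and last message.
SELECT task_id, status, pid, retry_count, not_before, message, command
FROM pg_dist_background_task
WHERE job_id = 1
ORDER BY task_id;

-- Show which tasks each blocked task depends on.
SELECT task_id, depends_on
FROM pg_dist_background_task_depend
WHERE job_id = 1
ORDER BY task_id, depends_on;
```

A runnable task with an empty `pid` that never changes would suggest that nothing is picking tasks up from the queue, while a non-empty `message` would show the last error recorded for a task.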
