Conversation

@alphaprinz
Contributor

@alphaprinz alphaprinz commented Dec 5, 2025

Describe the Problem

The bucket's capacity doesn't take into account data reduction, which is confusing.

Explain the Changes

We opted to remove the per-bucket capacity check and replace it with a backing store-level (physical) check.

Issues: Fixed

https://issues.redhat.com/browse/DFBUGS-3671

Testing Instructions:

  1. Create a backing store (BS), preferably a small one.
  2. Fill the BS to 80% capacity (possible with enough "s3 cp" uploads; see the sketch after this list).
  3. Verify that an alert is triggered for the BS.
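
One way to generate the fill traffic from Node.js instead of looping "s3 cp" by hand is sketched below. This is only an illustration: it assumes the AWS SDK for JavaScript v3, and the endpoint, credentials, bucket name, and object sizes are placeholders to replace with values from your test system.

    // Fill a bucket with dummy objects until the backing store reaches the target usage.
    const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');

    const s3 = new S3Client({
        endpoint: 'https://s3.example.local',        // placeholder NooBaa S3 endpoint
        region: 'us-east-1',
        forcePathStyle: true,
        credentials: { accessKeyId: 'ACCESS_KEY', secretAccessKey: 'SECRET_KEY' },
    });

    async function fill(bucket, objectCount, objectSizeBytes) {
        const payload = Buffer.alloc(objectSizeBytes, 'x');   // filler payload
        for (let i = 0; i < objectCount; i++) {
            await s3.send(new PutObjectCommand({ Bucket: bucket, Key: `fill/obj-${i}`, Body: payload }));
        }
    }

    // Example: ~80 objects of 10 MiB each is roughly 80% of a 1 GiB backing store.
    fill('test-bucket', 80, 10 * 1024 * 1024).catch(console.error);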

Note corresponding PR in operator for the rules change:
noobaa/noobaa-operator#1747

  • Doc added/updated
  • Tests added

Summary by CodeRabbit

  • New Features
    • Added backing-store capacity metrics to improve visibility of storage capacity across resources and pools.
    • Exposed an optional is_low_capacity flag in cloud pool stats so monitoring and UIs can mark resources operating in low-capacity modes.

@coderabbitai

coderabbitai bot commented Dec 5, 2025

Walkthrough

Adds low-capacity reporting: a new Prometheus Gauge backing_store_low_capacity is declared and updated from resource stats; stats aggregation marks cloud pool resources with is_low_capacity when pool mode is TIER_LOW_CAPACITY or LOW_CAPACITY; API schema exposes an optional is_low_capacity boolean in cloud pool resources.

Changes

  • Prometheus metric & resource status update (src/server/analytic_services/prometheus_reports/noobaa_core_report.js):
    Added backing_store_low_capacity Gauge (label backing_store_name) and reset/set it inside set_resource_status using each resource's is_low_capacity value.
  • Stats aggregation: low-capacity detection (src/server/system_services/stats_aggregator.js):
    Introduced LOW_CAPACITY_MODES = ['TIER_LOW_CAPACITY', 'LOW_CAPACITY'] and set is_low_capacity on pool resources when pool_info.mode is in that set (added to resource objects alongside existing health/status fields).
  • API schema: expose low-capacity flag (src/api/stats_api.js):
    Added optional boolean is_low_capacity to definitions.cloud_pool_stats.resources.items.properties so API responses may include the flag.
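
For orientation, the metric described in the first item might look roughly like the standalone sketch below, using the prom-client package directly. The actual file wraps metrics in its own declarative report structure, so the placement, the resources argument, and the resource_name field are assumptions, not the exact diff.

    const client = require('prom-client');

    // Gauge named in the summary above; one time series per backing store.
    const backingStoreLowCapacity = new client.Gauge({
        name: 'backing_store_low_capacity',
        help: 'Whether the backing store has low capacity (0=normal, 1=low)',
        labelNames: ['backing_store_name'],
    });

    // Called on each stats cycle: drop stale series, then set one sample per resource.
    function set_resource_status(resources) {
        backingStoreLowCapacity.reset();
        for (const r of resources) {
            backingStoreLowCapacity.set(
                { backing_store_name: r.resource_name },   // label value (assumed field name)
                Number(r.is_low_capacity)                  // coerce boolean to 0/1
            );
        }
    }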

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Review metric creation, label name (backing_store_name), and reset semantics in noobaa_core_report.js.
  • Verify LOW_CAPACITY_MODES covers intended modes and is_low_capacity is set for all code paths in stats_aggregator.js.
  • Confirm API schema change matches runtime payload and is non-breaking in stats_api.js.

Suggested labels

size/M

Suggested reviewers

  • nimrod-becker
  • naveenpaul1

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title clearly and specifically summarizes the main change: replacing bucket capacity monitoring with backing store capacity monitoring, as referenced by the issue identifier.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 964df01 and 1f2344b.

📒 Files selected for processing (2)
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js (2 hunks)
  • src/server/system_services/stats_aggregator.js (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-08T13:10:36.141Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1314-1317
Timestamp: 2025-08-08T13:10:36.141Z
Learning: In src/server/system_services/pool_server.js (and config usage), the constant config.INTERNAL_STORAGE_POOL_NAME has been removed from the system. Future logic should not depend on this constant and should instead use config.DEFAULT_POOL_NAME or structural markers (e.g., pool.resource_type === 'INTERNAL' or pool.mongo_info) to identify internal/mongo pools.

Applied to files:

  • src/server/system_services/stats_aggregator.js
📚 Learning: 2025-08-08T13:12:46.728Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:9-17
Timestamp: 2025-08-08T13:12:46.728Z
Learning: In upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js for noobaa-core, rely on structural detection (e.g., pool.mongo_info, and resource_type === 'INTERNAL') with name-prefix fallback for removing legacy mongo/internal pools, instead of depending solely on config.INTERNAL_STORAGE_POOL_NAME or config.DEFAULT_POOL_NAME. Handle multi-system stores and remove all matching pools in one change.

Applied to files:

  • src/server/system_services/stats_aggregator.js
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
📚 Learning: 2025-08-08T13:13:42.192Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1304-1307
Timestamp: 2025-08-08T13:13:42.192Z
Learning: In noobaa-core, do not rely on config.INTERNAL_STORAGE_POOL_NAME to detect backingstore pools. In src/server/system_services/pool_server.js, prefer checking pool.hosts_pool_info.backingstore or pool.cloud_pool_info.backingstore; fallback to prefix-matching the pool name against config.DEFAULT_POOL_NAME ('backingstores') to handle potential suffixes.

Applied to files:

  • src/server/system_services/stats_aggregator.js
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
🧬 Code graph analysis (1)
src/server/system_services/stats_aggregator.js (1)
src/server/system_services/pool_server.js (6)
  • _ (9-9)
  • _ (1175-1175)
  • _ (1203-1203)
  • pool_info (475-475)
  • pool_info (524-524)
  • pool_info (1187-1187)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: run-package-lock-validation
  • GitHub Check: Build Noobaa Image
  • GitHub Check: run-jest-unit-tests
🔇 Additional comments (4)
src/server/analytic_services/prometheus_reports/noobaa_core_report.js (2)

632-632: LGTM!

The metric reset is correctly placed alongside the resource_status reset.


635-637: Ensure backing stores are properly identified.

The metric backing_store_capacity should only track backing store pools, not all pool resources. According to best practices in the codebase, verify that get_cloud_pool_stats correctly identifies backing stores by checking pool.hosts_pool_info.backingstore or pool.cloud_pool_info.backingstore rather than relying solely on pool naming patterns or is_default_pool checks.
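
A structural check along the lines the review suggests might look like the hypothetical helper below; the field names come from this note, while the helper name, the name-prefix fallback, and where it would be called are assumptions.

    // Hypothetical helper: identify backing store pools structurally, with a
    // name-prefix fallback for default pool names that carry suffixes.
    function is_backingstore_pool(pool, default_pool_name = 'backingstores') {
        return Boolean(
            (pool.hosts_pool_info && pool.hosts_pool_info.backingstore) ||
            (pool.cloud_pool_info && pool.cloud_pool_info.backingstore) ||
            (pool.name && pool.name.startsWith(default_pool_name))
        );
    }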

src/server/system_services/stats_aggregator.js (2)

729-732: LGTM! Low capacity states appropriately separated from health states.

The LOW_CAPACITY_MODES constant correctly identifies capacity warning states. Note that these modes also appear in OPTIMAL_MODES (lines 720-728), which is intentional: a pool in low capacity is still considered healthy but requires capacity attention. This separation aligns with the PR objective to track physical capacity independently.
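
As a minimal sketch of the derivation (assuming lodash is available as _ and pool_info.mode holds the mode string, per the quoted line ranges; the helper name and resource shape are illustrative):

    const _ = require('lodash');

    // Modes that indicate a capacity warning (values from the change summary above).
    const LOW_CAPACITY_MODES = ['TIER_LOW_CAPACITY', 'LOW_CAPACITY'];

    // Attach the flag to a cloud pool resource entry built during stats aggregation.
    function with_low_capacity_flag(resource, pool_info) {
        return {
            ...resource,
            is_low_capacity: _.includes(LOW_CAPACITY_MODES, pool_info.mode),
        };
    }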


798-802: Verify that only backing store pools are included in the resources array.

The backing_store_capacity metric should only track backing store resources. Confirm that the filtering logic (skipping default pools) is sufficient to ensure only backing stores are added to the resources array. If not all non-default pools are backing stores, add explicit checks using pool.cloud_pool_info.backingstore or pool.hosts_pool_info.backingstore fields to properly identify backing store pools before including them in the metrics.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
src/server/system_services/stats_aggregator.js (1)

720-732: Clarify low‑capacity modes and consider reusing constants / explicit numeric value

The new LOW_CAPACITY_MODES subset and is_low_capacity flag look reasonable, but a couple of points are worth double‑checking:

  • The metric will only treat TIER_LOW_CAPACITY and LOW_CAPACITY as “low capacity”. If modes like TIER_NO_CAPACITY / NO_CAPACITY are supposed to trigger the same Prometheus alert, you may want to either include them here or rename the metric/flag to reflect that it’s strictly “low but not no capacity”. Please confirm this matches the alerting rules on the operator side.
  • resources.is_low_capacity ends up feeding a Prometheus Gauge. Even though JS will coerce booleans to 0/1, explicitly converting (e.g. Number(_.includes(LOW_CAPACITY_MODES, pool_info.mode))) would make the intent clearer and more robust if the Prom client ever tightens type checks.
  • Optional: we now have mode lists in a few places in this file; if they need to stay in sync with central definitions, consider hoisting shared mode sets to module‑level constants to avoid subtle drift over time.

Also applies to: 798-802

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f2344b and 8546050.

📒 Files selected for processing (2)
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js (2 hunks)
  • src/server/system_services/stats_aggregator.js (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-08T13:10:36.141Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1314-1317
Timestamp: 2025-08-08T13:10:36.141Z
Learning: In src/server/system_services/pool_server.js (and config usage), the constant config.INTERNAL_STORAGE_POOL_NAME has been removed from the system. Future logic should not depend on this constant and should instead use config.DEFAULT_POOL_NAME or structural markers (e.g., pool.resource_type === 'INTERNAL' or pool.mongo_info) to identify internal/mongo pools.

Applied to files:

  • src/server/system_services/stats_aggregator.js
📚 Learning: 2025-08-08T13:12:46.728Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:9-17
Timestamp: 2025-08-08T13:12:46.728Z
Learning: In upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js for noobaa-core, rely on structural detection (e.g., pool.mongo_info, and resource_type === 'INTERNAL') with name-prefix fallback for removing legacy mongo/internal pools, instead of depending solely on config.INTERNAL_STORAGE_POOL_NAME or config.DEFAULT_POOL_NAME. Handle multi-system stores and remove all matching pools in one change.

Applied to files:

  • src/server/system_services/stats_aggregator.js
📚 Learning: 2025-08-08T13:13:42.192Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1304-1307
Timestamp: 2025-08-08T13:13:42.192Z
Learning: In noobaa-core, do not rely on config.INTERNAL_STORAGE_POOL_NAME to detect backingstore pools. In src/server/system_services/pool_server.js, prefer checking pool.hosts_pool_info.backingstore or pool.cloud_pool_info.backingstore; fallback to prefix-matching the pool name against config.DEFAULT_POOL_NAME ('backingstores') to handle potential suffixes.

Applied to files:

  • src/server/system_services/stats_aggregator.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: run-jest-unit-tests
  • GitHub Check: Build Noobaa Image
  • GitHub Check: run-package-lock-validation

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8546050 and 10aa010.

📒 Files selected for processing (3)
  • src/api/stats_api.js (1 hunks)
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js (2 hunks)
  • src/server/system_services/stats_aggregator.js (2 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-08-08T13:10:36.141Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1314-1317
Timestamp: 2025-08-08T13:10:36.141Z
Learning: In src/server/system_services/pool_server.js (and config usage), the constant config.INTERNAL_STORAGE_POOL_NAME has been removed from the system. Future logic should not depend on this constant and should instead use config.DEFAULT_POOL_NAME or structural markers (e.g., pool.resource_type === 'INTERNAL' or pool.mongo_info) to identify internal/mongo pools.

Applied to files:

  • src/server/system_services/stats_aggregator.js
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
📚 Learning: 2025-08-08T13:12:46.728Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:9-17
Timestamp: 2025-08-08T13:12:46.728Z
Learning: In upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js for noobaa-core, rely on structural detection (e.g., pool.mongo_info, and resource_type === 'INTERNAL') with name-prefix fallback for removing legacy mongo/internal pools, instead of depending solely on config.INTERNAL_STORAGE_POOL_NAME or config.DEFAULT_POOL_NAME. Handle multi-system stores and remove all matching pools in one change.

Applied to files:

  • src/server/system_services/stats_aggregator.js
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
📚 Learning: 2025-08-08T13:13:42.192Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1304-1307
Timestamp: 2025-08-08T13:13:42.192Z
Learning: In noobaa-core, do not rely on config.INTERNAL_STORAGE_POOL_NAME to detect backingstore pools. In src/server/system_services/pool_server.js, prefer checking pool.hosts_pool_info.backingstore or pool.cloud_pool_info.backingstore; fallback to prefix-matching the pool name against config.DEFAULT_POOL_NAME ('backingstores') to handle potential suffixes.

Applied to files:

  • src/server/system_services/stats_aggregator.js
  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
📚 Learning: 2025-08-11T06:12:12.318Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:6-22
Timestamp: 2025-08-11T06:12:12.318Z
Learning: In the noobaa-core upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js, bucket migration from the internal mongo pool to a new default pool is planned to be handled in separate future PRs with comprehensive testing, rather than being included directly in the pool removal script.

Applied to files:

  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
📚 Learning: 2025-08-08T13:10:57.608Z
Learnt from: naveenpaul1
Repo: noobaa/noobaa-core PR: 9182
File: src/server/system_services/pool_server.js:1308-1311
Timestamp: 2025-08-08T13:10:57.608Z
Learning: In noobaa-core, config.INTERNAL_STORAGE_POOL_NAME has been removed. Use config.DEFAULT_POOL_NAME for internal/backingstore pool identification. Prefer exact match first, with prefix fallback to handle potential system-id suffixes in pool names.

Applied to files:

  • src/server/analytic_services/prometheus_reports/noobaa_core_report.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: run-jest-unit-tests
  • GitHub Check: run-package-lock-validation
  • GitHub Check: Build Noobaa Image
🔇 Additional comments (4)
src/api/stats_api.js (1)

450-453: LGTM!

The schema addition for is_low_capacity is well-structured and follows the existing pattern of the is_healthy field. Making it optional ensures backward compatibility.
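
The shape being described would be roughly the fragment below; the resource_name and is_healthy siblings and the exact nesting under cloud_pool_stats are assumptions based on this comment, not a copy of the file.

    // Sketch of the resources item schema with the new optional flag.
    const cloud_pool_stats_resources_schema = {
        type: 'array',
        items: {
            type: 'object',
            properties: {
                resource_name: { type: 'string' },      // assumed sibling field
                is_healthy: { type: 'boolean' },
                is_low_capacity: { type: 'boolean' },   // new flag; optional, so older payloads stay valid
            },
        },
    };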

src/server/analytic_services/prometheus_reports/noobaa_core_report.js (1)

631-638: LGTM!

The metric reset and update logic is correctly implemented. The use of Number() to convert the boolean is_low_capacity to 0/1 is appropriate for Prometheus metrics.

src/server/system_services/stats_aggregator.js (2)

729-732: LGTM!

The LOW_CAPACITY_MODES constant correctly identifies the pool modes that indicate low capacity. Note that these modes are also included in OPTIMAL_MODES (lines 720-728), which is intentional—a resource can be healthy (functional) while having low capacity that requires alerting.


798-802: LGTM!

The is_low_capacity field is correctly set based on pool mode. This integrates properly with the API schema changes and the Prometheus metric reporting in the other files.

Comment on lines +297 to +303
}, {
type: 'Gauge',
name: 'backing_store_low_capacity',
configuration: {
help: 'Does backing store in has low capacity',
labelNames: ['backing_store_name']
}

⚠️ Potential issue | 🟡 Minor

Fix grammatical error in help text.

The help text contains a grammatical error: "Does backing store in has low capacity" should be corrected to something clearer.

Apply this diff to improve the help text:

     }, {
         type: 'Gauge',
         name: 'backing_store_low_capacity',
         configuration: {
-            help: 'Does backing store in has low capacity',
+            help: 'Backing Store Low Capacity Status (0=normal, 1=low capacity)',
             labelNames: ['backing_store_name']
         }
🤖 Prompt for AI Agents
In src/server/analytic_services/prometheus_reports/noobaa_core_report.js around
lines 297 to 303, the Gauge metric's help text is grammatically incorrect ("Does
backing store in has low capacity"); update the configuration.help string to a
clear, correct phrase such as "Indicates whether the backing store has low
capacity" (or "Backing store has low capacity") so the metric description reads
properly.
