@aayushchouhan09
Member

@aayushchouhan09 aayushchouhan09 commented Dec 16, 2025

Explain the changes

  1. Make the NooBaaBucketLowCapacityState and NooBaaBucketNoCapacityState alert thresholds configurable via the NooBaa CR.
  2. noobaa_types.go: Added AlertThresholdsSpec with BucketLowCapacityPercent and BucketNoCapacityPercent.
  3. phase4_configuring.go: Added setDesiredPrometheusRule() to override the alert thresholds in the PrometheusRule when they are specified in the CR.
  4. The default thresholds (80% and 95%) are used when none are specified in the NooBaa CR.
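
As a rough illustration of items 2 and 4, the sketch below shows what the new spec type and its fallback to defaults might look like. The field names, pointer types, and 0-100 range come from the review discussion further down; the kubebuilder markers shown and the `effectiveThresholds` helper are assumptions made for illustration, not code taken from the PR.

```go
package main

import "fmt"

// AlertThresholdsSpec sketches the new CR section. Field names and JSON tags
// follow the review discussion; the exact markers in noobaa_types.go may differ.
type AlertThresholdsSpec struct {
	// +kubebuilder:validation:Minimum=0
	// +kubebuilder:validation:Maximum=100
	// +optional
	BucketLowCapacityPercent *int32 `json:"bucketLowCapacityPercent,omitempty"`

	// +kubebuilder:validation:Minimum=0
	// +kubebuilder:validation:Maximum=100
	// +optional
	BucketNoCapacityPercent *int32 `json:"bucketNoCapacityPercent,omitempty"`
}

// Defaults stated in the PR description.
const (
	defaultLowCapacityPercent int32 = 80
	defaultNoCapacityPercent  int32 = 95
)

// effectiveThresholds is a hypothetical helper: it returns the configured
// values, falling back to the defaults for a nil spec or nil field.
func effectiveThresholds(spec *AlertThresholdsSpec) (lowPct, noPct int32) {
	lowPct, noPct = defaultLowCapacityPercent, defaultNoCapacityPercent
	if spec == nil {
		return lowPct, noPct
	}
	if spec.BucketLowCapacityPercent != nil {
		lowPct = *spec.BucketLowCapacityPercent
	}
	if spec.BucketNoCapacityPercent != nil {
		noPct = *spec.BucketNoCapacityPercent
	}
	return lowPct, noPct
}

func main() {
	custom := int32(70)
	lowPct, noPct := effectiveThresholds(&AlertThresholdsSpec{BucketLowCapacityPercent: &custom})
	fmt.Println(lowPct, noPct) // only the low threshold is overridden
}
```

Pointer fields keep "unset" distinguishable from "set to 0", which is why the helper checks for nil rather than for a zero value.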

Issues: Fixed #xxx / Gap #xxx

  1. JIRA: https://issues.redhat.com/browse/RHSTOR-7492

Testing Instructions:

  1. Update the NooBaa CR to set BucketLowCapacityPercent and BucketNoCapacityPercent in AlertThresholdsSpec; these configure the threshold values for the alerts.
  2. Create an OBC/bucket and fill it until usage passes the threshold values set above (I made some code changes in noobaa-core to achieve this).
  3. Check that the alert is triggered in the Prometheus dashboard (after about 5 minutes):
    a. helm install prometheus prometheus-community/kube-prometheus-stack
    b. Add the label release: prometheus to the ServiceMonitors and PrometheusRules.
    c. kubectl port-forward --namespace='default' prometheus-prometheus-kube-prometheus-prometheus-0 9090
    d. Open the Prometheus dashboard in the browser at http://localhost:9090 and check for alerts.
  • Doc added/updated
  • Tests added

Summary by CodeRabbit

Release Notes

  • New Features
    • Added configurable bucket capacity alert thresholds. Users can now customize the percentage levels at which low capacity and no capacity warnings are triggered for NooBaa buckets.


…ityState and NoobaaBucketNoCapacityState

Signed-off-by: Aayush Chouhan <[email protected]>
@coderabbitai
Copy link

coderabbitai bot commented Dec 16, 2025

Walkthrough

Adds configurable Prometheus alert thresholds to the NooBaa CRD. Introduces AlertThresholds spec with BucketLowCapacityPercent and BucketNoCapacityPercent fields. Updates reconciliation logic to dynamically compute PrometheusRule expressions based on configured threshold percentages.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CRD and Type Definitions: deploy/crds/noobaa.io_noobaas.yaml, pkg/apis/noobaa/v1alpha1/noobaa_types.go | Added optional alertThresholds field to NooBaa spec with BucketLowCapacityPercent (default 80) and BucketNoCapacityPercent (default 95), both integers in the range 0-100. Introduced new AlertThresholdsSpec type with kubebuilder validation constraints. |
| Generated Code: pkg/apis/noobaa/v1alpha1/zz_generated.deepcopy.go | Added DeepCopy and DeepCopyInto methods for AlertThresholdsSpec, and extended NooBaaSpec DeepCopyInto to handle the new AlertThresholds field. |
| Bundle and Assets: pkg/bundle/deploy.go | Updated embedded CRD YAML content and SHA256 hash to reflect the new alertThresholds schema definition. |
| Reconciliation Logic: pkg/system/phase4_configuring.go | Added setDesiredPrometheusRule() helper method to compute PrometheusRule alert expressions dynamically based on AlertThresholds values. ReconcilePrometheusRule now uses this callback for state updates instead of nil. |
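
A minimal sketch of the kind of rewrite setDesiredPrometheusRule() is described as doing, assuming a simplified `rule` struct in place of the real PrometheusRule types. The metric selector `NooBaa_bucket_capacity{bucket_name=~".*"}` and the `applyThresholds` name are assumptions for illustration; the PR's actual expressions are not shown in this conversation.

```go
package main

import "fmt"

// rule is a simplified stand-in for one alert entry of a PrometheusRule.
type rule struct {
	Alert string
	Expr  string
}

// bucketCapacityMetric is an assumed selector, not confirmed by the PR.
const bucketCapacityMetric = `NooBaa_bucket_capacity{bucket_name=~".*"}`

// applyThresholds (hypothetical name) overrides the expressions of the two
// bucket capacity alerts. Nil thresholds and unknown alert names leave the
// existing expression unchanged.
func applyThresholds(rules []rule, lowPct, noPct *int32) {
	for i := range rules {
		switch rules[i].Alert {
		case "NooBaaBucketLowCapacityState":
			if lowPct != nil {
				rules[i].Expr = fmt.Sprintf("%s > %d", bucketCapacityMetric, *lowPct)
			}
		case "NooBaaBucketNoCapacityState":
			if noPct != nil {
				rules[i].Expr = fmt.Sprintf("%s > %d", bucketCapacityMetric, *noPct)
			}
		}
	}
}

func main() {
	lowPct := int32(70)
	rules := []rule{
		{Alert: "NooBaaBucketLowCapacityState", Expr: "original low expr"},
		{Alert: "NooBaaBucketNoCapacityState", Expr: "original no-capacity expr"},
	}
	applyThresholds(rules, &lowPct, nil)
	for _, r := range rules {
		fmt.Printf("%s: %s\n", r.Alert, r.Expr)
	}
}
```

Leaving the expression untouched when a threshold is nil mirrors the no-op branch in the walkthrough: the rule shipped in the bundle then keeps its default.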

Sequence Diagram

sequenceDiagram
    participant User as Operator
    participant NooBaa as NooBaa Spec
    participant Reconciler
    participant setDesiredPrometheusRule as setDesiredPrometheusRule()
    participant PrometheusRule

    User->>NooBaa: Configure alertThresholds<br/>(bucketLowCapacityPercent,<br/>bucketNoCapacityPercent)
    Note over NooBaa: AlertThresholds spec persisted
    
    Reconciler->>Reconciler: ReconcilePrometheusRule()
    Reconciler->>setDesiredPrometheusRule: Invoke callback
    
    setDesiredPrometheusRule->>NooBaa: Read AlertThresholds
    alt AlertThresholds is set
        Note over setDesiredPrometheusRule: Generate PromQL expressions<br/>from percentages
        setDesiredPrometheusRule->>PrometheusRule: Update alert rules with<br/>computed expressions
        PrometheusRule-->>setDesiredPrometheusRule: Acknowledged
    else AlertThresholds is nil
        Note over setDesiredPrometheusRule: No-op
    end
    
    setDesiredPrometheusRule-->>Reconciler: Return (status)
    Reconciler-->>User: Prometheus alerts configured

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Alert expression logic in setDesiredPrometheusRule(): Verify correct PromQL generation for both bucket capacity thresholds and boundary condition handling
  • Validation constraints (0-100 range): Confirm kubebuilder annotations are properly applied to both integer fields
  • Deep copy implementation: Ensure pointer field handling is correct for optional AlertThresholds nested in NooBaaSpec
  • Default values: Verify CRD defaults (80% and 95%) align with intended alerting behavior

Suggested labels

size/M

Suggested reviewers

  • jackyalbo
  • liranmauda
  • naveenpaul1

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: making alert thresholds for NooBaa configurable, which aligns with all file changes. |
| Description check | ✅ Passed | The description covers required sections with clear explanations, issue reference, and detailed testing instructions, though doc and test checkboxes remain unchecked. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15459f4 and 582bdf5.

📒 Files selected for processing (2)
  • pkg/bundle/deploy.go
  • pkg/system/phase4_configuring.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/bundle/deploy.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: cnpg-deployment-test
  • GitHub Check: run-cli-tests
  • GitHub Check: run-kms-tls-sa-test
  • GitHub Check: run-kms-dev-test
  • GitHub Check: golangci-lint
  • GitHub Check: run-core-config-map-tests
  • GitHub Check: run-kms-tls-token-test
  • GitHub Check: run-operator-tests
  • GitHub Check: run-kms-kmip-test
  • GitHub Check: run-kms-key-rotate-test
  • GitHub Check: run-azure-vault-test
  • GitHub Check: run-admission-test
  • GitHub Check: run-hac-test
🔇 Additional comments (2)
pkg/system/phase4_configuring.go (2)

1620-1628: LGTM: PrometheusRule reconciliation now supports dynamic configuration.

The change from passing nil to r.setDesiredPrometheusRule enables dynamic configuration of alert thresholds based on the NooBaa CR spec. This pattern is consistent with other reconciliation functions in the codebase.


1630-1655: No issues found. The implementation correctly uses valid Prometheus expressions, relies on CRD-level validation for threshold ranges (which is the appropriate pattern for Kubernetes resources), and properly handles unknown alert names by leaving them unchanged.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/bundle/deploy.go (1)

4885-4905: NooBaaBucketNoCapacityState alert may not fire for threshold=100 and text still implies “100%”

The updated rules that use scalar thresholds are generally good, but there are two subtle behavior issues:

  • The expression now uses > scalar(NooBaa_bucket_no_capacity_threshold). If the threshold metric is a 0–100 percentage and a user sets it to 100, the alert will never trigger (usage cannot exceed 100). Previously a >= 100 style check would still fire at full capacity.
  • The alert annotations still say “is using all of its capacity” / “No Capacity State”, but with a configurable threshold (default 95) this is effectively “near full capacity”. Either the expression or the text should be adjusted so they match.

Consider changing the “no capacity” expression to >= scalar(NooBaa_bucket_no_capacity_threshold) (and possibly capping the CRD max at 100) and/or rewording the annotation to “usage is above the configured no-capacity threshold”.
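
The boundary case can be illustrated with a tiny sketch that compares a strict and an inclusive check at threshold 100. The function names are hypothetical, and the sketch models only the comparison, not Prometheus itself.

```go
package main

import "fmt"

// firesStrict models an expression of the form "usage > threshold".
func firesStrict(usagePct, thresholdPct float64) bool {
	return usagePct > thresholdPct
}

// firesInclusive models an expression of the form "usage >= threshold".
func firesInclusive(usagePct, thresholdPct float64) bool {
	return usagePct >= thresholdPct
}

func main() {
	// A completely full bucket with the threshold configured to 100.
	fmt.Println(firesStrict(100, 100))    // false: usage can never exceed 100
	fmt.Println(firesInclusive(100, 100)) // true: fires at full capacity
}
```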

🧹 Nitpick comments (4)
deploy/crds/noobaa.io_noobaas.yaml (1)

989-1012: CRD schema for spec.alertThresholds matches the Go API

alertThresholds and its bucketLowCapacityPercent / bucketNoCapacityPercent fields are correctly modelled as int32 with 0–100 constraints, aligned with AlertThresholdsSpec in noobaa_types.go. The “default is 80/95” note here is purely documentation, which is fine given the actual defaulting is handled in core.

If you ever want API-level defaulting (values materialized in the CR even when omitted), you could add +kubebuilder:default tags on the Go fields so controller-gen emits default: entries in this schema, but that would then need to stay in sync with the core’s hardcoded defaults.

pkg/system/phase2_creating.go (1)

537-544: Avoid stale BUCKET_*_THRESHOLD env when AlertThresholds are unset

Right now the env vars are only updated when the pointer fields are non‑nil. If a user removes spec.alertThresholds (or one of its fields) after previously setting it, the corresponding BUCKET_*_THRESHOLD value in the pod spec will remain at the last configured value instead of reverting to the core default.

To keep pod env in sync with the CR, consider explicitly clearing the value when the field is unset, e.g.:

-    case "BUCKET_LOW_CAPACITY_THRESHOLD":
-        if r.NooBaa.Spec.AlertThresholds != nil && r.NooBaa.Spec.AlertThresholds.BucketLowCapacityPercent != nil {
-            c.Env[j].Value = fmt.Sprintf("%d", *r.NooBaa.Spec.AlertThresholds.BucketLowCapacityPercent)
-        }
-    case "BUCKET_NO_CAPACITY_THRESHOLD":
-        if r.NooBaa.Spec.AlertThresholds != nil && r.NooBaa.Spec.AlertThresholds.BucketNoCapacityPercent != nil {
-            c.Env[j].Value = fmt.Sprintf("%d", *r.NooBaa.Spec.AlertThresholds.BucketNoCapacityPercent)
-        }
+    case "BUCKET_LOW_CAPACITY_THRESHOLD":
+        if at := r.NooBaa.Spec.AlertThresholds; at != nil && at.BucketLowCapacityPercent != nil {
+            c.Env[j].Value = fmt.Sprintf("%d", *at.BucketLowCapacityPercent)
+        } else {
+            c.Env[j].Value = ""
+        }
+    case "BUCKET_NO_CAPACITY_THRESHOLD":
+        if at := r.NooBaa.Spec.AlertThresholds; at != nil && at.BucketNoCapacityPercent != nil {
+            c.Env[j].Value = fmt.Sprintf("%d", *at.BucketNoCapacityPercent)
+        } else {
+            c.Env[j].Value = ""
+        }

This keeps the env vars consistent with the CR when users remove configuration.
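
The same idea can be factored into a small helper so both cases share one code path. This is only a sketch: `thresholdEnvValue` is a hypothetical name, and whether an empty string actually makes core fall back to its default is an assumption carried over from the suggestion above.

```go
package main

import (
	"fmt"
	"strconv"
)

// thresholdEnvValue (hypothetical helper) returns the env var value for an
// optional threshold: the decimal number when set, "" when unset so that a
// previously configured value does not linger in the pod spec.
func thresholdEnvValue(pct *int32) string {
	if pct == nil {
		return ""
	}
	return strconv.FormatInt(int64(*pct), 10)
}

func main() {
	configured := int32(85)
	fmt.Printf("%q\n", thresholdEnvValue(&configured))
	fmt.Printf("%q\n", thresholdEnvValue(nil))
}
```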

pkg/apis/noobaa/v1alpha1/noobaa_types.go (1)

245-248: AlertThresholdsSpec API is well-shaped; defaults are documented but not API-enforced

The AlertThresholds pointer on NooBaaSpec and the AlertThresholdsSpec definition (pointer int32 fields with 0–100 validation) look clean and align with the CRD schema. Using pointers makes the “unset vs set to 0” distinction unambiguous.

Right now the “default is 80/95” behavior is only in comments (and implemented in core), which is perfectly valid. If you want those defaults to be visible directly in the CR when omitted, you could add:

// +kubebuilder:default=80
BucketLowCapacityPercent *int32 `json:"bucketLowCapacityPercent,omitempty"`
// +kubebuilder:default=95
BucketNoCapacityPercent *int32 `json:"bucketNoCapacityPercent,omitempty"`

and regenerate, but that’s optional and would need to stay consistent with the core implementation.

Also applies to: 314-331

pkg/bundle/deploy.go (1)

2418-2441: CRD alertThresholds schema is solid; consider explicit defaults/constraints

The new spec.alertThresholds object and its integer fields look well-placed and correctly constrained (int32, minimum: 0, maximum: 100). Two optional refinements:

  • The descriptions mention defaults (80 / 95) but the OpenAPI schema does not set default: values. If the defaults are only enforced in controller code, consider either adding default here for discoverability or dropping the explicit numbers from the description to avoid confusing API consumers.
  • Today both thresholds can be configured to any 0–100 value independently. If you expect bucketNoCapacityPercent to always be ≥ bucketLowCapacityPercent, that invariant will need to be enforced in validation logic (can’t be expressed in this schema alone).
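
If that ordering invariant is wanted, a check along these lines could run in the operator's validation path. This is a sketch under assumptions: `validateThresholdOrder` is a hypothetical function, and comparing against the 80/95 defaults when a field is unset is one possible policy, not something the PR specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// Defaults stated in the PR description.
const (
	defaultLowPercent int32 = 80
	defaultNoPercent  int32 = 95
)

var errThresholdOrder = errors.New("bucketNoCapacityPercent must be >= bucketLowCapacityPercent")

// validateThresholdOrder (hypothetical) enforces the cross-field invariant
// that the OpenAPI schema alone cannot express. Unset fields are compared
// using their default values.
func validateThresholdOrder(lowPct, noPct *int32) error {
	effLow, effNo := defaultLowPercent, defaultNoPercent
	if lowPct != nil {
		effLow = *lowPct
	}
	if noPct != nil {
		effNo = *noPct
	}
	if effNo < effLow {
		return fmt.Errorf("%w (low=%d, no=%d)", errThresholdOrder, effLow, effNo)
	}
	return nil
}

func main() {
	lowPct, noPct := int32(90), int32(85)
	fmt.Println(validateThresholdOrder(&lowPct, &noPct)) // rejected: 85 < 90
	fmt.Println(validateThresholdOrder(nil, nil))        // accepted: defaults 80/95
}
```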
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40ea3c0 and 15459f4.

📒 Files selected for processing (7)
  • deploy/crds/noobaa.io_noobaas.yaml (1 hunks)
  • deploy/internal/prometheus-rules.yaml (2 hunks)
  • deploy/internal/statefulset-core.yaml (1 hunks)
  • pkg/apis/noobaa/v1alpha1/noobaa_types.go (2 hunks)
  • pkg/apis/noobaa/v1alpha1/zz_generated.deepcopy.go (2 hunks)
  • pkg/bundle/deploy.go (7 hunks)
  • pkg/system/phase2_creating.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: naveenpaul1
Repo: noobaa/noobaa-operator PR: 1607
File: pkg/bundle/deploy.go:0-0
Timestamp: 2025-07-02T04:34:47.006Z
Learning: Default values for `METRICS_AUTH_ENABLED` and `VERSION_AUTH_ENABLED` are set up under `noobaa-config` in the NooBaa operator.
🧬 Code graph analysis (2)
pkg/apis/noobaa/v1alpha1/zz_generated.deepcopy.go (1)
pkg/apis/noobaa/v1alpha1/noobaa_types.go (1)
  • AlertThresholdsSpec (315-331)
pkg/system/phase2_creating.go (1)
pkg/apis/noobaa/v1alpha1/noobaa_types.go (1)
  • NooBaa (41-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: run-admission-test
  • GitHub Check: run-cli-tests
  • GitHub Check: run-azure-vault-test
  • GitHub Check: run-kms-dev-test
  • GitHub Check: run-hac-test
  • GitHub Check: run-operator-tests
  • GitHub Check: run-kms-key-rotate-test
  • GitHub Check: run-kms-tls-sa-test
  • GitHub Check: run-core-config-map-tests
  • GitHub Check: run-kms-kmip-test
  • GitHub Check: cnpg-deployment-test
  • GitHub Check: run-kms-tls-token-test
  • GitHub Check: golangci-lint
🔇 Additional comments (5)
deploy/internal/statefulset-core.yaml (1)

136-137: Env var stubs for bucket capacity thresholds look consistent

Adding BUCKET_LOW_CAPACITY_THRESHOLD and BUCKET_NO_CAPACITY_THRESHOLD here matches the existing pattern of defining core env vars in the template and populating values from the reconciler. No issues from the manifest side.

pkg/apis/noobaa/v1alpha1/zz_generated.deepcopy.go (2)

114-138: DeepCopy for AlertThresholdsSpec correctly handles pointer fields

The new AlertThresholdsSpec deepcopy functions mirror the standard pattern in this file and correctly allocate and copy the pointer fields.


1494-1498: NooBaaSpec now deep-copies AlertThresholds as expected

Wiring AlertThresholds into NooBaaSpec.DeepCopyInto via (*in).DeepCopyInto(*out) is correct and keeps the spec fully copy-safe.

deploy/internal/prometheus-rules.yaml (1)

169-181: Dynamic bucket capacity thresholds via scalar() look good; confirm metric shape

Using scalar(NooBaa_bucket_low_capacity_threshold) / scalar(NooBaa_bucket_no_capacity_threshold) is a reasonable way to drive alerts from configuration, as long as each of those metrics exposes exactly one (unlabelled) time series in Prometheus. If they ever gain labels or multiple series, scalar() returns NaN, the comparison stops matching, and the alerts silently stop firing.

Please verify in a test cluster that:

  • Each threshold metric returns exactly one series, so scalar() yields a number rather than NaN.
  • Adjusting spec.alertThresholds in the NooBaa CR actually shifts the effective alert firing point.
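
To make the single-series requirement concrete, the sketch below models the documented behavior of PromQL's scalar(): it returns NaN unless the input vector has exactly one element, and any comparison against NaN is false. The `scalar` and `fires` functions are a toy model in Go, not Prometheus code.

```go
package main

import (
	"fmt"
	"math"
)

// scalar models PromQL's scalar(): the single sample's value when the vector
// has exactly one element, NaN otherwise.
func scalar(samples []float64) float64 {
	if len(samples) != 1 {
		return math.NaN()
	}
	return samples[0]
}

// fires models "usage > scalar(threshold_metric)". A comparison against NaN
// is always false, so the alert never fires in that case.
func fires(usagePct float64, thresholdSeries []float64) bool {
	return usagePct > scalar(thresholdSeries)
}

func main() {
	fmt.Println(fires(90, []float64{80}))     // one series: behaves as expected
	fmt.Println(fires(90, []float64{80, 95})) // two series: NaN, never fires
	fmt.Println(fires(90, nil))               // metric missing: NaN, never fires
}
```

In this model a missing or duplicated threshold metric yields a rule that stays quiet instead of failing loudly, which is why checking the metric shape in a test cluster is worthwhile.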
pkg/bundle/deploy.go (1)

5409-5458: Core statefulset env wiring for bucket capacity thresholds looks consistent

Adding BUCKET_LOW_CAPACITY_THRESHOLD and BUCKET_NO_CAPACITY_THRESHOLD alongside the other core env vars is consistent with the existing pattern (placeholders that core reads/configures at runtime). No issues from the manifest side.

@alphaprinz
Contributor

Conflicts with #1747? @nimrod-becker

@pull-request-size pull-request-size bot added size/L and removed size/M labels Dec 22, 2025