feat(api-service): Organization switch flag #9981
- Add new feature flag IS_ORG_KILLSWITCH_FLAG_ENABLED to FeatureFlagsKeysEnum
- Add kill switch check to events controller (trigger, triggerBulk, broadcastEventToAll endpoints)
- Add kill switch check to subscriber-process.worker.ts
- Add kill switch check to workflow.worker.ts
- Add kill switch check to standard.worker.ts

When kill switch is enabled for an organization:
- API trigger endpoints return 503 ServiceUnavailableException
- Worker processors skip job processing and return immediately

Co-authored-by: Dima Grossman <dima@grossman.io>
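To make the two behaviors in the commit message concrete, here is a minimal sketch of what the API-side and worker-side checks might look like. It is not the code from this PR: `FlagReader`, `ServiceUnavailableError`, `guardTrigger`, `withKillSwitch`, and `staticFlags` are hypothetical stand-ins for the real feature flag service, the framework's 503 exception, and the controller/worker wiring.

```typescript
// Hypothetical context passed to the flag lookup (assumed shape).
type FlagContext = { organizationId: string; environmentId: string };

// Hypothetical stand-in for the real feature flag service.
interface FlagReader {
  // Resolves to true when the kill switch is on for the given org/env.
  isKillSwitchEnabled(ctx: FlagContext): Promise<boolean>;
}

// Hypothetical stand-in for the framework's 503 exception.
class ServiceUnavailableError extends Error {
  readonly status = 503;
}

// API side: reject the trigger request before any work is queued.
async function guardTrigger(flags: FlagReader, ctx: FlagContext): Promise<void> {
  if (await flags.isKillSwitchEnabled(ctx)) {
    throw new ServiceUnavailableError('Service temporarily unavailable');
  }
}

// Worker side: wrap a job handler so jobs are skipped while the switch is on.
function withKillSwitch<T extends FlagContext>(
  flags: FlagReader,
  handler: (job: T) => Promise<void>
): (job: T) => Promise<'processed' | 'skipped'> {
  return async (job: T) => {
    if (await flags.isKillSwitchEnabled(job)) {
      return 'skipped'; // return immediately, leaving the job unprocessed
    }
    await handler(job);
    return 'processed';
  };
}

// In-memory flag reader used only for illustration.
function staticFlags(blockedOrgIds: string[]): FlagReader {
  const blocked = new Set(blockedOrgIds);
  return {
    isKillSwitchEnabled: async (ctx: FlagContext) => blocked.has(ctx.organizationId),
  };
}
```

In this sketch a skipped job simply resolves, so a queue would treat it as completed rather than failed; whether that is the desired semantics for the real workers is a design choice this example does not settle.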
✅ Deploy Preview for dashboard-v2-novu-staging canceled.
Hey there and thank you for opening this pull request! 👋 We require pull request titles to follow specific formatting rules, and it looks like your proposed title needs to be adjusted.
Details: PR title must end with 'fixes TICKET-ID' (e.g., 'fixes NOV-123') or include the ticket ID in the branch name.
…rait
- Add kill switch check to apikey.strategy.ts (component: 'api')
- Add component trait to events controller (component: 'trigger')
- Add component trait to worker processors (component: 'worker')

This enables granular control of the kill switch at different levels:
- 'api' - blocks all API requests at authentication level
- 'trigger' - blocks trigger endpoints specifically
- 'worker' - blocks worker job processing

Co-authored-by: Dima Grossman <dima@grossman.io>
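The component trait described above can be pictured as one extra attribute that a targeting rule matches on. The sketch below is an assumption about how such targeting might be modelled, not the flag provider's actual rule format: `KillSwitchRule` and `isBlocked` are hypothetical names.

```typescript
// The three component values named in the commit message.
type Component = 'api' | 'trigger' | 'worker';

// Hypothetical rule shape: which components are blocked for which organization.
interface KillSwitchRule {
  organizationId: string;
  components: Component[];
}

// Returns true when some rule blocks this organization for this component.
function isBlocked(
  rules: KillSwitchRule[],
  organizationId: string,
  component: Component
): boolean {
  return rules.some(
    (rule) =>
      rule.organizationId === organizationId &&
      rule.components.indexOf(component) !== -1
  );
}

// Example: stop job processing for org_a while still accepting its triggers,
// and block org_b at the authentication level.
const exampleRules: KillSwitchRule[] = [
  { organizationId: 'org_a', components: ['worker'] },
  { organizationId: 'org_b', components: ['api'] },
];
```

Because each call site passes its own component value, the same flag key can be switched on for only one layer of an organization's traffic.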
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@apps/worker/src/app/workflow/services/standard.worker.ts`:
- Around line 98-116: The kill-switch lookup in getWorkerProcessor currently
calls isKillSwitchEnabled with data._organizationId/_environmentId which can be
missing for legacy jobs; modify getWorkerProcessor to first call
extractMinimalJobData (or reuse its fallback logic) to derive organizationId and
environmentId from payload.message when needed, then pass those resolved IDs
into isKillSwitchEnabled (or alter isKillSwitchEnabled to accept resolved IDs)
so the kill switch always checks the correct org/env for legacy payloads; refer
to the getWorkerProcessor, isKillSwitchEnabled, and extractMinimalJobData
symbols when making the change.
  private async isKillSwitchEnabled(data: IStandardDataDto): Promise<boolean> {
    return this.featureFlagsService.getFlag({
      key: FeatureFlagsKeysEnum.IS_ORG_KILLSWITCH_FLAG_ENABLED,
      defaultValue: false,
      organization: { _id: data._organizationId },
      environment: { _id: data._environmentId },
      component: 'worker',
    });
  }

  private getWorkerProcessor() {
    return async ({ data }: { data: IStandardDataDto }) => {
      const isKillSwitchEnabled = await this.isKillSwitchEnabled(data);

      if (isKillSwitchEnabled) {
        Logger.log(`Kill switch enabled for organizationId ${data._organizationId}. Skipping job.`, LOG_CONTEXT);

        return;
      }
Kill-switch lookup can miss org/env for legacy job payloads.
isKillSwitchEnabled uses _organizationId/_environmentId directly, but extractMinimalJobData already handles missing IDs via payload.message. Since the kill-switch check runs before that extraction, jobs lacking those fields may bypass the kill switch or error unexpectedly.
🔧 Minimal fix to reuse the fallback extraction for kill-switch IDs
-  private async isKillSwitchEnabled(data: IStandardDataDto): Promise<boolean> {
-    return this.featureFlagsService.getFlag({
-      key: FeatureFlagsKeysEnum.IS_ORG_KILLSWITCH_FLAG_ENABLED,
-      defaultValue: false,
-      organization: { _id: data._organizationId },
-      environment: { _id: data._environmentId },
-      component: 'worker',
-    });
-  }
+  private async isKillSwitchEnabled(data: IStandardDataDto): Promise<boolean> {
+    const minimalJobData = this.extractMinimalJobData(data);
+
+    return this.featureFlagsService.getFlag({
+      key: FeatureFlagsKeysEnum.IS_ORG_KILLSWITCH_FLAG_ENABLED,
+      defaultValue: false,
+      organization: { _id: minimalJobData.organizationId },
+      environment: { _id: minimalJobData.environmentId },
+      component: 'worker',
+    });
+  }
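The review comment's concern can be illustrated with a small ID-resolution helper. This is only a sketch of the fallback idea: the real `extractMinimalJobData` is not shown in this thread, so the `LegacyCompatibleJobData` shape, the field names under `payload.message`, and the `resolveJobIds` helper are all assumptions.

```typescript
// Hypothetical, simplified job shape; the real IStandardDataDto has more fields.
interface LegacyCompatibleJobData {
  _organizationId?: string;
  _environmentId?: string;
  // Assumed location of the IDs in legacy payloads.
  payload?: { message?: { _organizationId?: string; _environmentId?: string } };
}

interface ResolvedJobIds {
  organizationId: string;
  environmentId: string;
}

// Prefer the top-level IDs and fall back to payload.message for legacy jobs.
// Returns undefined when neither location yields both IDs, so the caller can
// decide explicitly how to treat an unidentifiable job.
function resolveJobIds(data: LegacyCompatibleJobData): ResolvedJobIds | undefined {
  const organizationId = data._organizationId ?? data.payload?.message?._organizationId;
  const environmentId = data._environmentId ?? data.payload?.message?._environmentId;

  if (!organizationId || !environmentId) {
    return undefined;
  }

  return { organizationId, environmentId };
}
```

Resolving the IDs once, before the kill switch lookup, would let the flag check and the later processing steps agree on which organization and environment a job belongs to.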
What changed? Why was the change needed?
Implemented a new organization-level kill switch feature flag, `IS_ORG_KILLSWITCH_FLAG_ENABLED`. This flag immediately halts all trigger API requests and worker job processing (subscriber, workflow, and standard workers) for a specific organization and environment.
When enabled:
- API trigger endpoints (`/trigger`, `/trigger/bulk`, `/trigger/broadcast`) will return a `503 Service Unavailable` response.
- Worker processors will skip job processing and return immediately.

This feature provides a critical operational tool to manage system load or respond to incidents by temporarily disabling service for affected organizations.
Screenshots
N/A
Slack Thread