Skip to content

jobs#index perf Issue: ORDER BY uses wrong column for existing index #1675

@ScotterC

Description

@ScotterC

Summary

The GoodJob admin interface (jobs#index) performs ORDER BY created_at DESC, id DESC but the database has an index on (active_job_id, created_at). This causes full table scans on large datasets instead of utilizing the existing compound index.

Problem Description

When the good_jobs table grows to 1M+ records, the admin interface can become unusable due to long query times and statement timeouts (if set).

Root Cause: Index mismatch between query pattern and available indexes

  • Query uses: ORDER BY created_at DESC, id DESC
  • Index available: index_good_jobs_on_active_job_id_and_created_at (active_job_id, created_at)
  • Result: PostgreSQL performs expensive full table scan + sort instead of index scan

Current Behavior

-- Current query from display_all scope
SELECT * FROM good_jobs 
ORDER BY created_at DESC, id DESC 
LIMIT 25;

-- Execution plan shows:
-- Seq Scan on good_jobs

Expected Behavior

Since id == active_job_id in GoodJob (they're set to the same UUID value), the query should utilize the existing index:

-- Optimized query that uses existing index
SELECT * FROM good_jobs 
ORDER BY created_at DESC, active_job_id DESC 
LIMIT 25;

-- Would use: index_good_jobs_on_active_job_id_and_created_at

Environment

  • GoodJob Version: 4.11.2
  • Rails Version: 8.0.2.1
  • Database: PostgreSQL 16
  • Dataset Size: 1M+ jobs

Sanity Check / Reproduction Steps

GoodJob::Job.display_all(ordered_by: %w[created_at desc]).to_sql

Current Workaround

I've implemented a monkey patch that overrides the display_all method to use active_job_id instead of id in ORDER BY clauses:

# Temporary, overbuilt by Claude, workaround
module GoodJob::FilterableOptimizations
  extend ActiveSupport::Concern

  class_methods do
    def display_all(ordered_by: %w[created_at desc], after_at: nil, after_id: nil)
      if self == GoodJob::Job && ordered_by == %w[created_at desc]
        # Use active_job_id to match existing index
        query = all.order(created_at: :desc, active_job_id: :desc)
        
        # Handle pagination with active_job_id consistently
        if after_at.present? && after_id.present?
          query = query.where Arel::Nodes::Grouping.new([
            arel_table[:created_at], arel_table[:active_job_id]
          ]).lt(Arel::Nodes::Grouping.new([
            bind_value(:created_at, after_at, ActiveRecord::Type::DateTime),
            bind_value(:active_job_id, after_id, ActiveRecord::ConnectionAdapters::PostgreSQL::OID::Uuid)
          ]))
        end
        
        query
      else
        super
      end
    end
  end
end

GoodJob::Job.prepend(GoodJob::FilterableOptimizations)

Results: 20+ second queries (in my environment) reduced to ~20ms.

Proposed Solution

We could add yet another index of id and created_at but I don't think the intent is to use id primarily on the Job model.

Instead we can modify the display_all scope in app/models/concerns/good_job/filterable.rb to use active_job_id instead of id for the secondary sort column when active_job_id column exists to use the existing index.

Alternative

Add primary_key definition to GoodJob::Job that points to active_job_id. This could work but may have risky implications for advisory locks and associations.

Additional Context

The (active_job_id, created_at) index appears to be designed for this exact use case, but the Filterable concern's display_all method doesn't utilize it. This suggests the intent was to optimize these queries, but the implementation doesn't align with the indexing strategy.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    Status

    Inbox

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions