Releases: crmne/ruby_llm

1.11.0

16 Jan 17:47

RubyLLM 1.11: xAI Provider & Grok Models 🚀🤖⚡

This release welcomes xAI as a first-class provider, brings Grok models into the registry, and polishes docs around configuration and thinking. Plug in your xAI API key and start chatting with Grok in seconds.

🚀 xAI Provider (Hello, Grok!)

Use xAI’s OpenAI-compatible API via a dedicated provider and jump straight into chat:

RubyLLM.configure do |config|
  config.xai_api_key = ENV["XAI_API_KEY"]
end

chat = RubyLLM.chat(model: "grok-4-fast-non-reasoning")
response = chat.ask("What's the fastest way to parse a CSV in Ruby?")
response.content

  • xAI is now a first-class provider (:xai) with OpenAI-compatible endpoints under the hood.
  • Grok models are included in the registry so you can pick by name without extra wiring.
  • Streaming, tool calls, and structured output work the same way you already use with OpenAI-compatible providers.
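
Structured output works through the usual chat.with_schema flow too. A minimal sketch, assuming the optional RubyLLM::Schema gem and an illustrative schema class:

class CsvAdvice < RubyLLM::Schema
  string :library, description: "Recommended gem"
  string :reason, description: "Why it's a good fit"
end

chat = RubyLLM.chat(model: "grok-3-mini")
response = chat.with_schema(CsvAdvice).ask("What's the fastest way to parse a CSV in Ruby?")
response.content # => a Hash matching the schema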

Stream responses just like you’re used to:

chat = RubyLLM.chat(model: "grok-3-mini")

chat.ask("Summarize this PR in 3 bullets") do |chunk|
  print chunk.content
end

🧩 Model Registry Refresh

Model metadata and the public models list were refreshed to include Grok models and related updates.

📚 Docs Polish

  • Configuration docs now include xAI setup examples.
  • The thinking guide got a tighter flow and clearer examples.

🛠️ Provider Fixes

  • Resolved errors in the OpenAI, Bedrock, and Anthropic providers introduced by the new URI interface.

Installation

gem "ruby_llm", "1.11.0"

Upgrading from 1.10.x

bundle update ruby_llm

Full Changelog: 1.10.0...1.11.0

1.10.0

13 Jan 19:03

RubyLLM 1.10: Extended Thinking, Persistent Thoughts & Streaming Fixes 🧠✨🚆

This release brings first-class extended thinking across providers, full Gemini 3 Pro/Flash thinking-signature support (chat + tools), a Rails upgrade path to persist it, and a tighter streaming pipeline. Plus official Ruby 4.0 support, safer model registry refreshes, a Vertex AI global endpoint fix, and a docs refresh.

🧠 Extended Thinking Everywhere

Tune reasoning depth and budget across providers with with_thinking, and get thinking output back when available:

chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :high, budget: 8000)

response = chat.ask("Prove it with numbers.")
response.thinking&.text      # thinking content, when the provider returns it
response.thinking&.signature # provider signature, when present
response.thinking_tokens     # thinking token usage

  • response.thinking and chunk.thinking expose thinking content during normal and streaming requests.
  • response.thinking_tokens and response.tokens.thinking track thinking token usage when providers report it.
  • Gemini 3 Pro/Flash fully support thought signatures across chat and tool calls, so multi-step sessions stay consistent.
  • Extended thinking quirks are now normalized across providers so you can tune one API and get predictable output.

Stream thinking and answer content side-by-side:

chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :medium)

chat.ask("Solve this step by step: What is 127 * 43?") do |chunk|
  print chunk.thinking&.text
  print chunk.content
end

  • Streaming stays backward-compatible: existing apps can keep printing chunk.content, while richer UIs can also render chunk.thinking.

🧰 Rails + ActiveRecord Persistence

Thinking output can now be stored alongside messages (text, signature, and token usage), with an upgrade generator for existing apps:

rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate

  • Adds thinking_text, thinking_signature, and thinking_tokens to message tables.
  • Adds thought_signature to tool calls for Gemini tool calling.
  • Fixes a Rails streaming issue where the first tokens could be dropped.
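
Once migrated, the new columns are ordinary attributes on your message records. A minimal sketch, assuming the default Chat/Message models from the install generator and that with_thinking is delegated like the other with_* helpers:

chat = Chat.create!(model: "claude-opus-4.5")
chat.with_thinking(effort: :high).ask("Why is the sky blue?")

message = chat.messages.last
message.thinking_text      # persisted thinking content
message.thinking_signature # provider signature, when present
message.thinking_tokens    # thinking token usage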

📊 Unified Token Tracking

All token counts now live in response.tokens and message.tokens, including input, output, cached, cache creation, and thinking tokens.
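
A minimal sketch; tokens.thinking appears earlier in these notes, while the other accessor names follow the same pattern and should be treated as assumptions:

response = chat.ask("Hello!")
response.tokens.thinking # thinking tokens, when the provider reports them
response.tokens.input    # assumed accessor for input tokens
response.tokens.output   # assumed accessor for output tokens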

✅ Official Ruby 4.0 Support

Ruby 4.0 is now officially supported in CI and dependencies.

🧩 Model Registry Updates

  • Refreshing the registry no longer deletes models from providers you haven't configured.

🌍 Vertex AI Global Endpoint Fix

When vertexai_location is global, the API base now correctly resolves to:

https://aiplatform.googleapis.com/v1beta1
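
A minimal configuration sketch using the existing vertexai_location setting:

RubyLLM.configure do |config|
  config.vertexai_project_id = "your-project"
  config.vertexai_location = "global" # now resolves to the global endpoint above
end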

📚 Docs Updates

  • New extended thinking guide.
  • Token usage docs include thinking tokens.

Installation

gem "ruby_llm", "1.10.0"

Upgrading from 1.9.x

bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate

Full Changelog: 1.9.2...1.10.0

1.9.2

09 Jan 17:51

RubyLLM 1.9.2: Models.dev Registry, Imagen 4.0, and Setup Niceties 🧭🖼️🧪

A small patch that swaps the model registry backend, updates Imagen defaults, and smooths out migrations and local setup—plus a couple of linting/test adjustments.

🧭 Models Registry Moves to Models.dev

This is a big deal for RubyLLM. Switching the registry to Models.dev fixes every issue we've had with model metadata: pricing is consistent, capabilities are consistent, and everything is backed by open data. Huge thanks to the Models.dev team for the work.

No shade on Parsera, they’re just busy.

  • Fixes #545, #459, #559, #561, #560
  • Updated links to Models.dev’s tracker for consistency

🖼️ Imagen 4.0 Becomes the Default

Image generation specs and model lists now target Imagen 4.0, so new images use the latest default without manual overrides.
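
A minimal sketch with RubyLLM.paint, assuming a provider where the Imagen default applies:

image = RubyLLM.paint("a watercolor of Mount Fuji") # uses the Imagen 4.0 default
image.save("fuji.png")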

🧬 Migration Typo Fix for 1.7 Naming

Corrects a class-naming typo in the 1.7 migration so older upgrades remain reliable.

  • Fixes #498

Installation

gem "ruby_llm", "1.9.2"

Upgrading from 1.9.1

bundle update ruby_llm

Full Changelog: 1.9.1...1.9.2

1.9.1

05 Nov 09:36

RubyLLM 1.9.1: Rails Namespaces, Vertex AI Auth & Anthropic Uploads 🛤️🔐

A focused patch release that keeps ActiveRecord integrations humming in namespaced apps, restores Anthropic uploads, and hardens Vertex AI auth while smoothing out MySQL migrations and attachment handling.

🧭 Namespaced ActiveRecord Chats Keep Their Foreign Keys

acts_as_* helpers now accept explicit foreign key overrides, matching Rails’ association API, so namespaced chats and tool calls work even when their FK column doesn’t follow the default pattern.

class Support::Message < ActiveRecord::Base
  acts_as_message chat_class: "Support::Conversation",
                  chat_foreign_key: "conversation_id",
                  tool_call_class: "Support::ToolCall",
                  tool_calls_foreign_key: "support_tool_call_id"
end

  • Fixes the regression introduced in 1.8.x for engines and modularized apps.
  • Generators include the right foreign keys out of the box when you opt into namespaces.

🚆 Zeitwerk Eager Loading Behaves Outside Rails

The Railtie is now safely ignored during eager loading when Rails::Railtie isn’t defined, preventing NameError crashes in gems and background workers that depend on RubyLLM but don’t run inside Rails.

📎 Attachments Understand Uploaded Files

RubyLLM::Attachment detects ActionDispatch::Http::UploadedFile, Pathname, and Active Storage blobs automatically, preserving filenames and content types so file flows stay intact across providers.

attachment = RubyLLM::Attachment.new(params[:file])
attachment.filename # => original upload name

  • Normalizes MIME detection and rewinds IO sources for consistent encoding.
  • Anthropic uploads once again serialize local PDFs/images correctly after the 1.9.0 regression fix.

🔐 Vertex AI Auth Uses googleauth the Right Way

Swapped fetch_access_token! for apply when building Vertex AI headers, keeping compatibility with recent googleauth releases and avoiding token caching errors.

🗄️ MySQL JSON Columns Skip Illegal Defaults

Rails generators detect MySQL adapters and omit default values for JSON columns, eliminating migration errors on Aurora/MySQL deployments.

🧪 Reliability Tweaks

Fresh specs cover RubyLLM::Utils coercion helpers so nested hashes and key transforms stay stable as we expand provider support.

Installation

gem "ruby_llm", "1.9.1"

Upgrading from 1.9.0

bundle update ruby_llm

Full Changelog: 1.9.0...1.9.1

1.9.0

03 Nov 15:30

RubyLLM 1.9.0: Tool Schemas, Prompt Caching & Transcriptions ✨🎙️

Major release that makes tool definitions feel like Ruby, lets you lean on Anthropic prompt caching everywhere, and turns audio transcription into a one-liner—plus better Gemini structured output and Nano Banana image responses.

🧰 JSON Schema Tooling That Feels Native

The new RubyLLM::Schema params DSL supports full JSON Schema for tool parameter definitions, including nested objects, arrays, enums, and nullable fields.

class Scheduler < RubyLLM::Tool
  description "Books a meeting"

  params do
    object :window, description: "Time window to reserve" do
      string :start, description: "ISO8601 start"
      string :finish, description: "ISO8601 finish"
    end

    array :participants, of: :string, description: "Email invitees"

    any_of :format, description: "Optional meeting format" do
      string enum: %w[virtual in_person]
      null
    end
  end

  def execute(window:, participants:, format: nil)
    Booking.reserve(window:, participants:, format:)
  end
end

  • Powered by RubyLLM::Schema, the same awesome Ruby DSL we recommend for Structured Output's chat.with_schema.
  • Already handles Anthropic/Gemini quirks like nullable unions and enums - no more ad-hoc translation layers.
  • Prefer raw hashes? Pass params schema: { ... } to keep your existing JSON Schema verbatim.
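
A minimal sketch of the raw-hash form; the tool and its JSON Schema are illustrative:

class WeatherTool < RubyLLM::Tool
  description "Looks up current weather"

  # existing JSON Schema passed through verbatim
  params schema: {
    type: "object",
    properties: { city: { type: "string", description: "City name" } },
    required: ["city"]
  }

  def execute(city:)
    Weather.lookup(city)
  end
end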

🧱 Raw Content Blocks & Anthropic Prompt Caching Everywhere

When you need to handcraft message envelopes:

chat = RubyLLM.chat(model: "claude-sonnet-4-5")
raw_request = RubyLLM::Content::Raw.new([
  { type: "text", text: File.read("prompt.md"), cache_control: { type: "ephemeral" } },
  { type: "text", text: "Summarize today’s work." }
])

chat.ask(raw_request)

We also provide a helper specifically for Anthropic Prompt Caching:

system_block = RubyLLM::Providers::Anthropic::Content.new(
  "You are our release-notes assistant.",
  cache: true
)

chat.add_message(role: :system, content: system_block)

  • RubyLLM::Content::Raw lets you ship provider-native payloads for content blocks.
  • Anthropic helpers keep cache_control hints readable while still producing the right JSON structure.
  • Every RubyLLM::Message now exposes cached_tokens and cache_creation_tokens, so you can see exactly what the provider pulled from cache versus what it had to recreate.
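
For example, after a cached request you can inspect both counters directly on the response message:

response = chat.ask(raw_request)
response.cached_tokens         # tokens read from the provider cache
response.cache_creation_tokens # tokens written to the cache on this call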

Please run rails generate ruby_llm:upgrade_to_v1_9 in your Rails app if you're upgrading from 1.8.x.

⚙️ Tool.with_params Plays Nice with Anthropic Caching

Similarly to Raw Content Blocks, .with_params lets you set arbitrary params in tool definitions. Perfect for Anthropic’s cache_control hints.

class ChangelogTool < RubyLLM::Tool
  description "Formats commits into release notes"

  params do
    array :commits, of: :string
  end

  with_params cache_control: { type: "ephemeral" }

  def execute(commits:)
    ReleaseNotes.format(commits)
  end
end

🎙️ RubyLLM.transcribe Turns Audio into Text (With Diarization)

One method call gives you transcripts, diarized segments, and consistent token tallies across providers.

transcription = RubyLLM.transcribe(
  "all-hands.m4a",
  model: "gpt-4o-transcribe-diarize",
  language: "en",
  prompt: "Focus on action items."
)

transcription.segments.each do |segment|
  puts "#{segment['speaker']}: #{segment['text']} (#{segment['start']}s – #{segment['end']}s)"
end

  • Supports OpenAI (whisper-1, gpt-4o-transcribe, diarization variants), Gemini 2.5 Flash, and Vertex AI with the same API.
  • Optional speaker references map diarized voices to real names.

🛠️ Gemini Structured Output Fixes & Nano Banana Inline Images

We went deep on Gemini’s edges so you don’t have to.

  • Nullables and anyOf now translate cleanly, and Gemini 2.5 finally respects responseJsonSchema, so complex structured output works out of the box.
  • Parallel tool calls now return a single message with the correct role, which should improve accuracy when models use and respond to tools.
  • Gemini 2.5 Flash Image (“Nano Banana”) surfaces inline images as actual attachments, ready to drop into your UI:

chat = RubyLLM.chat(model: "gemini-2.5-flash-image")
reply = chat.ask("Sketch a Nano Banana wearing aviators.")
image = reply.content.attachments.first
File.binwrite("nano-banana.png", image.read)

(If you missed the backstory, my blog post Nano Banana with RubyLLM has the full walkthrough.)

🗂️ Configurable Model Registry File Path

Deploying to read-only filesystems? Point RubyLLM at a writable JSON registry and keep refreshing models without hacks.

RubyLLM.models.save_to_json("/var/app/models.json")

RubyLLM.configure do |config|
  config.model_registry_file = "/var/app/models.json"
end

Just remember that RubyLLM.models.refresh! only updates the in-memory registry. To persist changes to disk, call:

RubyLLM.models.refresh!
RubyLLM.models.save_to_json

  • Plays nicely with the ActiveRecord integration (which still stores models in the DB).

Installation

gem "ruby_llm", "1.9.0"

Upgrading from 1.8.x

bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_9

Full Changelog: 1.8.2...1.9.0

1.8.2

24 Sep 15:34

RubyLLM 1.8.2: Enhanced Tool Calling & Reliability 🔧✨

Minor release improving tool calling visualization in Rails chat UIs, fixing namespaced model support, and enhancing stability.

🔧 Tool Call Visualization

Enhanced chat UI with basic visualization of tool/function calls:

  • Tool call display: Messages now show function calls with name and arguments in JSON format
  • Improved readability: Styled with monospace font and gray background for clear distinction
  • Seamless integration: Tool calls integrate naturally into the conversation flow

🐛 Bug Fixes & Improvements

Chat UI Generator

  • Namespaced model support: Fixed chat UI generator to properly handle namespaced models (fixes #425)
  • Cleaner output: Refined model class injection for better code generation

Network Reliability

  • Faraday adapter: Explicitly set net_http adapter instead of relying on environment defaults (fixes #428)
  • Improved stability: Ensures consistent network behavior across different environments

📚 Documentation & Testing

  • CI optimization: Test suite now runs more efficiently, with coverage reports from latest Ruby/Rails only
  • Generator tests: Optimized to run only on latest Ruby and Rails versions
  • Coverage accuracy: Excluded generator code from coverage metrics for more accurate reporting

Installation

gem 'ruby_llm', '1.8.2'

Upgrading from 1.8.1

bundle update ruby_llm

All changes are backward compatible. To benefit from the tool call visualization, regenerate your chat UI with rails generate ruby_llm:chat_ui.

Full Changelog: 1.8.1...1.8.2

1.8.1

21 Sep 15:07

RubyLLM 1.8.1: Efficient Chat Streaming 🚀💬

Small release bringing production-ready streaming for Rails chat UIs and minor fixes.

💬 Production-Ready Chat Streaming

Improved the chat UI generator with efficient chunk streaming that reduces bandwidth usage:

  • Bandwidth optimized: New broadcast_append_chunk method appends individual chunks without re-transmitting entire messages
  • Single subscription: Maintains one Turbo Stream subscription at chat level (not per message)
  • Reduced overhead: Jobs now append chunks instead of updating entire messages, reducing database writes
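
A minimal sketch of how the generated pieces fit together; the job and model names are illustrative, and only broadcast_append_chunk comes from this release:

class ChatResponseJob < ApplicationJob
  def perform(chat_id, user_content)
    chat = Chat.find(chat_id)
    chat.ask(user_content) do |chunk|
      # append only the new chunk instead of re-broadcasting the whole message
      chat.messages.last.broadcast_append_chunk(chunk.content)
    end
  end
end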

🔧 Improvements & Fixes

  • Cleaner injection: Refined message model class injection for chat UI generator
  • GPT-5 capabilities: Fixed missing capability declarations for Parsera compatibility
  • Funding support: Added funding URI to gemspec metadata
  • Documentation updates: Enhanced README and moderation guides
  • Model registry: Latest model updates across all providers
  • Dependency updates: Updated Appraisal gemfiles

Installation

gem 'ruby_llm', '1.8.1'

Upgrading from 1.8.0

bundle update ruby_llm

All changes are backward compatible. To benefit from the streaming improvements, regenerate your chat UI with rails generate ruby_llm:chat_ui.

Full Changelog: 1.8.0...1.8.1

1.8.0

14 Sep 11:47

RubyLLM 1.8.0: Video Support & Content Moderation 🎥🛡️

Major feature release bringing video file support for multimodal models and content moderation capabilities to ensure safer AI interactions.

🎥 Video File Support

Full video file support for models with video capabilities:

# Local video files
chat = RubyLLM.chat(model: "gemini-2.5-flash")
response = chat.ask("What happens in this video?", with: "video.mp4")

# Remote video URLs (with or without extensions)
response = chat.ask("Describe this video", with: "https://example.com/video")

# Multiple attachments including video
response = chat.ask("Compare these", with: ["image.jpg", "video.mp4"])

Features:

  • Automatic MIME type detection for video formats
  • Support for remote videos without file extensions
  • Seamless integration with existing attachment system
  • Full support for Gemini and VertexAI video-capable models

🛡️ Content Moderation

New content moderation API to identify potentially harmful content before sending to LLMs:

# Basic moderation
result = RubyLLM.moderate("User input text")
puts result.flagged?  # => true/false
puts result.flagged_categories  # => ["harassment", "hate"]

# Integration pattern - screen before chat
def safe_chat(user_input)
  moderation = RubyLLM.moderate(user_input)
  return "Content not allowed" if moderation.flagged?

  RubyLLM.chat.ask(user_input)
end

# Check specific categories
result = RubyLLM.moderate("Some text")
puts result.category_scores["harassment"]  # => 0.0234
puts result.category_scores["violence"]    # => 0.0012

Features:

  • Detects sexual, hate, harassment, violence, self-harm, and other harmful content
  • Convenience methods: flagged?, flagged_categories, category_scores
  • Currently supports OpenAI's moderation API
  • Extensible architecture for future providers
  • Configurable default model (defaults to omni-moderation-latest)
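
A minimal configuration sketch; the key name follows the default_* pattern used for other model settings and should be treated as an assumption:

RubyLLM.configure do |config|
  config.default_moderation_model = "omni-moderation-latest"
end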

🐛 Bug Fixes

Rails Inflection Issue

  • Fixed a critical bug where Rails apps using an Llm module/namespace would break due to inflection conflicts
  • RubyLLM now properly isolates its inflections

Migration Foreign Key Errors

  • Fixed install generator creating migrations with foreign key references to non-existent tables
  • Migrations now create tables first, then add references in correct order
  • Prevents "relation does not exist" errors in PostgreSQL and other databases

Model Registry Improvements

  • Fixed Models.resolve instance method delegation
  • Fixed helper methods to return all models supporting specific modalities
  • image_models, audio_models, and embedding_models now correctly include all capable models
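
For example, the fixed helpers now return every capable model rather than a subset; a minimal sketch:

RubyLLM.models.image_models     # all models that can generate images
RubyLLM.models.audio_models     # all models that can handle audio
RubyLLM.models.embedding_models # all embedding-capable models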

📚 Documentation

  • Added comprehensive moderation guide with Rails integration examples
  • Updated video support documentation with examples
  • Clarified version requirements in documentation

Installation

gem 'ruby_llm', '1.8.0'

Upgrading from 1.7.x

bundle update ruby_llm

All changes are backward compatible. New features are opt-in.

Full Changelog: 1.7.1...1.8.0

1.7.1

11 Sep 13:43

RubyLLM 1.7.1: Generator Fixes & Enhanced Upgrades 🔧

Bug fixes and improvements for Rails generators, with special focus on namespaced models and automatic migration of existing apps.

🐛 Critical Fixes

Namespaced Model Support

The generators now properly handle namespaced models throughout:

# Now works correctly with namespaced models
rails g ruby_llm:install chat:LLM::Chat message:LLM::Message model:LLM::Model
rails g ruby_llm:upgrade_to_v1_7 chat:Assistant::Chat message:Assistant::Message

Fixed issues:

  • Invalid table names like :assistant/chats now resolve correctly to :assistant_chats ✅
  • Model migration failures with namespaced models ✅
  • Foreign key migrations now handle custom table names correctly ✅
  • Namespace modules automatically created with table_name_prefix

✨ Automatic acts_as API Migration

The upgrade generator now automatically converts from the old acts_as API to the new one:

# BEFORE running upgrade generator (OLD API)
class Conversation < ApplicationRecord
  acts_as_chat message_class: 'ChatMessage', tool_call_class: 'AIToolCall'
end

class ChatMessage < ApplicationRecord
  acts_as_message chat_class: 'Conversation', chat_foreign_key: 'conversation_id'
end

# AFTER running upgrade generator (NEW API)
class Conversation < ApplicationRecord
  acts_as_chat messages: :chat_messages, message_class: 'ChatMessage', model: :model
end

class ChatMessage < ApplicationRecord
  acts_as_message chat: :conversation, chat_class: 'Conversation', tool_calls: :ai_tool_calls, tool_call_class: 'AIToolCall', model: :model
end

The generator:

  • Converts from old *_class parameters to new association-based parameters
  • Adds the new model association to all models
  • Preserves custom class names and associations
  • Handles both simple and complex/namespaced models

No manual changes needed - the generator handles the complete API migration automatically! 🎉

🏗️ Generator Architecture Improvements

DRY Generator Code

Created a shared GeneratorHelpers module to eliminate duplication between generators:

  • Shared acts_as_* declaration logic
  • Common database detection methods
  • Unified namespace handling
  • Consistent table name generation

Better Rails Conventions

  • Generators reorganized into proper subdirectories
  • Private methods moved to conventional location
  • Follows Rails generator best practices
  • Cleaner, more maintainable code

🚨 Troubleshooting Helper

Added clear troubleshooting for the most common upgrade issue:

# If you see: "undefined local variable or method 'acts_as_model'"
# Add this to config/application.rb BEFORE your Application class:

RubyLLM.configure do |config|
  config.use_new_acts_as = true
end

module YourApp
  class Application < Rails::Application
    # ...
  end
end

The upgrade generator now shows this warning proactively and documentation includes a dedicated troubleshooting section.

🔄 Migration Improvements

  • Fixed instance variable usage in migration templates
  • Better handling of existing Model tables during upgrade
  • Initializer creation if missing during upgrade
  • Simplified upgrade instructions pointing to migration guide

Installation

gem 'ruby_llm', '1.7.1'

Upgrading from 1.7.0

Just update your gem - all fixes are backward compatible:

bundle update ruby_llm

Upgrading from 1.6.x

Use the improved upgrade generator:

rails generate ruby_llm:upgrade_to_v1_7
rails db:migrate

The generator now handles everything automatically, including updating your model files!

Merged PRs

  • Fix namespaced model table names in upgrade generator by @willcosgrove in #398
  • Fix namespaced models in Model migration and foreign key migrations by @willcosgrove in #399

Full Changelog: 1.7.0...1.7.1

1.7.0

10 Sep 16:00

RubyLLM 1.7: Rails Revolution & Vertex AI 🚀

Major Rails integration overhaul bringing database-backed models, UI generators, and a more intuitive acts_as API. Plus Google Cloud Vertex AI support, regional AWS Bedrock, and streamlined installation!

🌟 Google Cloud Vertex AI Support

Full Vertex AI provider integration with dynamic model discovery:

# Add to your Gemfile:
gem "googleauth"  # Required for Vertex AI authentication

# Configure Vertex AI:
RubyLLM.configure do |config|
  config.vertexai_project_id = "your-project"
  config.vertexai_location = "us-central1"
end

# Access Gemini and other Google models through Vertex AI
chat = RubyLLM.chat(model: "gemini-2.5-pro", provider: :vertexai)
response = chat.ask("What can you do?")

Features:

  • Dynamic model fetching from Vertex AI API with pagination
  • Automatic discovery of Gemini foundation models
  • Metadata enrichment from Parsera API
  • Full chat and embeddings support
  • Seamless integration with existing Gemini provider
  • Uses Application Default Credentials (ADC) for authentication

🎉 New Rails-Like acts_as API

The Rails integration gets a massive upgrade with a more intuitive, Rails-like API:

# OLD way (still works, deprecated in v2.0)
class Chat < ApplicationRecord
  acts_as_chat message_class: 'Message', tool_call_class: 'ToolCall'
end

# NEW way - use association names as primary parameters!
class Chat < ApplicationRecord
  acts_as_chat messages: :messages, model: :model
end

class Message < ApplicationRecord
  acts_as_message chat: :chat, tool_calls: :tool_calls, model: :model
end

Two-Command Upgrade

Existing apps can upgrade seamlessly:

# Step 1: Run the upgrade generator
rails generate ruby_llm:upgrade_to_v1_7

# Step 2: Run migrations
rails db:migrate

That's it! The upgrade generator:

  • Creates the models table if needed
  • Automatically adds config.use_new_acts_as = true to your initializer
  • Migrates your existing data to use foreign keys
  • Preserves all your data (old string columns renamed to model_id_string)

🖥️ Complete Chat UI Generator

Build a full chat interface with one command:

# Generate complete chat UI with Turbo streaming
rails generate ruby_llm:chat_ui

This creates:

  • Controllers: Chat and message controllers with Rails best practices
  • Views: Clean HTML views for chat list, creation, and messaging
  • Models page: Browse available AI models
  • Turbo Streams: Real-time message updates
  • Background job: Streaming AI responses
  • Model selector: Choose models in chat creation

The UI is intentionally simple and clean - perfect for customization!

💾 Database-Backed Model Registry

Models are now first-class ActiveRecord objects with rich metadata:

# Chat.create! has the same interface as RubyLLM.chat (PORO)
chat = Chat.create!  # Uses default model from config
chat = Chat.create!(model: "gpt-4o-mini")  # Specify model
chat = Chat.create!(model: "claude-3-5-haiku", provider: "bedrock")  # Cross-provider
chat = Chat.create!(
  model: "experimental-llm-v2",
  provider: "openrouter",
  assume_model_exists: true  # Creates Model record if not found
)

# Access rich model metadata through associations
chat.model.context_window  # => 128000
chat.model.capabilities    # => ["streaming", "function_calling", "structured_output"]
chat.model.pricing["text_tokens"]["standard"]["input_per_million"]  # => 2.5

# Works with Model objects too
model = Model.find_by(model_id: "gpt-4o")
chat = Chat.create!(model: model)

# Refresh models from provider APIs
Model.refresh!  # Populates/updates models table from all configured providers

The install generator creates a Model model by default:

# Custom model names supported
rails g ruby_llm:install chat:Discussion message:Comment model:LLModel

🌍 AWS Bedrock Regional Support

Cross-region inference now works correctly in all AWS regions:

# EU regions now work!
RubyLLM.configure do |config|
  config.bedrock_region = "eu-west-3"
end

# Automatically uses correct region prefix:
# - EU: eu.anthropic.claude-3-sonnet...
# - US: us.anthropic.claude-3-sonnet...
# - AP: ap.anthropic.claude-3-sonnet...
# - CA: ca.anthropic.claude-3-sonnet...

Thanks to @elthariel for the contribution! (#338)

🎵 MP3 Audio Support Fixed

OpenAI's Whisper API now correctly handles MP3 files:

# Previously failed with MIME type errors
chat.add_attachment("audio.mp3")
response = chat.ask("Transcribe this audio")  # Now works!

The fix properly converts audio/mpeg MIME type to the mp3 format string OpenAI expects. (#390)

🚀 Performance & Developer Experience

Simplified Installation

The post-install message is now concise and helpful, pointing to docs instead of overwhelming with text.

Better Generator Experience

All generators now support consistent interfaces:

# All use the same pattern
rails g ruby_llm:install chat:Chat message:Message
rails g ruby_llm:upgrade_to_v1_7 chat:Chat message:Message
rails g ruby_llm:chat_ui

ActiveStorage Integration

The install generator now automatically:

  • Installs ActiveStorage if not present
  • Configures RubyLLM for attachment support
  • Ensures smooth multimodal experiences out of the box

🔧 Fixes & Improvements

Provider Enhancements

  • Local provider models: Models.refresh! now supports Ollama and GPUStack with proper capability mapping
  • Provider architecture: Providers no longer call RubyLLM.models.find internally (#366)
  • Tool calling: Fixed OpenAI tool calls with missing function.arguments (#385, thanks @elthariel!)
  • Streaming callbacks: on_new_message now fires before API request, not after first chunk (#367)

Documentation & Testing

  • Documentation variables: Model names now use variables for easier updates
  • IRB compatibility: #instance_variables method now public for ls command (#374, thanks @matijs!)
  • Test improvements: Fixed CI issues with acts_as modules and database initialization
  • VCR enhancements: Better VertexAI recording and cassette management

Breaking Changes

with_params Behavior

with_params now takes precedence over internal defaults, allowing full control:

# You can now override ANY parameter
chat.with_params(max_tokens: 100)  # This now works!
chat.with_params(tools: [web_search_tool])  # Provider-specific features

Set RUBYLLM_DEBUG=true to see exactly what's being sent to the API.

Installation

gem 'ruby_llm', '1.7.0'

Upgrading from 1.6.x

  1. Update your Gemfile
  2. Run bundle update ruby_llm
  3. Run rails generate ruby_llm:upgrade_to_v1_7
  4. Run rails db:migrate
  5. Enjoy the new features! 🎉

Full backward compatibility maintained - the old acts_as API continues working with a deprecation warning.

Full Changelog: 1.6.4...1.7.0