refactor: optimize agent list payload and improve multimodal detection logic by eviaaaaa · Pull Request #12942 · infiniflow/ragflow

eviaaaaa · 2026-02-02T06:56:19Z

Description

This PR reduces the payload of the Agent list API and refines the model capability detection logic in the Agent/Canvas module.

1. Performance Optimization (Backend)

Changes: Removed cls.model.dsl from query fields in UserCanvasService.get_by_tenant_ids.
Reasoning: The dsl object is large and unnecessary for the Agent list view. Excluding it reduces the payload size of the /v1/canvas/list API, leading to faster serialization and reduced network latency.
Consistency: Full DSL data remains accessible via the individual /v1/canvas/get/<id> endpoint used in the detail view.
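
The payload effect of the backend change can be illustrated with a minimal, standard-library-only sketch. Everything here except the `dsl` field name is hypothetical: the other list fields, the sample rows, and the `project` helper are made up for illustration. The actual change edits the field list used by `UserCanvasService.get_by_tenant_ids`, which is not reproduced here.

```python
import json

# Hypothetical list-view columns; the real field list lives in
# UserCanvasService.get_by_tenant_ids and is not reproduced here.
LIST_FIELDS = ("id", "title", "update_time")


def project(rows, fields):
    """Keep only the requested fields of each row (a stand-in for column selection)."""
    return [{name: row[name] for name in fields} for row in rows]


# Fake canvases whose `dsl` is deliberately large, standing in for real agent graphs.
rows = [
    {
        "id": f"canvas-{i}",
        "title": f"Agent {i}",
        "update_time": 1_700_000_000 + i,
        "dsl": {"components": {f"node-{n}": {"params": "x" * 200} for n in range(50)}},
    }
    for i in range(20)
]

# Serialize the same rows with and without the heavy field.
with_dsl = json.dumps(project(rows, LIST_FIELDS + ("dsl",)))
without_dsl = json.dumps(project(rows, LIST_FIELDS))

print(f"with dsl:    {len(with_dsl)} characters")
print(f"without dsl: {len(without_dsl)} characters")
```

The sizes printed depend entirely on the made-up data. The point is only that the list response no longer grows with the size of each agent's DSL.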

2. Multimodal Detection Refinement (Frontend)

Changes: Replaced model_type === LlmModelType.Image2text with tags?.includes('IMAGE2TEXT').
Reasoning: In RAGFlow, model_type defines the primary role of a model (e.g., chat). However, many advanced Chat models are also vision-capable. Since model_type is a single-value field, it cannot represent these multiple capabilities.
Solution: Checking the tags field (which can hold multiple attributes) for IMAGE2TEXT lets chat models that are also vision-capable, such as gpt-5.2-pro, correctly display multimodal input options.
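
The difference between the two checks can be sketched as follows. This is an illustration in Python, not the TypeScript of the actual change in `web/src/pages/agent/form/agent-form/index.tsx`. The `Model` dataclass, the sample records, the `"image2text"` type string, and the assumption that `tags` is a list of strings are all hypothetical; the real field may be shaped differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str
    model_type: str                   # single primary role, e.g. "chat"
    tags: Optional[List[str]] = None  # assumed: zero or more capability tags


def is_multimodal_by_type(model: Model) -> bool:
    """Old check: only models whose primary role is image2text qualify."""
    return model.model_type == "image2text"


def is_multimodal_by_tags(model: Model) -> bool:
    """New check: any model carrying the IMAGE2TEXT tag qualifies.

    A missing tags value is treated as "no tags", mirroring the optional
    chaining in `tags?.includes('IMAGE2TEXT')`.
    """
    return "IMAGE2TEXT" in (model.tags or [])


# Hypothetical sample records.
vision_chat = Model("vision-chat-example", "chat", ["CHAT", "IMAGE2TEXT"])
text_chat = Model("text-chat-example", "chat", ["CHAT"])
untagged = Model("untagged-example", "chat")

for m in (vision_chat, text_chat, untagged):
    print(m.name, is_multimodal_by_type(m), is_multimodal_by_tags(m))
```

Under these assumptions, the vision-capable chat model is missed by the type check but picked up by the tag check, while the text-only and untagged models are rejected by both.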

Type of Change

Bug fix (logic correction for multimodal detection)
Optimization (performance improvement for list API)

Main Changes

api/db/services/canvas_service.py: Optimized DB query by excluding heavy DSL fields.
web/src/pages/agent/form/agent-form/index.tsx: Enhanced capability detection using the tags system.

Verification

Verified that the Agent list loads faster with the reduced response payload.
Confirmed that chat models with the IMAGE2TEXT tag now correctly enable the multimodal input UI.

codecov · 2026-02-02T08:19:27Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.46%. Comparing base (23bdf25) to head (ffa09cf).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main   #12942   +/-   ##
=======================================
  Coverage   44.46%   44.46%           
=======================================
  Files          43       43           
  Lines        9266     9266           
  Branches      107      107           
=======================================
  Hits         4120     4120           
  Misses       5127     5127           
  Partials       19       19

eviaaaaa added 2 commits February 2, 2026 14:32

fix(web): detect multimodal capability via tags instead of model_type

8f92519

Since model_type only represents the primary category (e.g., 'chat'), it cannot capture auxiliary capabilities. Switching to 'IMAGE2TEXT' tag detection allows multimodal support for versatile models like gpt-5.2-pro.

perf(api): remove dsl field from canvas list to reduce payload size

ffa09cf

The dsl field in the agent list is typically large and causes unnecessary network overhead. DSL is now only fetched in the detail view.

dosubot bot added the size:XS, python, bug, and typescript labels Feb 2, 2026

KevinHuSh added the ci label Feb 2, 2026

KevinHuSh marked this pull request as draft February 2, 2026 07:57

KevinHuSh marked this pull request as ready for review February 2, 2026 07:57

KevinHuSh merged commit 2e5a186 into infiniflow:main Feb 2, 2026
1 check passed

dosubot bot mentioned this pull request Feb 5, 2026

[Bug]: The image2txt type cannot be selected when adding a model in vLLM. #13006

Closed

KevinHuSh merged 2 commits into infiniflow:main from eviaaaaa:fix/agent-list-dsl-and-multimodal-detection
