AssetOpsBench: Benchmarking AI Agents for Industrial Asset Operations & Maintenance

📄 Paper | 🤗 HF-Dataset | 📢 Blog | Contributors

📑 Table of Contents

Announcements
Introduction
Datasets
AI Agents
Multi-Agent Frameworks
System Diagram
Leaderboards
Docker Setup
Talks & Events
External Resources
Contributors

Announcements

2025-06-01: AssetOpsBench v1.0 released with 140+ industrial scenarios.
2025-09-01: CODS Competition launched. Access the AI Agentic Challenge: AssetOpsBench-Live.
Upcoming Events: Tutorial at AAAI 2026 – Agents for Industry 4.0 Applications.
Stay tuned for new tracks, competitions, and community events.

Introduction

AssetOpsBench is a unified framework for developing, orchestrating, and evaluating domain-specific AI agents in industrial asset operations and maintenance.

It provides:

4 domain-specific agents
2 multi-agent orchestration frameworks

Designed for maintenance engineers, reliability specialists, and facility planners, it allows reproducible evaluation of multi-step workflows in simulated industrial environments.

Datasets: 140+ Scenarios

AssetOpsBench scenarios span multiple domains:

Domain	Example Task
IoT	"List all sensors of Chiller 6 in MAIN site"
FMSR	"Identify failure modes detected by Chiller 6 Supply Temperature"
TSFM	"Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27"
WO	"Generate a work order for Chiller 6 anomaly detection"

Some tasks focus on a single domain; others are multi-step, end-to-end workflows.
Explore all scenarios here.
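
As a rough illustration of how scenarios could be organized by domain, the sketch below groups the four example tasks from the table. The `(domain, task)` tuple layout and the `group_by_domain` helper are assumptions made for this example, not the benchmark's actual scenario schema.

```python
from collections import defaultdict

# Example tasks copied from the table above.
# The (domain, task) tuple layout is an assumption, not the real scenario format.
SCENARIOS = [
    ("IoT", "List all sensors of Chiller 6 in MAIN site"),
    ("FMSR", "Identify failure modes detected by Chiller 6 Supply Temperature"),
    ("TSFM", "Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27"),
    ("WO", "Generate a work order for Chiller 6 anomaly detection"),
]


def group_by_domain(scenarios):
    """Collect task strings under their domain label, preserving input order."""
    grouped = defaultdict(list)
    for domain, task in scenarios:
        grouped[domain].append(task)
    return dict(grouped)


if __name__ == "__main__":
    for domain, tasks in group_by_domain(SCENARIOS).items():
        print(f"{domain}: {len(tasks)} example task(s)")
```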

AI Agents

Domain-Specific Agents

IoT Agent: get_sites, get_history, get_assets, get_sensors
FMSR Agent: get_sensors, get_failure_modes, get_failure_sensor_mapping
TSFM Agent: forecasting, timeseries_anomaly_detection
WO Agent: generate_work_order
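
The agent-to-tool mapping above can be written down as a small lookup table. This is a minimal sketch: the tool names come from the list above, while the `AGENT_TOOLS` dictionary and the `agents_exposing` helper are hypothetical names introduced here, not part of the AssetOpsBench code base.

```python
# Tool names are taken from the list above.
# The registry structure itself is an illustrative assumption.
AGENT_TOOLS = {
    "IoT": ["get_sites", "get_history", "get_assets", "get_sensors"],
    "FMSR": ["get_sensors", "get_failure_modes", "get_failure_sensor_mapping"],
    "TSFM": ["forecasting", "timeseries_anomaly_detection"],
    "WO": ["generate_work_order"],
}


def agents_exposing(tool_name):
    """Return the agents whose tool list contains tool_name, sorted by agent name."""
    return sorted(agent for agent, tools in AGENT_TOOLS.items() if tool_name in tools)


if __name__ == "__main__":
    # get_sensors is listed under both the IoT and FMSR agents.
    print(agents_exposing("get_sensors"))  # ['FMSR', 'IoT']
```

Note that `get_sensors` appears under two agents, so an orchestrator routing by tool name alone would need an extra rule to pick between them.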

Multi-Agent Frameworks

MetaAgent: ReAct-based single-agent-as-tool orchestration
AgentHive: plan-and-execute sequential workflow
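
To make the difference between the two orchestration styles concrete, the sketch below contrasts a plan-and-execute runner with a ReAct-style loop. It illustrates the general patterns only, under stated assumptions: the stub agents, the `Step` tuple, and the `choose_next` policy (which stands in for an LLM's reasoning step) are all hypothetical and do not reflect how MetaAgent or AgentHive are actually implemented.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for domain agents: each maps a text request to a text reply.
Agent = Callable[[str], str]
Step = Tuple[str, str]  # (agent name, request)


def make_stub(name: str) -> Agent:
    """Build a stub agent that simply echoes the request it was given."""
    return lambda request: f"{name} handled: {request}"


AGENTS: Dict[str, Agent] = {name: make_stub(name) for name in ("IoT", "FMSR", "TSFM", "WO")}


def plan_and_execute(plan: List[Step]) -> List[str]:
    """Plan-and-execute style: the full plan is fixed up front, then run in order."""
    return [AGENTS[agent](request) for agent, request in plan]


def react_loop(choose_next: Callable[[List[str]], Optional[Step]],
               max_steps: int = 5) -> List[str]:
    """ReAct style: choose one agent call at a time from the observations so far."""
    observations: List[str] = []
    for _ in range(max_steps):
        step = choose_next(observations)  # stands in for the LLM's reasoning step
        if step is None:                  # the policy signals that it is finished
            break
        agent, request = step
        observations.append(AGENTS[agent](request))
    return observations


PLAN: List[Step] = [
    ("IoT", "list sensors of Chiller 6"),
    ("WO", "generate a work order"),
]


def scripted_policy(observations: List[str]) -> Optional[Step]:
    """A fixed policy that replays PLAN one step per call, then stops."""
    return PLAN[len(observations)] if len(observations) < len(PLAN) else None


if __name__ == "__main__":
    print(plan_and_execute(PLAN))
    print(react_loop(scripted_policy))
```

With a scripted policy both runners produce the same outputs. The practical difference is that a ReAct-style policy can change its next call after seeing each observation, while a plan-and-execute runner commits to the sequence before any step runs.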

System Diagram

[Figure: visual overview of the AssetOpsBench workflow]

Leaderboards

Evaluated with 7 Large Language Models
Trajectories scored using LLM Judge (Llama-4-Maverick-17B)
Six evaluation criteria cover reasoning, execution, and data handling
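
One plausible way to turn per-trajectory judge verdicts into leaderboard numbers is to average each criterion over all trajectories. The sketch below assumes boolean verdicts and uses placeholder criterion names, since the six criteria and the actual aggregation method are not spelled out here.

```python
from statistics import mean

# Placeholder names: the six criteria are not listed in this README.
CRITERIA = [f"criterion_{i}" for i in range(1, 7)]


def summarize(judgments):
    """Average each criterion over trajectories.

    judgments is a list of dicts mapping every criterion name to a bool verdict.
    """
    return {c: mean(1.0 if j[c] else 0.0 for j in judgments) for c in CRITERIA}


if __name__ == "__main__":
    all_pass = {c: True for c in CRITERIA}
    one_fail = {**all_pass, "criterion_3": False}
    summary = summarize([all_pass, one_fail])
    print(summary["criterion_3"])  # 0.5
```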

[Figure: example MetaAgent leaderboard]

🐳 Run AssetOpsBench in Docker

Pre-built Docker Images: assetopsbench-basic (minimal) & assetopsbench-extra (full)
Conda environment: assetopsbench
Full setup guide

cd /path/to/AssetOpsBench
chmod +x benchmark/entrypoint.sh
docker-compose -f benchmark/docker-compose.yml build
docker-compose -f benchmark/docker-compose.yml up

Talks & Events

Workshops: Participate in GenAIBench-26 at AAAI 2025 focusing on multi-agent AI workflows.
Webinars & Seminars: Learn best practices for industrial task automation with AI agents.
Competitions: Benchmark your agents on real-world industrial scenarios using AssetOpsBench.

External Resources

📄 Paper: AssetOpsBench: Benchmarking AI Agents for Industrial Asset Operations
🤗 HuggingFace: Scenario & Model Hub
📢 Blog: Insights, Tutorials, and Updates
🎥 Recorded Talks: Link coming soon.

Contributors

Thanks go to these wonderful people ✨

DhavalRepo18 💻 📖	ShuxinLin 💻 📖	jtrayfield 💻 📖	nianjunz 💻 📖	ChathurangiShyalika 💻 📖	PUSHPAK-JAISWAL 💻 📖	bradleyjeck 💻 📖
florenzi002 💻 📖	kushwaha001 💻	Mohit Gupta 📖
