[Feature]: Expose ONNX Runtime SessionOptions (specifically enable_cpu_mem_arena) to address memory leaks #570

@flockoftravi

What feature would you like to request?

Description:
I'm experiencing a memory leak in my RAG application that uses fastembed (v0.7.3, not fastembed-gpu). Through memory profiling and troubleshooting, I've traced the issue to ONNX Runtime's memory management.

Problem:
The memory leak appears to be related to known issues in ONNX Runtime:

Proposed Solution:
Expose ONNX Runtime SessionOptions in fastembed, specifically enable_cpu_mem_arena. This parameter:

  • Is available in onnxruntime v1.20.0+
  • Defaults to True
  • Is currently not exposed in fastembed
  • Documentation is here.

Testing:
I created a patch that sets enable_cpu_mem_arena = False inside fastembed.common.onnx_model.OnnxModel._load_onnx_model. With the patch applied, the memory growth in my application is eliminated, or at least bounded.
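For anyone who wants to try a similar workaround without editing fastembed's source, here is a minimal sketch of one way to do it. It is not the exact patch described above. It assumes fastembed builds its options by calling `SessionOptions()` on the imported onnxruntime module at model-load time, which I have not verified for every version. The helper name `force_session_option_defaults` is hypothetical, and a stand-in object replaces onnxruntime so the snippet runs anywhere.

```python
import contextlib
from types import SimpleNamespace


@contextlib.contextmanager
def force_session_option_defaults(ort_module, **overrides):
    """Temporarily wrap ort_module.SessionOptions so new instances get `overrides`."""
    original_factory = ort_module.SessionOptions

    def patched_factory(*args, **kwargs):
        options = original_factory(*args, **kwargs)
        for name, value in overrides.items():
            setattr(options, name, value)
        return options

    ort_module.SessionOptions = patched_factory
    try:
        yield
    finally:
        # Always restore the original factory, even if model loading fails.
        ort_module.SessionOptions = original_factory


# Stand-in for the onnxruntime module (assumption: illustration only).
class FakeSessionOptions:
    def __init__(self):
        self.enable_cpu_mem_arena = True  # mirrors the default noted above


fake_ort = SimpleNamespace(SessionOptions=FakeSessionOptions)

with force_session_option_defaults(fake_ort, enable_cpu_mem_arena=False):
    patched = fake_ort.SessionOptions()   # created while the patch is active
restored = fake_ort.SessionOptions()      # created after the patch is removed

# Hypothetical usage with the real libraries (not executed here):
#   import onnxruntime as ort
#   from fastembed import TextEmbedding
#   with force_session_option_defaults(ort, enable_cpu_mem_arena=False):
#       model = TextEmbedding()
```

The model has to be constructed inside the `with` block for the override to apply. While the patch is active, `SessionOptions` is a plain function rather than a class, so any `isinstance` checks against it would fail. That is one more reason a supported option in fastembed would be preferable to patching.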

Request:
I'd like to contribute a PR that exposes this option (and potentially other useful ONNX Runtime session options) in fastembed. This would let users mitigate ONNX Runtime issues at the fastembed level while the root causes are addressed upstream. I'm happy to discuss the best way to implement this without bloating the API.
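To make the API discussion concrete, here is one possible shape, offered as a sketch and not a finished design: a single optional mapping of session option overrides, checked against a small allow-list. The names `ALLOWED_SESSION_OPTIONS` and `apply_session_options` are hypothetical and do not exist in fastembed. The four attribute names in the allow-list are existing `onnxruntime.SessionOptions` attributes. A stand-in object is used so the snippet runs without onnxruntime installed.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional

# Hypothetical allow-list; keeping it short limits the supported surface.
ALLOWED_SESSION_OPTIONS = frozenset(
    {
        "enable_cpu_mem_arena",
        "enable_mem_pattern",
        "intra_op_num_threads",
        "inter_op_num_threads",
    }
)


def apply_session_options(options: Any, overrides: Optional[Mapping[str, Any]] = None) -> Any:
    """Copy allow-listed overrides onto a SessionOptions-like object."""
    for name, value in (overrides or {}).items():
        if name not in ALLOWED_SESSION_OPTIONS:
            raise ValueError(f"Unsupported session option: {name!r}")
        setattr(options, name, value)
    return options


# Stand-in for onnxruntime.SessionOptions() (assumption: illustration only).
session_options = SimpleNamespace(enable_cpu_mem_arena=True, intra_op_num_threads=0)
apply_session_options(session_options, {"enable_cpu_mem_arena": False})
```

With this shape, the model constructors would gain only one new keyword argument, and unknown keys fail loudly instead of being silently ignored.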

Is there any additional information you would like to provide?

No response
