This repository provides tools and scripts for evaluating the LoCoMo dataset using various models and APIs.
- Set the `PYTHONPATH` environment variable:

  ```bash
  export PYTHONPATH=../src
  cd evaluation
  ```
- Install the required dependencies:

  ```bash
  poetry install --extras all --with eval
  ```
- Copy the `.env-example` file to `.env`, and fill in the required environment variables according to your environment and API keys.
- Copy the `configs-example/` directory to a new directory named `configs/`, and modify the configuration files inside it as needed. This directory contains model- and API-specific settings.
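Before running any script, it can help to sanity-check that the required credentials actually reached the process environment. This is an illustrative sketch only: the variable names below (`OPENAI_API_KEY`, `OPENAI_BASE_URL`) are assumptions, and the authoritative list lives in `.env-example`.

```python
import os

def missing_env(required):
    """Return the names in `required` that are absent or empty in the environment."""
    return [name for name in required if not os.environ.get(name)]

# Hypothetical variable names -- consult .env-example for the authoritative list.
REQUIRED = ["OPENAI_API_KEY", "OPENAI_BASE_URL"]

for name in missing_env(REQUIRED):
    print(f"warning: {name} is not set")
```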
⚙️ To evaluate the LoCoMo dataset using one of the supported memory frameworks — memos, mem0, or zep — run the following script:
```bash
# Edit the configuration in ./scripts/run_locomo_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
./scripts/run_locomo_eval.sh
```

✍️ For evaluating OpenAI's native memory feature with the LoCoMo dataset, please refer to the detailed guide: OpenAI Memory on LoCoMo - Evaluation Guide.
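The script above produces per-question answers from the chosen memory backend. As an illustration of the kind of scoring involved in LoCoMo-style QA evaluation (not this repository's actual metric code), a token-overlap F1 can be sketched as:

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1, a common QA metric (illustrative, not the repo's exact scorer)."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    common = Counter(pred) & Counter(ref)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```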
To evaluate the LongMemEval benchmark, first prepare the `longmemeval_s` dataset from https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned and save it as `data/longmemeval/longmemeval_s.json`. Then run:
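A quick way to verify the file landed in the expected location is to load it and count instances. The snippet below writes a tiny stand-in file so it runs end to end; the real file comes from the HuggingFace dataset above, and the field names shown are assumptions, not the dataset's documented schema.

```python
import json
from pathlib import Path

path = Path("data/longmemeval/longmemeval_s.json")
path.parent.mkdir(parents=True, exist_ok=True)

# Stand-in content so this snippet is self-contained; replace with the real
# download from the HuggingFace dataset. Field names are assumptions.
path.write_text(json.dumps([{"question_id": "q1", "question": "...", "answer": "..."}]))

instances = json.loads(path.read_text())
print(f"{len(instances)} instances loaded from {path}")
```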
```bash
# Edit the configuration in ./scripts/run_lme_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
./scripts/run_lme_eval.sh
```

For the PersonaMem benchmark, get `questions_32k.csv` and `shared_contexts_32k.jsonl` from https://huggingface.co/datasets/bowen-upenn/PersonaMem and save them at `data/personamem/`. Then run:
```bash
./scripts/run_pm_eval.sh
```
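PersonaMem ships questions as CSV and the (potentially long) shared contexts as JSONL, with questions referencing contexts by id. A minimal sketch of pairing the two files follows; it writes tiny stand-in files so it runs on its own, and the column and field names are assumptions rather than the dataset's documented schema.

```python
import csv
import json
from pathlib import Path

base = Path("data/personamem")
base.mkdir(parents=True, exist_ok=True)

# Stand-in files so this snippet is self-contained; the real ones come from
# the PersonaMem dataset above. Column/field names here are assumptions.
(base / "questions_32k.csv").write_text(
    "question_id,shared_context_id,question\nq1,c1,What is the user's hobby?\n"
)
(base / "shared_contexts_32k.jsonl").write_text(
    json.dumps({"shared_context_id": "c1", "context": "..."}) + "\n"
)

# Index contexts by id, then attach each question to its shared context.
contexts = {}
with open(base / "shared_contexts_32k.jsonl") as f:
    for line in f:
        record = json.loads(line)
        contexts[record["shared_context_id"]] = record

with open(base / "questions_32k.csv", newline="") as f:
    questions = list(csv.DictReader(f))

paired = [(q, contexts[q["shared_context_id"]]) for q in questions]
print(len(paired), "question/context pairs")
```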