This repository provides tools and scripts for evaluating the LoCoMo dataset using various models and APIs.
- Set the `PYTHONPATH` environment variable:

  ```bash
  export PYTHONPATH=../src
  cd evaluation
  ```
- Install the required dependencies:

  ```bash
  poetry install --extras all --with eval
  ```
- Copy the `.env-example` file to `.env`, and fill in the required environment variables according to your environment and API keys.
- Copy the `configs-example/` directory to a new directory named `configs/`, and modify the configuration files inside it as needed. This directory contains model- and API-specific settings.
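Before running any script, it can help to sanity-check that the required credentials actually reached the process environment. This is an illustrative sketch only: the variable names below (`OPENAI_API_KEY`, `OPENAI_BASE_URL`) are assumptions, and the authoritative list lives in `.env-example`.

```python
import os

def missing_env(required):
    """Return the names in `required` that are absent or empty in the environment."""
    return [name for name in required if not os.environ.get(name)]

# Hypothetical variable names -- consult .env-example for the authoritative list.
REQUIRED = ["OPENAI_API_KEY", "OPENAI_BASE_URL"]

for name in missing_env(REQUIRED):
    print(f"warning: {name} is not set")
```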
⚙️ To evaluate the LoCoMo dataset using one of the supported memory frameworks — memos, mem0, or zep — run the following script:
```bash
# Edit the configuration in ./scripts/run_locomo_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
./scripts/run_locomo_eval.sh
```

✍️ For evaluating OpenAI's native memory feature with the LoCoMo dataset, please refer to the detailed guide: OpenAI Memory on LoCoMo - Evaluation Guide.
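The script above produces per-question answers from the chosen memory backend. As an illustration of the kind of scoring involved in LoCoMo-style QA evaluation (not this repository's actual metric code), a token-overlap F1 can be sketched as:

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1, a common QA metric (illustrative, not the repo's exact scorer)."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    common = Counter(pred) & Counter(ref)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```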
To evaluate the LongMemEval benchmark, first prepare the `longmemeval_s` dataset from https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned and save it as `data/longmemeval/longmemeval_s.json`. Then run:
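A quick way to verify the file landed in the expected location is to load it and count instances. The snippet below writes a tiny stand-in file so it runs end to end; the real file comes from the HuggingFace dataset above, and the field names shown are assumptions, not the dataset's documented schema.

```python
import json
from pathlib import Path

path = Path("data/longmemeval/longmemeval_s.json")
path.parent.mkdir(parents=True, exist_ok=True)

# Stand-in content so this snippet is self-contained; replace with the real
# download from the HuggingFace dataset. Field names are assumptions.
path.write_text(json.dumps([{"question_id": "q1", "question": "...", "answer": "..."}]))

instances = json.loads(path.read_text())
print(f"{len(instances)} instances loaded from {path}")
```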
```bash
# Edit the configuration in ./scripts/run_lme_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
./scripts/run_lme_eval.sh
```

For the PersonaMem benchmark, get `questions_32k.csv` and `shared_contexts_32k.jsonl` from https://huggingface.co/datasets/bowen-upenn/PersonaMem and save them at `data/personamem/`. Then run:
```bash
./scripts/run_pm_eval.sh
```
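PersonaMem ships questions as CSV and the (potentially long) shared contexts as JSONL, with questions referencing contexts by id. A minimal sketch of pairing the two files follows; it writes tiny stand-in files so it runs on its own, and the column and field names are assumptions rather than the dataset's documented schema.

```python
import csv
import json
from pathlib import Path

base = Path("data/personamem")
base.mkdir(parents=True, exist_ok=True)

# Stand-in files so this snippet is self-contained; the real ones come from
# the PersonaMem dataset above. Column/field names here are assumptions.
(base / "questions_32k.csv").write_text(
    "question_id,shared_context_id,question\nq1,c1,What is the user's hobby?\n"
)
(base / "shared_contexts_32k.jsonl").write_text(
    json.dumps({"shared_context_id": "c1", "context": "..."}) + "\n"
)

# Index contexts by id, then attach each question to its shared context.
contexts = {}
with open(base / "shared_contexts_32k.jsonl") as f:
    for line in f:
        record = json.loads(line)
        contexts[record["shared_context_id"]] = record

with open(base / "questions_32k.csv", newline="") as f:
    questions = list(csv.DictReader(f))

paired = [(q, contexts[q["shared_context_id"]]) for q in questions]
print(len(paired), "question/context pairs")
```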