Korean Canonical Legal Benchmark (KCL)

This repository provides the official evaluation implementation for Korean Canonical Legal Benchmark.

🎉 Our paper has been accepted to the 19th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2026, main, short.)!

Why KCL?

KCL is designed to disentangle knowledge coverage from evidence-grounded reasoning.

KCL supports two complementary evaluation axes:

Knowledge Coverage: performance without extra context.
Evidence-Grounded Reasoning: performance with per-question supporting precedents provided in-context.

For essay questions, KCL further offers instance-level rubrics to enable LLM-as-a-Judge automated scoring.

For more information, please refer to our paper

Intended Uses

Separating knowledge vs. reasoning by comparing vanilla and with-precedent settings.
Legal RAG research using question-aligned gold precedents to establish retriever/reader upper bounds.
Fine-grained feedback via rubric-level diagnostics on essay outputs.

Components

KCL-Essay (open-ended generation)
- 169 questions, 550 supporting precedents, 2,739 instance-level rubrics.
KCL-MCQA (five-choice question answering)
- 283 questions, 1,103 supporting precedents.

Usage

Installation

git clone https://github.com/lbox-kr/kcl.git
cd kcl
uv sync

Prepare Environment Settings Using .env

# for logging
HYDRA_FULL_ERROR=1

# for vertex
GOOGLE_APPLICATION_CREDENTIALS=.vertex_credentials.json

# for openai
OPENAI_API_KEY="sk-..."

# for bedrock
AWS_ACCESS_KEY_ID="..."
AWS_SECRET_ACCESS_KEY="..."
AWS_SESSION_TOKEN="..."

KCL Essay

Inference

./scripts/infer/run_infer.sh \
./scripts/infer/configs/kcl_essay.yaml \
model_name=gemini-2.5-flash \
tasks_kwargs.with_precedents=True
# model_kwargs.thinking_budget=-1  # Optional: OAIModel defaults to "medium"

Evaluation

./scripts/eval/run_eval.sh \
./scripts/eval/configs/kcl_essay.yaml \
./outputs_infer/kcl_essay/gemini-2.5-flash/2025-10-15_10-04-43 \
n_jobs=8

Cost

Evaluation Cost Statistics (KCL-Essay)

169 questions – Gemini Flash token usage
Input: 6,896,400 tokens × $0.30 / 1M
Output: 780,173 tokens × $2.50 / 1M
Approximate cost per run: less than $5 -> Using caching, the cost can be reduced further.

KCL MCQA

Inference

./scripts/infer/run_infer.sh \
./scripts/infer/configs/kcl_mcqa.yaml \
model_name=gemini-2.5-flash
# model_kwargs.thinking_budget=-1  # Optional: OAIModel defaults to "medium"

Evaluation

./scripts/eval/run_eval.sh \
./scripts/eval/configs/kcl_mcqa.yaml \
./outputs_infer/kcl_mcqa/gemini-2.5-flash/2025-10-15_12-33-09 \
n_jobs=8

For Local Model

The evaluation code assumes a locally hosted internal model exposed via an OpenAI-compatible API.
The local model is configured using a YAML file, as shown below:

model_name: "google/gemma-3-27b-it"
model_kwargs:
  port: 8000

tasks: kcl_{essay|mcqa}
tasks_kwargs:
  with_precedents: False

n_jobs: 8
verbose: False

hydra:
  run:
    dir: outputs_infer/${tasks}/${model_name}/${now:%Y-%m-%d_%H-%M-%S}

Save this configuration file as:
scripts/infer/configs/kcl_{mcqa|essay}_local.yaml

Then, run the inference using the same command as follows:

./scripts/infer/run_infer.sh \
./scripts/infer/configs/kcl_{mcqa|essay}_local.yaml

Note: The evaluation script allows model directory names with suffixes (e.g., gemma-3-27b-it_no_reasoning). The directory name only needs to start with the base model name (the part after the last / in model_name).

Citation

@inproceedings{
    oh2026korean,
    title={Korean Canonical Legal Benchmark: Toward Knowledge-Independent Evaluation of {LLM}s' Legal Reasoning Capabilities},
    author={Hongseok Oh and Wonseok Hwang and Kyoung-Woon On},
    booktitle={19th Conference of the European Chapter of the Association for Computational Linguistics},
    year={2026},
    url={https://openreview.net/forum?id=Dw0sFP4l5s}
}

License

Our evaluation code and dataset are licensed under the CC BY-NC 4.0 license.

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
scripts		scripts
src/kcl		src/kcl
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
LICENSE.md		LICENSE.md
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Korean Canonical Legal Benchmark (KCL)

Why KCL?

Intended Uses

Components

Usage

Installation

Prepare Environment Settings Using .env

KCL Essay

Inference

Evaluation

Cost

KCL MCQA

Inference

Evaluation

For Local Model

Citation

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Korean Canonical Legal Benchmark (KCL)

Why KCL?

Intended Uses

Components

Usage

Installation

Prepare Environment Settings Using .env

KCL Essay

Inference

Evaluation

Cost

KCL MCQA

Inference

Evaluation

For Local Model

Citation

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages