LLM API Load Testing

Simple scripts for stress testing LLM API endpoints by measuring throughput under varying concurrent loads.

This was made for testing my Apple Intelligence Web API, which conforms to the same standards as OpenRouter and OpenAI.

Overview

Major files:

./src/main.py: Measure response times and throughput under varying concurrent loads
./src/plotting.py: Run data analysis and visualization
./results/data.csv: Data from experiments
./results/throughput_analysis.pdf: Data visualized with curve fit

Usage

0. Requirements

This project uses uv for dependency management. Dependencies will be automatically installed if you run the scripts with uv run.

Alternatively, install manually with pip:

pip install aiohttp numpy scipy matplotlib

1. Generate test data

Run the load test against your API endpoint:

uv run ./src/main.py > ./results/data.csv &

This will send batches of concurrent requests with varying load levels and output CSV data.
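As a rough illustration of the batching idea (not the actual contents of main.py), the sketch below sweeps over a few concurrency levels and writes one CSV row per batch. The `fake_request` coroutine, the column names, and the concurrency levels are all hypothetical stand-ins; a real run would replace `fake_request` with an HTTP call to your endpoint.

```python
import asyncio
import csv
import io
import time


async def fake_request(delay_s: float = 0.01) -> str:
    """Hypothetical stand-in for one API call; returns a fake 40-character reply."""
    await asyncio.sleep(delay_s)
    return "x" * 40


async def run_batch(concurrency: int) -> tuple[float, int]:
    """Fire `concurrency` requests at once; return (elapsed seconds, total characters)."""
    start = time.perf_counter()
    replies = await asyncio.gather(*(fake_request() for _ in range(concurrency)))
    elapsed = time.perf_counter() - start
    return elapsed, sum(len(reply) for reply in replies)


async def sweep(levels: list[int], out) -> None:
    """Run one batch per concurrency level and write a CSV row for each."""
    writer = csv.writer(out)
    writer.writerow(["concurrency", "elapsed_s", "total_chars"])  # hypothetical columns
    for level in levels:
        elapsed, chars = await run_batch(level)
        writer.writerow([level, f"{elapsed:.4f}", chars])


# Collect the CSV in memory here; a script would more likely write to sys.stdout.
buffer = io.StringIO()
asyncio.run(sweep([1, 2, 4], buffer))
print(buffer.getvalue(), end="")
```

Passing `sys.stdout` instead of the in-memory buffer would let the output be redirected to a file, as in the command above.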

2. Analyze results

Visualize the throughput degradation:

uv run ./src/plotting.py

This generates a plot showing:

Raw throughput measurements
Binned means with error bars
Fitted exponential decay curve
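The binning and curve-fitting steps can be sketched as below. This is only an approximation of what plotting.py may do: it assumes a decay of the form y = a * exp(-b * x) with no offset, uses the standard error of the mean for the error bars, and swaps in a log-linear least-squares fit written with the standard library in place of a nonlinear fit. The function names and `bin_width` parameter are hypothetical.

```python
import math
from statistics import mean, stdev


def binned_means(xs, ys, bin_width):
    """Group points into fixed-width bins of x; return (centre, mean, std. error) per bin."""
    bins = {}
    for x, y in zip(xs, ys):
        bins.setdefault(int(x // bin_width), []).append(y)
    summary = []
    for index in sorted(bins):
        values = bins[index]
        centre = (index + 0.5) * bin_width
        # Standard error of the mean; zero when a bin holds a single point.
        error = stdev(values) / math.sqrt(len(values)) if len(values) > 1 else 0.0
        summary.append((centre, mean(values), error))
    return summary


def fit_exponential_decay(xs, ys):
    """Fit y = a * exp(-b * x) by least squares on log(y); requires every y > 0."""
    logs = [math.log(y) for y in ys]
    x_bar, log_bar = mean(xs), mean(logs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (l - log_bar) for x, l in zip(xs, logs))
    slope = sxy / sxx
    intercept = log_bar - slope * x_bar
    return math.exp(intercept), -slope  # (a, b)
```

Fitting in log space weights small throughput values more heavily than a direct nonlinear fit would, so the two approaches can give somewhat different parameters on noisy data.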

Currently, token counts for the throughput calculation are estimated by dividing the number of characters in each response by 4, a rough average number of characters per token.
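A minimal sketch of this estimate, assuming the common heuristic of roughly 4 characters per token and assuming throughput is reported in tokens per second. The function names are hypothetical and not taken from the scripts.

```python
CHARS_PER_TOKEN = 4  # rough average; real tokenizers vary by model and language


def estimate_tokens(text: str) -> float:
    """Approximate the token count of a response from its character count."""
    return len(text) / CHARS_PER_TOKEN


def throughput_tokens_per_s(text: str, elapsed_s: float) -> float:
    """Estimated tokens per second for one response (assumed definition of throughput)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return estimate_tokens(text) / elapsed_s
```

For example, a 400-character response received in 2 seconds would be counted as about 100 tokens, or 50 tokens per second.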

Contribution

Though I'm done with the project, I would welcome any PRs!

Here are some things that I think need changing:

Rather than running experiments by sending a set number of requests in batches, set a 'request rate' at which new requests will be sent to the API.
Use streaming completions to separate out latency and throughput for each request.
Count tokens using, e.g., tiktoken, or using the response.usage.completion_tokens field if the API supports it. Or, if streaming, can we assume that each chunk is one token?
Use a combination of argparse and configuration files (where appropriate) to specify run configuration, rather than having that information stored in the code.
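For anyone picking up the first suggestion, here is one possible shape for a fixed request rate, offered as a sketch rather than a design decision. Requests are started on a schedule regardless of whether earlier ones have finished. The `fake_request` coroutine and the `run_at_rate` name are hypothetical; a real version would send an HTTP request instead of sleeping.

```python
import asyncio
import time


async def fake_request(index: int) -> float:
    """Hypothetical stand-in for one API call; returns its own latency in seconds."""
    start = time.perf_counter()
    await asyncio.sleep(0.01)
    return time.perf_counter() - start


async def run_at_rate(rate_per_s: float, total: int, send=fake_request) -> list[float]:
    """Start `total` requests at `rate_per_s` requests per second; return their latencies."""
    if rate_per_s <= 0:
        raise ValueError("rate_per_s must be positive")
    interval = 1.0 / rate_per_s
    tasks = []
    schedule_start = time.perf_counter()
    for i in range(total):
        # Sleep until this request's absolute start time so small delays do not accumulate.
        delay = schedule_start + i * interval - time.perf_counter()
        if delay > 0:
            await asyncio.sleep(delay)
        tasks.append(asyncio.create_task(send(i)))
    return await asyncio.gather(*tasks)


latencies = asyncio.run(run_at_rate(rate_per_s=200.0, total=5))
print(f"{len(latencies)} requests, mean latency {sum(latencies) / len(latencies):.4f}s")
```

Because new requests do not wait for old ones, the number in flight grows when the API falls behind, which is the behaviour a fixed-batch experiment cannot show.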
