Mock LLM Server


A FastAPI-based mock LLM server that mimics OpenAI and Anthropic API formats. Instead of calling actual language models, it uses predefined responses from a YAML configuration file.

It's designed for testing and development scenarios where you need deterministic responses instead of calls to a real model.

Check out the CodeGate project when you're done here!

Features

  • OpenAI and Anthropic compatible API endpoints
  • Streaming support (responses are streamed character by character)
  • Configurable responses via YAML file
  • Hot-reloading of response configurations
  • Mock token counting

Installation

From PyPI

pip install mockllm

From Source

  1. Clone the repository:
git clone https://github.com/stacklok/mockllm.git
cd mockllm
  2. Install Poetry (if not already installed):
curl -sSL https://install.python-poetry.org | python3 -
  3. Install dependencies:
poetry install  # install with all dependencies
# or
poetry install --without dev  # install without development dependencies

Usage

  1. Set up responses.yml:
cp example.responses.yml responses.yml
  2. Start the server:
poetry run python -m mockllm

Or using uvicorn directly:

poetry run uvicorn mockllm.server:app --reload

The server will start on http://localhost:8000.

  3. Send requests to the API endpoints:

OpenAI Format

Regular request:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mock-llm",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ]
  }'

Streaming request:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mock-llm",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ],
    "stream": true
  }'
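
The same endpoints can be exercised from Python. Below is a minimal sketch using the official openai client (v1+), assuming the mock server ignores authentication, so the API key is just a placeholder:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

# Regular request
response = client.chat.completions.create(
    model="mock-llm",
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(response.choices[0].message.content)

# Streaming request
stream = client.chat.completions.create(
    model="mock-llm",
    messages=[{"role": "user", "content": "what colour is the sky?"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)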

Anthropic Format

Regular request:

curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ]
  }'

Streaming request:

curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ],
    "stream": true
  }'
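
The Anthropic endpoint can be driven the same way with the official anthropic client. A minimal sketch, assuming the mock ignores the API key and accepts the standard max_tokens field:

from anthropic import Anthropic

client = Anthropic(base_url="http://localhost:8000", api_key="dummy")

# Regular request
response = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=100,
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(response.content[0].text)

# Streaming request
with client.messages.stream(
    model="claude-3-sonnet-20240229",
    max_tokens=100,
    messages=[{"role": "user", "content": "what colour is the sky?"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)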

Configuration

Response Configuration

Responses are configured in responses.yml. The file has three main sections:

  1. responses: Maps input prompts to predefined responses
  2. defaults: Contains default configurations like the unknown response message
  3. settings: Contains server behavior settings like network lag simulation

Example responses.yml:

responses:
  "what colour is the sky?": "The sky is blue during a clear day due to a phenomenon called Rayleigh scattering."
  "what is 2+2?": "2+2 equals 9."

defaults:
  unknown_response: "I don't know the answer to that. This is a mock response."

settings:
  lag_enabled: true
  lag_factor: 10  # Higher values = faster responses (10 = fast, 1 = slow)

Network Lag Simulation

The server can simulate network latency for more realistic testing scenarios. This is controlled by two settings:

  • lag_enabled: When true, enables artificial network lag
  • lag_factor: Controls the speed of responses
    • Higher values (e.g., 10) result in faster responses
    • Lower values (e.g., 1) result in slower responses
    • Affects both streaming and non-streaming responses

For streaming responses, the lag is applied per-character with slight random variations to simulate realistic network conditions.
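
As an illustration of the idea only (not the project's actual code), a per-character delay derived from lag_factor with random jitter could look roughly like this:

import asyncio
import random

async def stream_with_lag(text: str, lag_factor: int = 10):
    # Higher lag_factor means a shorter base delay, i.e. faster responses.
    base_delay = 1.0 / lag_factor
    for char in text:
        # Apply slight random variation around the base delay.
        await asyncio.sleep(base_delay * random.uniform(0.5, 1.5))
        yield char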

Hot Reloading

The server automatically detects changes to responses.yml and reloads the configuration without restarting the server.
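
As a sketch of how such reloading can work in general (not necessarily mockllm's actual mechanism), one simple approach is to check the file's modification time and re-read the YAML when it changes:

import os

import yaml

class ResponseConfig:
    def __init__(self, path: str = "responses.yml"):
        self.path = path
        self._mtime = 0.0
        self._data = {}

    def get(self) -> dict:
        # Re-read the file only when its modification time changes.
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:
            with open(self.path) as f:
                self._data = yaml.safe_load(f)
            self._mtime = mtime
        return self._data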

Testing

To run the tests:

poetry run pytest
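
Since mockllm.server exposes a regular FastAPI app (see the uvicorn command above), you can also write your own tests against it with FastAPI's TestClient. A minimal sketch, assuming the example responses.yml is in place:

from fastapi.testclient import TestClient

from mockllm.server import app

client = TestClient(app)

def test_known_prompt_returns_configured_response():
    resp = client.post(
        "/v1/chat/completions",
        json={
            "model": "mock-llm",
            "messages": [{"role": "user", "content": "what colour is the sky?"}],
        },
    )
    assert resp.status_code == 200
    assert "Rayleigh scattering" in resp.json()["choices"][0]["message"]["content"]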

Contributing

Contributions are welcome! Please open an issue or submit a PR.

License

This project is licensed under the Apache 2.0 License.
