
[Bug]: CRITICAL 11-05 12:03:03 launcher.py:99] MQLLMEngine is already dead, terminating server process #10024

@LIUKAI0815

Description


Your current environment

vllm 0.6.3.post1

Model Input Dumps

export CUDA_VISIBLE_DEVICES=2
export VLLM_USE_MODELSCOPE=False
vllm serve ./Qwen2_5-14B-Instruct-AWQ \
    --host 0.0.0.0 \
    --port 2015 \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.9 \
    --trust-remote-code \
    --enforce-eager \
    --enable-lora \
    --lora-modules role=/workspace/output/role/qwen/qwen2_5-14b-instruct-awq/v1-20241101-133149/checkpoint-1550
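For reference, a request that targets the LoRA adapter registered above would select it by the name given in --lora-modules ("role"), sent to the OpenAI-compatible endpoint the command exposes. This is only a sketch: the prompt text is a placeholder, and the URL simply reuses the --host/--port values from the command.

```python
# Sketch of a chat-completions request against the server started above.
# The LoRA module name from --lora-modules ("role") is used as the model name.
import json

url = "http://0.0.0.0:2015/v1/chat/completions"
payload = {
    "model": "role",  # LoRA adapter name registered via --lora-modules
    "messages": [{"role": "user", "content": "Hello"}],  # placeholder prompt
}
body = json.dumps(payload)
print(body)

# To actually send it (requires the server to be up), e.g.:
#   curl http://0.0.0.0:2015/v1/chat/completions \
#       -H "Content-Type: application/json" -d "$body"
```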

🐛 Describe the bug

INFO 11-05 12:03:00 engine.py:290] Added request chat-58cc8fe807d34717b775ea663d913bcb.
ERROR 11-05 12:03:00 client.py:250] RuntimeError('Engine loop has died')
ERROR 11-05 12:03:00 client.py:250] Traceback (most recent call last):
ERROR 11-05 12:03:00 client.py:250] File "/root/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/multiprocessing/client.py", line 150, in run_heartbeat_loop
ERROR 11-05 12:03:00 client.py:250] await self._check_success(
ERROR 11-05 12:03:00 client.py:250] File "/root/miniconda3/envs/vllm/lib/python3.10/site-packages/vllm/engine/multiprocessing/client.py", line 314, in _check_success
ERROR 11-05 12:03:00 client.py:250] raise response
ERROR 11-05 12:03:00 client.py:250] RuntimeError: Engine loop has died
INFO 11-05 12:03:01 metrics.py:349] Avg prompt throughput: 6.5 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 1 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 5.0%, CPU KV cache usage: 0.0%.
INFO: 10.12.17.5:58280 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO: 10.12.17.5:58489 - "GET /v1/models HTTP/1.1" 200 OK
CRITICAL 11-05 12:03:03 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO: 10.12.17.5:58489 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
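The traceback above comes from a client-side heartbeat loop (run_heartbeat_loop) that polls the engine process and raises once it stops responding, which then cascades into the launcher terminating the server. The following is an illustrative asyncio sketch of that watchdog pattern, not vLLM's actual implementation; all class and method names here are hypothetical.

```python
# Hypothetical sketch of a heartbeat watchdog: a background loop that
# periodically checks engine health and raises when the engine has died.
import asyncio


class EngineDeadError(RuntimeError):
    pass


class HeartbeatClient:
    def __init__(self):
        self.engine_alive = True  # flipped to False when the engine crashes

    async def check_health(self):
        # In vLLM this step would query the engine process over IPC.
        if not self.engine_alive:
            raise EngineDeadError("Engine loop has died")

    async def run_heartbeat_loop(self, interval=0.01):
        # Poll health forever; the raised error propagates to whoever awaits us.
        while True:
            await self.check_health()
            await asyncio.sleep(interval)


async def main():
    client = HeartbeatClient()
    task = asyncio.create_task(client.run_heartbeat_loop())
    await asyncio.sleep(0.03)
    client.engine_alive = False  # simulate the engine process dying
    try:
        await task
    except EngineDeadError as e:
        return str(e)


result = asyncio.run(main())
print(result)  # Engine loop has died
```

In the real server, catching this error is what triggers the "MQLLMEngine is already dead, terminating server process" shutdown seen in the log.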

vllm 0.6.3.post1

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot in the bottom-right corner of the documentation page, which can answer many frequently asked questions.

Metadata

Labels: bug (Something isn't working), stale (Over 90 days of inactivity)