
Conversation

@GoHomeToMacDonal
Contributor

An implementation of ChatGLM2 based on vLLM. It adapts the PagedAttentionWithRoPE and ParallelLinear layers for model inference.

@simon-mo
Collaborator

simon-mo commented Nov 2, 2023

Thank you for the contribution. Unfortunately, this PR seems to have some merge conflicts, and ChatGLM3 also came out. Feel free to coordinate the contribution here if you have the bandwidth!

#1552

@GoHomeToMacDonal
Contributor Author

@simon-mo Hi, we have resolved the merge conflict, and the PR can now be merged directly into the main branch.

As ChatGLM3 does not change the model structure, this implementation can be applied to ChatGLM3 directly. Below is the testing code:

from vllm import LLM, SamplingParams

prompts = ["""<|system|>
You are ChatGLM3, a large language model trained by Zhipu.AI. Follow the user's instructions carefully. Respond using markdown.
<|user|>
Hello
<|assistant|>
"""]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="/home/skim/.cache/modelscope/hub/ZhipuAI/chatglm3-6b", trust_remote_code=True)

outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

The output will be: "Hello! How can I assist you today?"

@simon-mo simon-mo requested a review from zhuohan123 November 6, 2023 17:44
@simon-mo simon-mo mentioned this pull request Nov 6, 2023
Member

@zhuohan123 zhuohan123 left a comment

LGTM! Merged with main and fixed a small style issue. The code works with both ChatGLM2 and ChatGLM3 on one GPU in my case. Thank you for your contribution!

@zhuohan123 zhuohan123 merged commit 1a2bbc9 into vllm-project:main Nov 7, 2023
liuyhwangyh pushed a commit to liuyhwangyh/vllm that referenced this pull request Nov 8, 2023
add support modelscope mode

revert not affect file

Support Yi model (vllm-project#1567)

ChatGLM Support (vllm-project#1261)
xjpang pushed a commit to xjpang/vllm that referenced this pull request Nov 13, 2023
@Midnight-719

Hi, can you tell me how to use it? I still get this error: AttributeError: 'ChatGLMConfig' object has no attribute 'num_hidden_layers'. I have already updated to the latest version of vLLM.

@GoHomeToMacDonal
Contributor Author

This problem is caused by an old version of transformers. I suggest upgrading both your transformers package and the ChatGLM model to recent versions.

@Midnight-719

This problem is caused by an old version of transformers. I suggest upgrading both your transformers package and the ChatGLM model to recent versions.

Yes, I have tried; transformers==4.35.0

@GoHomeToMacDonal
Contributor Author

This problem is caused by an old version of transformers. I suggest upgrading both your transformers package and the ChatGLM model to recent versions.

Yes, I have tried; transformers==4.35.0

Please provide more information about your installed packages, and I will try to reproduce the problem later.

@Jeffwan
Contributor

Jeffwan commented Nov 15, 2023

@GoHomeToMacDonal If you use other prompts, the output shows a big difference from the native model. Did you try more examples?

This is one example; the generation seems to stop after hitting some token (see the attached screenshot).

@GoHomeToMacDonal
Contributor Author

@GoHomeToMacDonal If you use other prompts, the output shows a big difference from the native model. Did you try more examples?

This is one example; the generation seems to stop after hitting some token (see the attached screenshot).

I guess you used the default max_tokens=16 in the sampling parameters. I suggest setting max_tokens to a larger value, e.g., 1024. For more details, please refer to vllm/sampling_params.py.

In addition, since ChatGLM3 adds some special tokens, e.g., <|system|>, it is more stable to use ChatGLMTokenizer.build_chat_input to build the input token ids and feed them into vLLM.
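
For reference, a minimal sketch of that approach, assuming the same local model path as the testing code above and that llm.generate accepts pre-tokenized input via a prompt_token_ids argument (as in the vLLM version around this PR); build_chat_input is provided by the tokenizer shipped with the ChatGLM3 model repo:

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_path = "/home/skim/.cache/modelscope/hub/ZhipuAI/chatglm3-6b"

# build_chat_input inserts the <|system|>/<|user|>/<|assistant|> special tokens for us.
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer.build_chat_input("Hello", history=[], role="user")
token_ids = inputs["input_ids"][0].tolist()

llm = LLM(model=model_path, trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=1024)

# Feed the pre-built token ids directly instead of a prompt string.
outputs = llm.generate(prompt_token_ids=[token_ids], sampling_params=sampling_params)
print(outputs[0].outputs[0].text)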

@Jeffwan
Contributor

Jeffwan commented Nov 15, 2023

@GoHomeToMacDonal It was the max_tokens setting; adding max_tokens makes it work as expected. It's my first time using the OpenAI wrapper, thanks for the advice. BTW, I plan to use lm-sys/FastChat#2622 to build the conversation template; I did some tests and the result looks equivalent to what ChatGLMTokenizer.build_chat_input generates.
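
For anyone following the FastChat route, a quick sketch (the template name "chatglm3" is an assumption about how that PR registers it):

from fastchat.conversation import get_conv_template

conv = get_conv_template("chatglm3")
conv.append_message(conv.roles[0], "Hello")  # user turn
conv.append_message(conv.roles[1], None)     # empty assistant turn to be generated
prompt = conv.get_prompt()

# The resulting prompt string can then be passed to llm.generate as usual.
print(prompt)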

use ChatGLMTokenizer.build_chat_input to build the input token ids and feed them into vLLM

Just curious, does vLLM provide an interface that accepts token ids?

@wangruohui
Contributor

Hello, I am using ChatGLM2, but it seems the output is sometimes not aligned with the Hugging Face version. Could anyone help take a look at #1670?

@zengzikang

The latest version already supports GLM. Can GLM3 support official tool calls and other functions? Does it support the dialogue function?

@GoHomeToMacDonal
Contributor Author

The latest version already supports GLM. Can GLM3 support official tool calls and other functions? Does it support the dialogue function?

You need to implement the corresponding code for function calls and prompt building yourself. The vllm library focuses on model inference, i.e., it can replace ChatGLMForConditionalGeneration.generate with llm.generate. I suggest building prompts based on the official ChatGLM3 repository and replacing the model inference functions, e.g., ChatGLMForConditionalGeneration.chat and ChatGLMForConditionalGeneration.stream_chat, with vLLM.
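
As a rough illustration, a chat-style wrapper around llm.generate might look like the sketch below. The prompt format follows the ChatGLM3 special tokens shown earlier in this thread; the chat helper and its history handling are illustrative only, not part of vLLM or the ChatGLM repository.

from vllm import LLM, SamplingParams

llm = LLM(model="THUDM/chatglm3-6b", trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=1024)

def chat(query, history):
    # Rebuild the whole conversation as a ChatGLM3-style prompt string.
    prompt = ""
    for user_turn, assistant_turn in history:
        prompt += f"<|user|>\n{user_turn}\n<|assistant|>\n{assistant_turn}\n"
    prompt += f"<|user|>\n{query}\n<|assistant|>\n"
    response = llm.generate([prompt], sampling_params)[0].outputs[0].text
    history.append((query, response))
    return response, history

response, history = chat("Hello", [])
print(response)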

@zengzikang

The latest version already supports GLM. Can GLM3 support official tool calls and other functions? Does it support the dialogue function?

You need to implement the corresponding code for function calls and prompt building yourself. The vllm library focuses on model inference, i.e., it can replace ChatGLMForConditionalGeneration.generate with llm.generate. I suggest building prompts based on the official ChatGLM3 repository and replacing the model inference functions, e.g., ChatGLMForConditionalGeneration.chat and ChatGLMForConditionalGeneration.stream_chat, with vLLM.

Using vLLM to run inference on the GLM3 model, the speed is only about 13% faster. Is that normal?

@junior-zsy

@GoHomeToMacDonal This is still not supported for the chatglm2-6b-32k version. I have left a message in issue #1725.

@kerthcet kerthcet mentioned this pull request Dec 16, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
This was referenced Mar 6, 2024
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
Refactor the token-wise padding mechanism to a more elegant implementation, correcting the padding logic errors introduced by the previous multimodal commit vllm-project#736. This is a clean version of vllm-project#1259.

Signed-off-by: Yizhou Liu <[email protected]>

Labels

new-model Requests to new models


8 participants