This repository was archived by the owner on Mar 17, 2025. It is now read-only.
Merged
Changes from 1 commit
Commits (146)
2264580
Remove hardcode flash-attn disable setting (#2342)
Trangle Sep 1, 2023
24a8755
Document turning off proxy_buffering when api is streaming (#2337)
nathanstitt Sep 1, 2023
b039a66
Simplify huggingface api example (#2355)
merrymercy Sep 4, 2023
ea045e6
Update sponsor logos (#2367)
merrymercy Sep 5, 2023
85bec47
if LOGDIR is empty, then don't try output log to local file (#2357)
leiwen83 Sep 5, 2023
f99663c
add best_of and use_beam_search for completions interface (#2348)
leiwen83 Sep 6, 2023
3cf04c2
Extract upvote/downvote from log files (#2369)
merrymercy Sep 6, 2023
94f4dd6
Revert "add best_of and use_beam_search for completions interface" (#…
merrymercy Sep 6, 2023
dc3dd12
Improve doc (#2371)
merrymercy Sep 6, 2023
a5e6abf
add best_of and use_beam_search for completions interface (#2372)
leiwen83 Sep 7, 2023
1d703b2
update monkey patch for llama2 (#2379)
merrymercy Sep 7, 2023
56744d1
Make E5 adapter more restrict to reduce mismatch (#2381)
merrymercy Sep 7, 2023
6af0a7c
Update UI and sponsers (#2387)
merrymercy Sep 8, 2023
9b3147e
Use fsdp api for save save (#2390)
merrymercy Sep 10, 2023
a6167db
Release v0.2.27
merrymercy Sep 10, 2023
7dcdafe
Spicyboros + airoboros 2.2 template update. (#2392)
jondurbin Sep 11, 2023
b921f16
bugfix of openai_api_server for fastchat.serve.vllm_worker (#2398)
Rayrtfr Sep 11, 2023
13f40b3
Revert "bugfix of openai_api_server for fastchat.serve.vllm_worker" (…
merrymercy Sep 11, 2023
77aa4df
Revert "add best_of and use_beam_search for completions interface" (#…
merrymercy Sep 11, 2023
11b05bb
Release a v0.2.28 with bug fixes and more test cases
merrymercy Sep 11, 2023
a8088ba
Fix model_worker error (#2404)
wangxiyuan Sep 12, 2023
b49d789
Added google/flan models and fixed AutoModelForSeq2SeqLM when loading…
wangzhen263 Sep 12, 2023
7dfcf1a
Rename twitter to X (#2406)
karshPrime Sep 12, 2023
aa153d5
Update huggingface_api.py (#2409)
merrymercy Sep 12, 2023
3149253
Add support for baichuan2 models (#2408)
Sep 13, 2023
2e0e60b
Fixed character overlap issue when api streaming output (#2431)
Somezak1 Sep 18, 2023
c7e3e67
Support custom conversation template in multi_model_worker (#2434)
hi-jin Sep 18, 2023
c685951
Add Ascend NPU support (#2422)
zhangsibo1129 Sep 18, 2023
54a8353
Add raw conversation template (#2417) (#2418)
tobiabir Sep 18, 2023
1119c51
Improve docs & UI (#2436)
merrymercy Sep 18, 2023
658736f
Fix Salesforce xgen inference (#2350)
jaywonchung Sep 18, 2023
d26d9e7
Add support for Phind-CodeLlama models (#2415) (#2416)
tobiabir Sep 18, 2023
0a5f503
Add falcon 180B chat conversation template (#2384)
Btlmd Sep 18, 2023
318d070
Improve docs (#2438)
merrymercy Sep 18, 2023
9cf3c8b
add dtype and seed (#2430)
Ying1123 Sep 18, 2023
24acac1
Data cleaning scripts for dataset release (#2440)
merrymercy Sep 18, 2023
30a6ffc
merge google/flan based adapters: T5Adapter, CodeT5pAdapter, FlanAdap…
wangzhen263 Sep 18, 2023
16be5cf
Fix docs
merrymercy Sep 18, 2023
e4758da
Update UI (#2446)
merrymercy Sep 18, 2023
68f1fac
Add Optional SSL Support to controller.py (#2448)
brandonbiggs Sep 19, 2023
db8e271
Format & Improve docs
merrymercy Sep 19, 2023
c4c195c
Release v0.2.29 (#2450)
merrymercy Sep 20, 2023
a040cdc
Show terms of use as an JS alert (#2461)
merrymercy Sep 22, 2023
bcb8076
vllm worker awq quantization update (#2463)
dongxiaolong Sep 22, 2023
2855bf9
Fix falcon chat template (#2464)
merrymercy Sep 22, 2023
f8f302f
Fix chunk handling when partial chunks are returned (#2485)
siddartha-RE Sep 29, 2023
15a094e
Update openai_api_server.py to add an SSL option (#2484)
brandonbiggs Sep 29, 2023
7aace7d
Update vllm_worker.py (#2482)
shuishu Sep 29, 2023
faca3a3
fix typo quantization (#2469)
asaiacai Sep 29, 2023
8e8a604
fix vllm quanziation args
merrymercy Sep 29, 2023
77b3df1
Update README.md (#2492)
merrymercy Sep 29, 2023
f5c90f6
Huggingface api worker (#2456)
hnyls2002 Sep 29, 2023
f70de6b
Update links to lmsys-chat-1m (#2497)
merrymercy Sep 30, 2023
c478bbf
Update train code to support the new tokenizer (#2498)
Ying1123 Sep 30, 2023
bc22411
Third Party UI Example (#2499)
enochlev Sep 30, 2023
6b4fc64
Add metharme (pygmalion) conversation template (#2500)
AlpinDale Oct 1, 2023
46e5207
Optimize for proper flash attn causal handling (#2503)
siddartha-RE Oct 2, 2023
f5eee7d
Add Mistral AI instruction template (#2483)
lerela Oct 2, 2023
759dfbe
Update monitor & plots (#2506)
merrymercy Oct 2, 2023
f9fcc9d
Release v0.2.30 (#2507)
merrymercy Oct 2, 2023
e64ee0e
Fix for single turn dataset (#2509)
toslunar Oct 3, 2023
c3ad73a
replace os.getenv with os.path.expanduser because the first one doesn…
khalil-Hennara Oct 4, 2023
5573aae
Fix arena (#2522)
merrymercy Oct 6, 2023
dad34ea
Update Dockerfile (#2524)
Oct 9, 2023
9d27d68
add Llama2ChangAdapter (#2510)
lcw99 Oct 9, 2023
466da28
Add ExllamaV2 Inference Framework Support. (#2455)
leonxia1018 Oct 9, 2023
5dbc4f3
Improve docs (#2534)
merrymercy Oct 9, 2023
e448a0f
Fix warnings for new gradio versions (#2538)
merrymercy Oct 10, 2023
125f374
revert the gradio change; now works for 3.40
merrymercy Oct 10, 2023
0c37d98
Improve chat templates (#2539)
merrymercy Oct 10, 2023
cd7d048
Add Zephyr 7B Alpha (#2535)
lewtun Oct 11, 2023
f5a4911
Improve Support for Mistral-Instruct (#2547)
Steve-Tech Oct 12, 2023
f683fd1
correct max_tokens by context_length instead of raise exception (#2544)
liunux4odoo Oct 12, 2023
7b0ca39
Revert "Improve Support for Mistral-Instruct" (#2552)
merrymercy Oct 12, 2023
9f7afed
Fix Mistral template (#2529)
normster Oct 12, 2023
f19d449
Add additional Informations from the vllm worker (#2550)
SebastianBodza Oct 12, 2023
631d62f
Make FastChat work with LMSYS-Chat-1M Code (#2551)
CodingWithTim Oct 12, 2023
7ebc29c
Create `tags` attribute to fix `MarkupError` in rich CLI (#2553)
Steve-Tech Oct 13, 2023
8531cf6
move BaseModelWorker outside serve.model_worker to make it independen…
liunux4odoo Oct 13, 2023
ff3cb92
Misc style and bug fixes (#2559)
merrymercy Oct 13, 2023
e1a1f50
Fix README.md (#2561)
infwinston Oct 14, 2023
9db2143
release v0.2.31 (#2563)
merrymercy Oct 14, 2023
cb71875
resolves #2542 modify dockerfile to upgrade cuda to 12.2.0 and pydant…
alexdelapaz Oct 15, 2023
ee0d4d2
Add airoboros_v3 chat template (llama-2 format) (#2564)
jondurbin Oct 15, 2023
06092dd
Add Xwin-LM V0.1, V0.2 support (#2566)
REIGN12 Oct 15, 2023
ff66426
Fixed model_worker generate_gate may blocked main thread (#2540) (#2…
lvxuan263 Oct 16, 2023
7fbf5b1
feat: add claude-v2 (#2571)
congchan Oct 17, 2023
29de51f
Update vigogne template (#2580)
bofenghuang Oct 18, 2023
f79151b
Fix issue #2568: --device mps led to TypeError: forward() got an unex…
Phil-U-U Oct 18, 2023
f06b202
Add Mistral-7B-OpenOrca conversation_temmplate (#2585)
waynespa Oct 20, 2023
8e90d5c
docs: bit misspell comments model adapter default template name conve…
guspan-tanadi Oct 21, 2023
6a149bb
Update Mistral template (#2581)
Gk-rohan Oct 21, 2023
f752996
Fix <s> in mistral template
merrymercy Oct 21, 2023
d61d43e
Update README.md (vicuna-v1.3 -> vicuna-1.5) (#2592)
infwinston Oct 21, 2023
582f48b
Update README.md to highlight chatbot arena (#2596)
infwinston Oct 24, 2023
220257a
Add Lemur model (#2584)
ugolotti Oct 24, 2023
ab169f6
add trust_remote_code=True in BaseModelAdapter (#2583)
edisonwd Oct 24, 2023
cbf2853
Openai interface add use beam search and best of 2 (#2442)
leiwen83 Oct 24, 2023
09e4357
Update qwen and add pygmalion (#2607)
Trangle Oct 28, 2023
7a31d3b
feat: Support model AquilaChat2 (#2616)
fangyinc Nov 1, 2023
d5e4b27
Added settings vllm (#2599)
SebastianBodza Nov 1, 2023
af4dfe3
[Logprobs] Support logprobs=1 (#2612)
comaniac Nov 1, 2023
dd84d16
release v0.2.32
merrymercy Nov 1, 2023
40b235d
fix: Fix for OpenOrcaAdapter to return correct conversation template …
vjsrinath Nov 2, 2023
3d9430a
Make fastchat.serve.model_worker to take debug argument (#2628)
uinone Nov 2, 2023
fdefb5f
openchat 3.5 model support (#2638)
imoneoi Nov 3, 2023
d5a078b
xFastTransformer framework support (#2615)
a3213105 Nov 3, 2023
e8a839a
feat: support custom models vllm serving (#2635)
congchan Nov 5, 2023
86f044b
kill only fastchat process (#2641)
scenaristeur Nov 6, 2023
5d453e4
Update server_arch.png
merrymercy Nov 6, 2023
77932a1
Use conv.update_last_message api in mt-bench answer generation (#2647)
merrymercy Nov 7, 2023
32c41de
Improve Azure OpenAI interface (#2651)
infwinston Nov 7, 2023
f2810e5
Add required_temp support in jsonl format to support flexible tempera…
CodingWithTim Nov 8, 2023
ab01027
Pin openai version < 1 (#2658)
infwinston Nov 8, 2023
18f5692
Remove exclude_unset parameter (#2654)
snapshotpl Nov 9, 2023
2ab0026
Revert "Remove exclude_unset parameter" (#2666)
merrymercy Nov 9, 2023
09033af
added support for CodeGeex(2) (#2645)
peterwilli Nov 9, 2023
e46d97a
add chatglm3 conv template support in conversation.py (#2622)
ZeyuTeng96 Nov 10, 2023
e0b351a
UI and model change (#2672)
infwinston Nov 12, 2023
1901125
train_flant5: fix typo (#2673)
Force1ess Nov 12, 2023
a19866b
Fix gpt template (#2674)
infwinston Nov 12, 2023
a333a55
Update README.md (#2679)
merrymercy Nov 13, 2023
aeec0e0
feat: support template's stop_str as list (#2678)
congchan Nov 13, 2023
9cfeb15
Update exllama_v2.md (#2680)
jm23jeffmorgan Nov 15, 2023
a1324de
save model under deepspeed (#2689)
MrZhengXin Nov 18, 2023
fdf7b2c
Adding SSL support for model workers and huggingface worker (#2687)
lnguyen Nov 18, 2023
e53c73f
Check the max_new_tokens <= 0 in openai api server (#2688)
zeyugao Nov 19, 2023
8bd422b
Add Microsoft/Orca-2-7b and update model support docs (#2714)
BabyChouSr Nov 22, 2023
849a815
fix tokenizer of chatglm2 (#2711)
wangshuai09 Nov 22, 2023
af8d877
Template for using Deepseek code models (#2705)
AmaleshV Nov 22, 2023
85c797e
add support for Chinese-LLaMA-Alpaca (#2700)
zollty Nov 22, 2023
99d19ac
Make --load-8bit flag work with weights in safetensors format (#2698)
xuguodong1999 Nov 22, 2023
0bbeddc
Format code and minor bug fix (#2716)
merrymercy Nov 22, 2023
0a5ad3e
Bump version to v0.2.33 (#2717)
merrymercy Nov 22, 2023
3389cc3
fix tokenizer.pad_token attribute error (#2710)
wangshuai09 Nov 22, 2023
ff25295
support stable-vicuna model (#2696)
hi-jin Nov 23, 2023
6ac7d76
Exllama cache 8bit (#2719)
mjkaye Nov 23, 2023
1f21efb
Add Yi support (#2723)
infwinston Nov 23, 2023
a754c48
Add Hermes 2.5 [fixed] (#2725)
152334H Nov 23, 2023
c199c8d
Fix Hermes2Adapter (#2727)
lewtun Nov 26, 2023
cfba5f4
Fix YiAdapter (#2730)
Jingsong-Yan Nov 26, 2023
96aed4c
add trust_remote_code argument (#2715)
wangshuai09 Nov 26, 2023
3352306
Add revision arg to MT Bench answer generation (#2728)
lewtun Nov 26, 2023
76fbdef
Fix MPS backend 'index out of range' error (#2737)
suquark Nov 26, 2023
686ab04
add starling support (#2738)
infwinston Nov 27, 2023
a7ed47f
Merge remote-tracking branch 'upstream/main' into merge_1126
renning22 Nov 27, 2023
Fix Salesforce xgen inference (lm-sys#2350)
jaywonchung authored Sep 18, 2023
commit 658736fc45356e574ee62e991603307ffa4c8f55
7 changes: 3 additions & 4 deletions fastchat/conversation.py
```diff
--- a/fastchat/conversation.py
+++ b/fastchat/conversation.py
@@ -765,11 +765,10 @@ def get_conv_template(name: str) -> Conversation:
     Conversation(
         name="xgen",
         system_message="A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n",
-        roles=("### Human: ", "###"),
-        sep_style=SeparatorStyle.NO_COLON_SINGLE,
+        roles=("### Human", "### Assistant"),
+        sep_style=SeparatorStyle.ADD_COLON_SINGLE,
         sep="\n",
-        stop_token_ids=[50256, 0, 1, 2],
-        stop_str="<|endoftext|>",
+        stop_token_ids=[50256],
     )
 )
```

Expand Down
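To see what the corrected xgen template produces, here is a minimal sketch of the colon-single rendering style — not FastChat's actual implementation; `render_xgen` and `SYSTEM` are illustrative names, and the behavior assumed is that each turn renders as `role: message` plus a single separator, with the final assistant role left open with a bare colon:

```python
# Sketch of the add-colon-single prompt layout used by the fixed xgen
# template. Each completed turn becomes "{role}: {text}{sep}"; a turn
# whose text is None is left open as "{role}:" for the model to fill.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence "
    "assistant. The assistant gives helpful, detailed, and polite "
    "answers to the human's questions.\n\n"
)


def render_xgen(messages, sep="\n"):
    """messages: list of (role, text) pairs; text may be None for the
    assistant turn the model is asked to complete."""
    out = SYSTEM
    for role, text in messages:
        if text is None:
            out += role + ":"  # open the assistant turn
        else:
            out += role + ": " + text + sep
    return out


prompt = render_xgen([
    ("### Human", "Hello!"),
    ("### Assistant", None),
])
print(prompt)
```

With the old `roles=("### Human: ", "###")` and `NO_COLON_SINGLE`, the assistant turn opened with a bare `###`, which did not match what the xgen model was trained on; the corrected pair of explicit role names plus colon separator restores the expected format.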
3 changes: 2 additions & 1 deletion fastchat/serve/inference.py
```diff
--- a/fastchat/serve/inference.py
+++ b/fastchat/serve/inference.py
@@ -80,7 +80,8 @@ def generate_stream(
     echo = bool(params.get("echo", True))
     stop_str = params.get("stop", None)
     stop_token_ids = params.get("stop_token_ids", None) or []
-    stop_token_ids.append(tokenizer.eos_token_id)
+    if tokenizer.eos_token_id not in stop_token_ids:
+        stop_token_ids.append(tokenizer.eos_token_id)

     logits_processor = prepare_logits_processor(
         temperature, repetition_penalty, top_p, top_k
```
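The effect of this membership guard can be isolated in a few lines — a standalone sketch, where `collect_stop_ids` is an illustrative helper (the real logic lives inline in `generate_stream`) and `50256` stands in for a tokenizer's `eos_token_id`:

```python
# Without the "not in" check, a caller who already lists the eos id in
# stop_token_ids (as the fixed xgen template now does with 50256) would
# get it appended a second time.
def collect_stop_ids(params, eos_token_id):
    stop_token_ids = params.get("stop_token_ids", None) or []
    if eos_token_id not in stop_token_ids:  # the fix: skip duplicates
        stop_token_ids.append(eos_token_id)
    return stop_token_ids


# Caller already supplies eos: no duplicate entry.
print(collect_stop_ids({"stop_token_ids": [50256]}, 50256))  # [50256]
# Caller omits eos: it is appended, as before the fix.
print(collect_stop_ids({}, 50256))  # [50256]
```

Note that the guard only deduplicates the eos id itself; caller-provided extra stop ids are preserved unchanged.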