Commit 5b6aeff

Found a bug in the codellama vllm model_len logic. (#380)
* Found a bug in the codellama vllm model_len logic. Also, let's just avoid the vLLM error by making sure max_num_batched_tokens >= max_model_len. (Never mind: I realized that if statement will never happen here.)
1 parent 5e4d662 commit 5b6aeff
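The commit message refers to a vLLM constraint: vLLM rejects configurations where max_num_batched_tokens is smaller than max_model_len, since a single sequence of max_model_len tokens must fit within one batch. Below is a minimal sketch of the guard the author considered (and then dropped as unreachable here). The function name `clamp_batched_tokens` is hypothetical; the actual code is not shown in this commit.

```python
def clamp_batched_tokens(max_model_len, max_num_batched_tokens):
    """Raise the batch token budget to at least the model length.

    Hypothetical helper illustrating the guard mentioned in the commit
    message; vLLM errors out when max_num_batched_tokens < max_model_len.
    A max_model_len of None means "use the model's default", so no clamp
    is applied in that case.
    """
    if max_model_len is not None and max_num_batched_tokens < max_model_len:
        return max_model_len
    return max_num_batched_tokens
```

For example, a config of `{"max_model_len": 16384, "max_num_batched_tokens": 4096}` would be clamped to a batch budget of 16384 rather than triggering the vLLM error.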

File tree

1 file changed: +2 −2 lines changed


model-engine/model_engine_server/domain/use_cases/llm_model_endpoint_use_cases.py

Lines changed: 2 additions & 2 deletions
@@ -174,9 +174,9 @@
     "mammoth-coder": {"max_model_len": 16384, "max_num_batched_tokens": 16384},
     # Based on config here: https://huggingface.co/TIGER-Lab/MAmmoTH-Coder-7B/blob/main/config.json#L12
     # Can also see 13B, 34B there too
-    "code-llama": {"max_model_len": 16384, "max_num_batched_tokens": 16384},
+    "codellama": {"max_model_len": 16384, "max_num_batched_tokens": 16384},
     # Based on config here: https://huggingface.co/codellama/CodeLlama-7b-hf/blob/main/config.json#L12
-    # Can also see 13B, 34B there too
+    # Can also see 13B, 34B there too. Note, codellama is one word.
     "llama-2": {"max_model_len": None, "max_num_batched_tokens": 4096},
     "mistral": {"max_model_len": 8000, "max_num_batched_tokens": 8000},
 }
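The renamed key matters because these per-model defaults are presumably selected by matching the requested model name against the dict keys, so a key of "code-llama" would never match model names like "codellama-7b". A minimal sketch of that kind of prefix lookup follows; the dict values come from the diff, but the function name `get_vllm_params`, the default fallback, and the prefix-match logic are assumptions, since the surrounding code in llm_model_endpoint_use_cases.py is not shown.

```python
# Per-model vLLM defaults, as in the diff after this commit.
_VLLM_MODEL_PARAMS = {
    "mammoth-coder": {"max_model_len": 16384, "max_num_batched_tokens": 16384},
    "codellama": {"max_model_len": 16384, "max_num_batched_tokens": 16384},
    "llama-2": {"max_model_len": None, "max_num_batched_tokens": 4096},
    "mistral": {"max_model_len": 8000, "max_num_batched_tokens": 8000},
}


def get_vllm_params(model_name: str) -> dict:
    """Return the params whose key prefixes model_name, else a fallback.

    Hypothetical lookup: the real selection logic in the use-case file may
    differ, but a prefix match like this is why the one-word "codellama"
    key fixes the bug while "code-llama" silently never matched.
    """
    for prefix, params in _VLLM_MODEL_PARAMS.items():
        if model_name.startswith(prefix):
            return params
    # Assumed fallback when no key matches.
    return {"max_model_len": None, "max_num_batched_tokens": 4096}
```

With the old "code-llama" key, `get_vllm_params("codellama-7b")` would have fallen through to the fallback and capped batched tokens at 4096 instead of 16384.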

0 commit comments
