I fine-tuned llama-3.1-1b-instruct into llama-3.1-1b-instruct-ft, but when I try to chat with llama-3.1-1b-instruct-ft I always get a 502. Do I need to deploy llama-3.1-1b-instruct-ft via NeMo deployment? If so, how?
Hi @whitewizard
Which version of Nemo (Microservices?) are you using?
For NeMo Microservices, there is an example here that creates a deployment configuration and deploys a model using the NeMo Deployment Management Service. Although it deploys an embedding model, the steps can be used as a reference.
You can also find out more in the NeMo DMS API reference here.
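To make the referenced steps concrete, here is a minimal sketch of creating a deployment through the DMS. Everything here is an assumption to be checked against your DMS API reference: the base URL, namespace, image name/tag, and the exact payload field names (which follow the pattern used in the NeMo Microservices tutorials but may differ between versions). The request is built but not sent.

```python
import json
import urllib.request

# Placeholders -- replace with your own NeMo Microservices setup (assumptions).
NEMO_URL = "http://nemo.test"           # base URL of the NeMo platform
NAMESPACE = "default"                   # model namespace
DEPLOYMENT_NAME = "llama-3.1-1b-instruct-ft"

def build_deployment_request(name: str, namespace: str,
                             model: str, gpus: int = 1) -> dict:
    """Assemble a deployment configuration payload for the NeMo
    Deployment Management Service. Field names are assumptions based on
    the tutorial pattern -- verify against the DMS API reference."""
    return {
        "name": name,
        "namespace": namespace,
        "config": {
            "model": model,
            "nim_deployment": {
                # Base NIM image and tag are placeholders (assumptions).
                "image_name": "nvcr.io/nim/meta/llama-3.1-8b-instruct",
                "image_tag": "1.8.3",
                "gpu": gpus,
                "pvc_size": "25Gi",
            },
        },
    }

payload = build_deployment_request(
    DEPLOYMENT_NAME, NAMESPACE, f"{NAMESPACE}/{DEPLOYMENT_NAME}")

# The actual call would be a POST to the DMS deployment endpoint
# (path taken from the tutorial linked above; verify it in your
# DMS API reference). Built here without executing it:
req = urllib.request.Request(
    f"{NEMO_URL}/v1/deployment/model-deployments",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url)
```

Once the deployment is created, the DMS handles spinning up the NIM pod; the model only becomes reachable for chat after that pod reports ready.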
Thanks
I upgraded NMP to the latest version, 25.11.0, but I still get the error.
I followed this demo: GenerativeAIExamples/nemo/data-flywheel/tool-calling/2_finetuning_and_inference.ipynb at main · NVIDIA/GenerativeAIExamples
At step 3.2, "Send an Inference Call to NIM", it throws the error.
Attached pod.log (132.0 KB), the error log from the meta/llama-3.1-1b-instruct NIM pod.
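For context, the inference step in that notebook sends a chat completion to the NIM's OpenAI-compatible endpoint. A minimal sketch of such a call follows; the base URL and model name are placeholders for your deployment, not values from the notebook. A 502 at this point typically comes from the gateway in front of the NIM pod (pod not ready or crash-looping), so the pod log is the right place to look. The request is constructed but not sent.

```python
import json
import urllib.request

NIM_URL = "http://nim.test"                  # placeholder base URL (assumption)
MODEL = "default/llama-3.1-1b-instruct-ft"   # placeholder fine-tuned model name

def build_chat_request(base_url: str, model: str,
                       prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion
    request, which is the API surface NIM exposes for chat models."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(NIM_URL, MODEL, "Hello!")
# Sending would be: resp = urllib.request.urlopen(req)
# A 502 in the response comes from the proxy/gateway, not the model itself.
print(req.full_url)
```

If this endpoint returns 502 while the base model's endpoint works, comparing the two pods' states (`kubectl get pods` and the attached pod.log) usually shows the fine-tuned deployment failing to start.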
