First, a quick word on what "training" vs. "fine-tuning" a large model means (ps: my own shallow understanding):
Training: traditionally means building a model from zero to one: initializing the model parameters, then training over massive amounts of data.
Fine-tuning: continuing to train and optimize an existing model by feeding it corpus data from your own specific scenario.
With large models as hot as they are right now, though, most people who casually say "training a large model" actually mean fine-tuning (that is certainly how it is at my company).
The goal of this article is to get you fine-tuning a large model quickly, so I won't dwell on concepts below.
This article uses LLaMA-Factory for the fine-tuning: https://github.com/hiyouga/LLaMA-Factory
Now let's get started.
Install dependencies
Some Python package versions are tied to the system GPU's CUDA version, so this article assumes your CUDA setup is identical to mine: Driver Version: 535.161.08, CUDA Version: 12.6.
These are my local dependencies; I recommend installing them with miniforge3 to avoid dependency headaches:
accelerate 1.2.1
archspec 0.2.3
boltons 24.0.0
Brotli 1.1.0
certifi 2024.12.14
cffi 1.17.1
charset-normalizer 3.4.1
colorama 0.4.6
conda 24.11.2
conda-libmamba-solver 24.9.0
conda-package-handling 2.4.0
conda_package_streaming 0.11.0
distro 1.9.0
filelock 3.16.1
frozendict 2.4.6
fsspec 2024.12.0
h2 4.1.0
hpack 4.0.0
huggingface-hub 0.27.1
hyperframe 6.0.1
idna 3.10
Jinja2 3.1.5
jsonpatch 1.33
jsonpointer 3.0.0
libmambapy 1.5.12
mamba 1.5.12
MarkupSafe 3.0.2
menuinst 2.2.0
mpmath 1.3.0
networkx 3.4.2
numpy 2.2.1
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
packaging 24.2
pip 24.3.1
platformdirs 4.3.6
pluggy 1.5.0
psutil 6.1.1
pycosat 0.6.6
pycparser 2.22
PySocks 1.7.1
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
ruamel.yaml 0.18.8
ruamel.yaml.clib 0.2.8
safetensors 0.5.2
setuptools 75.6.0
sympy 1.13.1
tokenizers 0.21.0
torch 2.5.1
tqdm 4.67.1
transformers 4.47.1
triton 3.1.0
truststore 0.10.0
typing_extensions 4.12.2
urllib3 2.3.0
wheel 0.45.1
zstandard 0.23.0
Install LLaMA-Factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
Dataset format
Following the official LLaMA-Factory examples, only two formats are covered here; for the others, see the GitHub repo.
identity.json
You can replace the {{name}} and {{author}} placeholders below with whatever data you want to train in:
[
  {
    "instruction": "hi",
    "input": "",
    "output": "Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?"
  },
  {
    "instruction": "hello",
    "input": "",
    "output": "Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?"
  }
]
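If you'd rather not edit the file by hand, filling the placeholders is just a string replace. Here is a small self-contained sketch; the NAME/AUTHOR values and the inlined one-record template are illustrative, not part of the official repo:

```python
import json

# Hypothetical values -- replace with the identity you want to train in.
NAME = "MyBot"
AUTHOR = "MyLab"

# Inlined single-record template mirroring identity.json's shape.
template = """[
  {
    "instruction": "hi",
    "input": "",
    "output": "Hello! I am {{name}}, an AI assistant developed by {{author}}. How can I assist you today?"
  }
]"""

# The template ships with literal {{name}} / {{author}} placeholders,
# so a plain string replace is enough.
filled = template.replace("{{name}}", NAME).replace("{{author}}", AUTHOR)
records = json.loads(filled)  # sanity-check: still valid JSON after the replace
print(records[0]["output"])
```

For the real file, read identity.json, apply the same two replaces, and write it back.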
alpaca_zh_demo.json
[
  {
    "instruction": "识别并解释给定列表中的两个科学理论:细胞理论和日心说。",
    "input": "",
    "output": "细胞理论是生物科学的一个理论,它认为所有生命体都是由微小的基本单元——细胞所构成。这是生物学的一个基础理论,认为细胞是所有生物的基本结构和功能单位,所有的生物都是由一个或多个细胞组成,细胞只能通过细胞分裂产生新的细胞。这一理论由薛定谔、施瓦内和雪莱于1839年首次提出。\n\n日心说是指太阳是太阳系的中心,也就是说,行星围绕太阳旋转的理论。这个理论打破了传统的地心说观点,认为地球并不是宇宙的中心。日心说的提出者是尼古拉·哥白尼,他在16世纪初发表了他的著作《天体运行论》,阐述了太阳系行星围绕太阳运行的模型,为天文学的发展做出了巨大贡献。"
  },
  {
    "instruction": "输入三支篮球队的名称并生成一个适当的口号。",
    "input": "输入:俄克拉荷马城雷霆队,芝加哥公牛队,布鲁克林网队。",
    "output": "输出: \"雷霆,公牛和网队:各显神通,角逐群雄!\""
  }
]
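Whichever records you add, this alpaca-style format expects each entry to carry the instruction/input/output keys seen above. A throwaway checker like the following (my own helper, not part of LLaMA-Factory) can catch malformed entries before you start a training run:

```python
# Minimal sanity check for alpaca-style records (keys taken from the demo files).
REQUIRED_KEYS = {"instruction", "input", "output"}

def check_alpaca(records):
    """Return the indices of records missing any required key."""
    return [i for i, r in enumerate(records)
            if not REQUIRED_KEYS <= set(r)]

sample = [
    {"instruction": "hi", "input": "", "output": "hello"},
    {"instruction": "broken"},  # missing "input" and "output"
]
print(check_alpaca(sample))  # -> [1]
```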
# Create a working directory
>> # mkdir demo; cd demo
>> # mkdir data # holds the dataset files
>> # ls
alpaca_zh_demo.json dataset_info.json identity.json
# Three files in total:
# identity.json and alpaca_zh_demo.json: create them yourself or download them from GitHub
# dataset_info.json: just download it from GitHub; take a look inside to see what it does
# https://github.com/hiyouga/LLaMA-Factory/blob/main/data/dataset_info.json
Prepare the model you want to fine-tune
I am using Phi-3.5-mini-instruct here.
>> # ls
alpaca_zh_demo.json dataset_info.json identity.json Phi-3.5-mini-instruct
Start fine-tuning
Create the fine-tuning config
# File name: train_lora.yaml
### model
model_name_or_path: /demo/Phi-3.5-mini-instruct # model directory
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora # fine-tuning method
lora_target: all
### dataset
dataset: identity,alpaca_zh_demo # datasets
template: phi # template --> https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#supported-models
cutoff_len: 2048
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: saves/Phi-3.5-mini-instruct/lora/sft # output path
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
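To get a feel for what these hyperparameters imply for the run length, here is some back-of-envelope arithmetic. The identity.json sample count is an assumption, and LLaMA-Factory's exact split and rounding may differ slightly:

```python
import math

# Illustrative sample counts -- adjust to your actual datasets.
identity_samples = 91   # assumed size of the demo identity.json
alpaca_samples = 1000   # capped by max_samples: 1000

total = identity_samples + alpaca_samples
train_samples = round(total * (1 - 0.1))  # val_size: 0.1 holds out 10%

# Effective batch size on a single GPU:
# per_device_train_batch_size * gradient_accumulation_steps
effective_batch = 1 * 8

steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * 3  # num_train_epochs: 3.0
print(steps_per_epoch, total_steps)
```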
# Run the fine-tuning
llamafactory-cli train ./train_lora.yaml
Chat-test the model
# inference.yaml
model_name_or_path: /demo/Phi-3.5-mini-instruct # model directory
adapter_name_or_path: saves/Phi-3.5-mini-instruct/lora/sft
template: phi
infer_backend: huggingface # choices: [huggingface, vllm]
trust_remote_code: true
llamafactory-cli chat ./inference.yaml
Export the model
# merge_lora.yaml
### Note: DO NOT use quantized model or quantization_bit when merging lora adapters
### model
model_name_or_path: /demo/Phi-3.5-mini-instruct # model directory
adapter_name_or_path: saves/Phi-3.5-mini-instruct/lora/sft
template: phi
finetuning_type: lora
trust_remote_code: true
### export
export_dir: models/Phi-3.5-mini-instruct_lora_sft
export_size: 2
export_device: cpu
export_legacy_format: false
llamafactory-cli export ./merge_lora.yaml
Supplementary version info
>> # python -V
Python 3.12.8
>> # nvidia-smi
Tue Jan 14 10:01:45 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08 Driver Version: 535.161.08 CUDA Version: 12.6 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla V100-PCIE-32GB Off | 00000000:2F:00.0 Off | 0 |
| N/A 35C P0 26W / 250W | 0MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
>> # cat /etc/issue
Ubuntu 22.04.4 LTS \n \l
>> # uname -a
Linux 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 9 17:09:15 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Bonus: a zero-cost way to try this tutorial (1.15)
Fine-tune Phi-3.5 on a free Tencent Cloud Studio instance.
(base) root@VM-7-225-ubuntu:/workspace# apt install git-lfs -y
(base) root@VM-7-225-ubuntu:/workspace# git lfs install
(base) root@VM-7-225-ubuntu:/workspace# pip install torch==2.5.1
(base) root@VM-7-225-ubuntu:/workspace# git clone https://www.modelscope.cn/LLM-Research/Phi-3.5-mini-instruct.git
(base) root@VM-7-225-ubuntu:/workspace# git clone https://github.com/hiyouga/LLaMA-Factory
(base) root@VM-7-225-ubuntu:/workspace# cd LLaMA-Factory/
(base) root@VM-7-225-ubuntu:/workspace/LLaMA-Factory# pip install -e ".[torch,metrics]"
(base) root@VM-7-225-ubuntu:/workspace/LLaMA-Factory# cd ..
(base) root@VM-7-225-ubuntu:/workspace# cp -a LLaMA-Factory/data .
(base) root@VM-7-225-ubuntu:/workspace# sed -i 's/{{name}}/岸边露伴/g' data/identity.json
(base) root@VM-7-225-ubuntu:/workspace# sed -i 's/{{author}}/杜王町Ai研究院/g' data/identity.json
(base) root@VM-7-225-ubuntu:/workspace# cat data/identity.json | head -n 6
[
{
"instruction": "hi",
"input": "",
"output": "Hello! I am 岸边露伴, an AI assistant developed by 杜王町Ai研究院. How can I assist you today?"
},
(base) root@VM-7-225-ubuntu:/workspace# tree -L 2
.
├── LLaMA-Factory
│ ├── ...
├── Phi-3.5-mini-instruct
│ ├── ....
├── README.md
├── data
│ ├── ...
├── inference.yaml
├── merge_lora.yaml
└── train_lora.yaml
llamafactory-cli train ./train_lora.yaml # fine-tune; GPU load: `8359MiB / 15360MiB`
llamafactory-cli chat ./inference.yaml # chat
llamafactory-cli export ./merge_lora.yaml # export