We introduce CharacterBot, a model designed to replicate both the linguistic patterns and distinctive thought processes of a character. In this repository, we present Lu Xun—a renowned Chinese writer—as a case study.
The experimental dataset luxun_essay.json comprises 638 articles sourced from Wikisource. The collection spans the full range of Lu Xun's mature essayistic output, reflecting the diverse themes of his intellectual career. All texts are in the public domain, ensuring unrestricted scholarly use. Because Lu Xun wrote in Chinese, the repository contains Chinese textual data and code annotations.
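A minimal sketch for a first look at the corpus, assuming luxun_essay.json is a JSON array of article records (the actual field names are whatever the inspection below prints):

import json

# Load the essay corpus; 638 article records are expected.
with open("luxun_essay.json", encoding="utf-8") as f:
    essays = json.load(f)
print(f"Loaded {len(essays)} articles")

# Inspect the first record's schema before writing any processing code.
first = essays[0]
print(list(first.keys()) if isinstance(first, dict) else type(first))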
To set up the environment and install necessary dependencies, follow the steps below:
conda create --name env python=3
conda activate env
pip3 install -r requirements.txt
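Optionally, verify the environment before continuing (this assumes PyTorch is among the pinned requirements):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"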
Authorial Perspective Reframing
Generate the Authorial Perspective Reframing pre-training data with the following commands:
cd characterbot
python authorial_perspective_reframing.py
Multiple-choice Questions
Generate data for the multiple-choice questions task:
python multiple_choice_questions_data.py
Generative Question Answering
Generate data for the generative question answering task:
python generative_qa_data.py
Style Transfer
Generate data for the style transfer task:
python style_transfer_data.py
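Because all of the generated files are later consumed by LLaMA-Factory, a quick sanity check after generation can catch formatting problems early. A sketch only, assuming the scripts emit alpaca-style records (instruction/input/output) under the file names used in the next section:

import json

for name in ["pre_train.json", "fine_tune.json", "test.json"]:
    with open(name, encoding="utf-8") as f:
        records = json.load(f)
    print(f"{name}: {len(records)} records")
    # Alpaca-style keys expected by LLaMA-Factory (an assumption about these scripts).
    missing = {"instruction", "input", "output"} - set(records[0])
    if missing:
        print(f"  warning: first record is missing {missing}")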
CharLoRA requires modifications to three Python libraries installed in editable mode: LLaMA-Factory, PEFT, and Transformers. Make sure to download these libraries as specified in the project requirements.
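If you have not already installed them from source, an editable install along these lines should work (the upstream repository URLs are shown for reference; use the exact versions or forks pinned in the requirements):

git clone https://github.com/hiyouga/LLaMA-Factory.git && pip install -e ./LLaMA-Factory
git clone https://github.com/huggingface/peft.git && pip install -e ./peft
git clone https://github.com/huggingface/transformers.git && pip install -e ./transformers

Then, apply the following updates: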
cd /path/to/characterbot/train_with_charlora
cp aligner.py /path/to/LLaMA-Factory/src/llamafactory/data/aligner.py
cp collator.py /path/to/LLaMA-Factory/src/llamafactory/data/collator.py
cp supervised.py /path/to/LLaMA-Factory/src/llamafactory/data/processors/supervised.py
cp peft_model.py /path/to/peft/src/peft/peft_model.py
cp save_and_load.py /path/to/peft/src/peft/utils/save_and_load.py
cp layer.py /path/to/peft/src/peft/tuners/lora/layer.py
cp modeling_qwen2.py /path/to/transformers/src/transformers/models/qwen2/modeling_qwen2.py
Since both the pre-training and fine-tuning processes rely on LLaMA-Factory, integrate your generated data into its directory structure.
Place data files into LLaMA-Factory's data directory:
cp pre_train.json /path/to/LLaMA-Factory/data/pre_train.json
cp fine_tune.json /path/to/LLaMA-Factory/data/fine_tune.json
cp test.json /path/to/LLaMA-Factory/data/test.json
cp dataset_info.json /path/to/LLaMA-Factory/data/dataset_info.json
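For reference, dataset_info.json is how LLaMA-Factory discovers datasets: each entry maps a dataset name (referenced by the training YAMLs) to a data file. A minimal sketch, with illustrative entry names:

{
  "pre_train": { "file_name": "pre_train.json" },
  "fine_tune": { "file_name": "fine_tune.json" },
  "test": { "file_name": "test.json" }
}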
Place training configuration files into the examples directory:
cp pre_train.yaml /path/to/LLaMA-Factory/examples/train_lora/pre_train.yaml
cp fine_tune.yaml /path/to/LLaMA-Factory/examples/train_lora/fine_tune.yaml
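As a rough sketch, a LLaMA-Factory LoRA config such as fine_tune.yaml typically contains the fields below; the base model here is an assumption (a Qwen2 model, suggested by the patched modeling_qwen2.py), and the YAMLs shipped with this repository are authoritative:

model_name_or_path: Qwen/Qwen2-7B-Instruct  # assumption: substitute the actual base model
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: fine_tune  # must match an entry in dataset_info.json
template: qwen
output_dir: saves/fine_tune
per_device_train_batch_size: 1
learning_rate: 1.0e-4
num_train_epochs: 3.0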
Run pre-training using the following command:
cd /path/to/LLaMA-Factory
llamafactory-cli train examples/train_lora/pre_train.yaml
Run fine-tuning with this command:
llamafactory-cli train examples/train_lora/fine_tune.yaml
The output model will be saved to /path/to/LLaMA-Factory/saves/fine_tune.
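As a quick smoke test, the saved LoRA adapter can be loaded with PEFT. This is a sketch only: the base model name is an assumption, and the patched PEFT in this repository may require the dedicated test scripts below instead.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "Qwen/Qwen2-7B-Instruct"  # assumption: replace with the actual base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
# Attach the LoRA adapter produced by fine-tuning.
model = PeftModel.from_pretrained(model, "/path/to/LLaMA-Factory/saves/fine_tune")

inputs = tokenizer("你好，鲁迅先生。", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))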
After training, you can generate responses for each task. Be sure to update the model and data paths as needed. Example data formats are shown in the example directory. For each task, follow these instructions:
cd /path/to/characterbot/test/
cp peft_model_test.py /path/to/peft/src/peft/peft_model.py
cp mcq_load.py /path/to/peft/src/peft/utils/save_and_load.py
python multiple_choice_gen.py
cp gqa_load.py /path/to/peft/src/peft/utils/save_and_load.py
python generate_qa_gen.py
cp st_load.py /path/to/peft/src/peft/utils/save_and_load.py
python style_transfer_gen.py
Evaluate the model's performance. Note that for multiple-choice questions, the evaluation code is the same as the generation code. For the remaining tasks, run the following commands:
python gen_qa_eval.py
cd /path/to/characterbot/test/style_transfer_eval
python bleu.py
python rouge.py
python style_matching.py
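For orientation, the BLEU and ROUGE metrics can be reproduced along these lines; the repository's own scripts above are authoritative. This assumes sacrebleu and rouge-score are installed, and note that rouge-score tokenizes on whitespace, so Chinese text may need pre-segmentation (presumably what rouge.py handles):

import sacrebleu
from rouge_score import rouge_scorer

hypotheses = ["生成的文本"]    # model outputs
references = [["参考的文本"]]  # one reference stream per hypothesis

# sacrebleu ships a Chinese tokenizer, appropriate for Lu Xun's prose.
bleu = sacrebleu.corpus_bleu(hypotheses, references, tokenize="zh")
print("BLEU:", bleu.score)

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"])
print("ROUGE:", scorer.score(references[0][0], hypotheses[0]))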
Parts of this project are built upon the following open-source libraries:
- transformers (Apache 2.0)
- peft (Apache 2.0)
- LLaMA-Factory (Apache 2.0)