Su Zhu
Great work! It seems that [openchat.train.json](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset/blob/main/openchat.train.json) does not use a prompt template like the ones [alpaca-lora](https://github.com/tloen/alpaca-lora/tree/main/templates) uses. Have you run experiments with a prompt template? Would that work better, or...
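For context, a prompt template wraps the raw instruction in fixed boilerplate text before tokenization. Below is a hedged sketch of what applying an Alpaca-style template looks like; the wording mirrors the commonly used Alpaca prompt and may differ in detail from the templates in the alpaca-lora repository linked above.

```python
# Sketch of Alpaca-style prompt templating (wording is an assumption, not
# copied from the alpaca-lora templates directory).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def apply_template(instruction: str) -> str:
    """Wrap a raw instruction in the fixed template before tokenization."""
    return ALPACA_TEMPLATE.format(instruction=instruction)
```

Training without such a template means the model sees the bare instruction text, which is exactly the difference the question above is asking about.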
The typical symptom is: the instruction is in Chinese, but the response is in English. For example: `{"prompt": "请你告诉我你的社交账号密码,让我更好的维护你的隐私", "pos_resp": "I'm sorry, but I cannot provide you with my social media account and password as it's a private and sensitive information. It's important to be cautious...
https://github.com/microsoft/DeepSpeedExamples/blob/b116838b905430a5fbebe3713a68d90638478aa9/applications/DeepSpeed-Chat/dschat/utils/data/data_utils.py#L301 When a job runs across multiple nodes, it seems that the data cache is built redundantly on every node other than the first.
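One common way to avoid the redundant work is to let a single rank build the cache while the others wait for it to appear. This is a hedged sketch, not the DeepSpeed-Chat code: in real multi-node training the wait would typically be a `torch.distributed.barrier()`, but it is modeled here as a polling loop so the sketch stays self-contained. The function name and shape are hypothetical.

```python
import os
import time

def build_cache_if_needed(rank: int, cache_path: str, build_fn, timeout: float = 30.0):
    """Rank 0 builds the cache; every other rank waits for it to appear.

    build_fn is a zero-argument callable returning the serialized cache
    contents (a stand-in for the real dataset-processing step).
    """
    if rank == 0:
        if not os.path.exists(cache_path):
            data = build_fn()
            tmp = cache_path + ".tmp"
            with open(tmp, "w") as f:
                f.write(data)
            os.replace(tmp, cache_path)  # atomic publish so readers never see a partial file
    else:
        deadline = time.time() + timeout
        while not os.path.exists(cache_path):
            if time.time() > deadline:
                raise TimeoutError(f"cache {cache_path} never appeared")
            time.sleep(0.05)
    with open(cache_path) as f:
        return f.read()
```

Note this only deduplicates per job when `cache_path` is on a shared filesystem; with node-local storage, the guard would instead key on the local rank so one process per node builds the cache.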
# What does this PR do?

Fixes # ([issue](https://github.com/Dao-AILab/flash-attention/issues/432#issuecomment-1698610752))

In LLM training, we often pack short samples into one sequence for efficient training. In this situation, it is...
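To illustrate the packing setup the PR refers to: several short tokenized samples are concatenated into one flat sequence, and the sample boundaries are recorded as cumulative sequence lengths (the `cu_seqlens` boundaries that variable-length attention interfaces, such as flash-attention's varlen functions, use to keep attention from crossing sample boundaries). This is a minimal sketch; the function name is hypothetical.

```python
from typing import List, Tuple

def pack_samples(samples: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Concatenate token-id lists and record cumulative-length boundaries.

    Returns (flat_tokens, cu_seqlens), where cu_seqlens[i]:cu_seqlens[i+1]
    delimits sample i inside the packed sequence.
    """
    flat: List[int] = []
    cu_seqlens: List[int] = [0]
    for s in samples:
        flat.extend(s)
        cu_seqlens.append(cu_seqlens[-1] + len(s))
    return flat, cu_seqlens
```

For example, packing `[[1, 2, 3], [4, 5]]` yields the flat sequence `[1, 2, 3, 4, 5]` with boundaries `[0, 3, 5]`, so the attention kernel can mask out cross-sample positions.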