LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace

Towards Open-Source Large Reasoning Models

News

The first version of LLaMA-O1 is now available on Hugging Face! Here we come!

Supervised:

https://huggingface.co/SimpleBerry/LLaMA-O1-Supervised-1129

Base (Pretrained):

https://huggingface.co/SimpleBerry/LLaMA-O1-Base-1127

Supervised Fine-tuning Dataset:

https://huggingface.co/datasets/SimpleBerry/OpenLongCoT-SFT

Pretraining Dataset:

https://huggingface.co/datasets/SimpleBerry/OpenLongCoT-Pretrain-1202

RLHF is on the way! View our GitHub Repo:

https://github.com/SimpleBerry/LLaMA-O1

Our ongoing related research:

https://huggingface.co/papers/2406.07394

https://huggingface.co/papers/2410.02884

https://huggingface.co/papers/2411.18203

GGUF: https://huggingface.co/Lyte/LLaMA-O1-Supervised-1129-Q4_K_M-GGUF

Online Demo (CPU-only): https://huggingface.co/spaces/SimpleBerry/LLaMA-O1-Supervised-1129-Demo
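
For readers who want to try the supervised checkpoint locally rather than through the demo, here is a minimal sketch using the standard HuggingFace `transformers` loading path. It is not taken from this repository: the helper name `generate`, the default `max_new_tokens`, and the use of a raw text prompt are assumptions, and the prompt format the model expects may differ (check the model card).

```python
# Repo id taken from the "Supervised" link in the News section.
MODEL_ID = "SimpleBerry/LLaMA-O1-Supervised-1129"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: load the checkpoint and generate a continuation.

    Assumes `transformers` and `torch` are installed and that the checkpoint
    loads through the generic causal-LM classes.
    """
    # Imported lazily so that importing this module does not require
    # the heavy dependencies or trigger any download.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the raw prompt; the expected prompt template is an assumption.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Special tokens are kept so that any reasoning markup in the output
    # stays visible.
    return tokenizer.decode(output_ids[0], skip_special_tokens=False)
```

Calling `generate("What is 12 * 13?")` downloads the weights on first use, so expect a long first run; the GGUF build linked above may be a lighter option on CPU-only machines.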

Roadmap of LLaMA-O1

Marked Language of Long CoT (Done)
Pretrain Dataset (Done)
Supervised Dataset (Done)
PRM token rectification Dataset (Done)
Reinforcement Learning With Self-Play (Code done, training in progress)
Inference-time Reasoning Enhancement Frameworks (Code done, temporarily postponed)