Self-Questioning Language Models (SQLM): an asymmetric self-play framework where a proposer is given the topic and generates a question for a solver, who tries to answer it.
SQLM is built on top of the verl library.
Please refer to the existing verl quickstart for installation instructions: verl Installation
Prepare data:
All train and test datasets can be found in the selfplay_data folder. The self-play train data is simply the proposer prompt duplicated many times, and you can generate it with the following command:
python ./examples/data_preprocess/selfplay.py --prompt_version arithmetic_v1
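To make the data format concrete, here is a minimal sketch of what "the prompt duplicated many times" means: one row per training example, every row carrying the same proposer prompt. The function name, column names, and row count below are illustrative assumptions, not the actual defaults of selfplay.py (which writes the format verl expects).

```python
def make_selfplay_train_data(prompt: str, num_rows: int) -> list[dict]:
    """Duplicate a single proposer prompt into a list of train rows.

    Hypothetical sketch: the real preprocessing script chooses the
    prompt from --prompt_version and saves in verl's dataset format.
    """
    # Every row is identical; diversity comes from sampling the
    # proposer at train time, not from the dataset itself.
    return [{"prompt": prompt, "data_source": "selfplay"} for _ in range(num_rows)]

rows = make_selfplay_train_data("Generate a new arithmetic question.", 1024)
```

Each of the 1024 rows is identical here; the proposer's stochastic decoding is what produces distinct questions per rollout.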
Run Training:
Adjust the configuration in ppo_trainer.yaml to match your desired training setup (number of GPUs, batch size, etc.). To override this config from elsewhere, see "Creating Custom Configurations".
python -m verl.trainer.main_ppo exps="[grpo,multiply_selfplay,smallbs,majority]" trainer.experiment_name=debug
We use an extensible config setup, allowing you to override default configurations for specific tasks/jobs.
To define a custom configuration, create a new YAML file in verl/trainer/config/exps. NOTE: the file MUST begin with # @package _global_ in order to override other configs.
To use different configuration files, simply add them to the exps="[...]" argument of verl.trainer.main_ppo. Note: configurations are applied in left-to-right order, so configs on the right override configs on the left.
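As an illustration, a custom config file might look like the sketch below. The file name (my_task.yaml) and the specific keys are hypothetical assumptions chosen to resemble common ppo_trainer.yaml fields; check ppo_trainer.yaml for the actual key names in your version.

```yaml
# @package _global_
# Hypothetical override file: verl/trainer/config/exps/my_task.yaml
# Only the keys you list here are overridden; everything else
# falls back to ppo_trainer.yaml and earlier configs in the exps list.
data:
  train_batch_size: 64
trainer:
  n_gpus_per_node: 4
```

You would then activate it with, e.g., exps="[grpo,my_task]"; because it appears to the right of grpo, its values win on any overlapping keys.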