CQ-Bench

Dataset

The benchmark covers seven categories: random, political, ethical, religious, social, multiple, and human, where human denotes the dataset used for human evaluation. Each category has an original dataset, a Task 1 (attitude detection) dataset, and a Task 2 (value selection) dataset. The original dataset contains only the story, ground-truth values, filtered values, and contradictory values. The Task 1 and Task 2 datasets contain questions and gold labels.

Value extraction has no answer options, so it uses the original dataset directly.
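As a rough illustration of the layout described above, the records could be modelled as follows. This is a minimal sketch: the class names and field names are hypothetical and chosen to mirror the description, and the actual keys in datasets.zip may differ.

```python
from dataclasses import dataclass
from typing import List

# The seven categories named in this README.
CATEGORIES = [
    "random", "political", "ethical", "religious",
    "social", "multiple", "human",
]


@dataclass
class OriginalRecord:
    """One entry of the original dataset (field names are hypothetical)."""
    story: str
    ground_truth_values: List[str]
    filtered_values: List[str]
    contradictory_values: List[str]


@dataclass
class TaskRecord:
    """One entry of a Task 1 or Task 2 dataset (field names are hypothetical)."""
    question: str
    gold_label: str


# Placeholder content only; this is not real benchmark data.
example = OriginalRecord(
    story="<story text>",
    ground_truth_values=["<value A>"],
    filtered_values=["<value B>"],
    contradictory_values=["<value C>"],
)
print(example.ground_truth_values)
```

Value extraction would read only OriginalRecord-style entries, while attitude detection and value selection would read TaskRecord-style entries.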

Generating story

To generate stories, run:

cd story_generation
python generate_story_pipeline.py --output_file <output file name> --number <number of stories> --value_file <seed value file> --previous_file <previously generated stories, if any>

To validate the generated stories, run:

python validation.py --output_file <validation output> --story_file <generated story>

To generate the datasets from the validated stories, run:

python organize_dataset.py --validation_file <validation file> --category <category>

python generate_dataset_from_original.py --category <category> --task <task> --value_file <value file>
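The four steps above can be chained from a small driver script. The following is a sketch only: the function name, the file names in the example call, and the task value are hypothetical placeholders, the accepted values for --task are not listed here, and the optional --previous_file argument is left out. The sketch only builds and prints the commands, which are assumed to be run from story_generation.

```python
import shlex
from typing import List


def pipeline_commands(stories_file: str, validation_file: str, value_file: str,
                      category: str, task: str, number: int) -> List[List[str]]:
    """Return the four pipeline commands, in order, as argument lists."""
    return [
        # 1. Generate stories from the seed values.
        ["python", "generate_story_pipeline.py",
         "--output_file", stories_file,
         "--number", str(number),
         "--value_file", value_file],
        # 2. Validate the generated stories.
        ["python", "validation.py",
         "--output_file", validation_file,
         "--story_file", stories_file],
        # 3. Organize the validated stories into the original dataset.
        ["python", "organize_dataset.py",
         "--validation_file", validation_file,
         "--category", category],
        # 4. Derive a task dataset from the original dataset.
        ["python", "generate_dataset_from_original.py",
         "--category", category,
         "--task", task,
         "--value_file", value_file],
    ]


# Hypothetical file names and task value, for illustration only.
commands = pipeline_commands("stories.json", "validated.json", "values.json",
                             category="political", task="<task>", number=10)
for cmd in commands:
    print(shlex.join(cmd))
```

To execute the steps instead of printing them, each argument list could be passed to subprocess.run(cmd, check=True), so that a failing step stops the pipeline.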

Run experiments

To run experiments on attitude detection and value selection, run:

python run_exps.py --task <task> --category <category> --model <model name> --prompt <zero/few> --reasoning

To run experiments on value extraction, run:

python run_exps_open.py

To evaluate attitude detection and value selection, run:

python evaluation.py --task <task> --category <category> --model <model name> --prompt <zero/few> --reasoning

To evaluate value extraction, run:

python evaluation_open.py --category <category> --model <model name>
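For attitude detection and value selection, run_exps.py and evaluation.py take the same arguments, so a sweep over categories and prompt settings can be generated programmatically. The sketch below makes several assumptions: that --prompt accepts the literal values zero and few, that --reasoning is an optional on/off flag, and that every category listed in the Dataset section is a valid --category value. The function name, task value, and model name are hypothetical placeholders.

```python
import shlex
from itertools import product
from typing import List

CATEGORIES = ["random", "political", "ethical", "religious",
              "social", "multiple", "human"]
PROMPTS = ["zero", "few"]  # assumed literal values for --prompt


def sweep_commands(task: str, model: str) -> List[List[str]]:
    """Build paired run/evaluate commands for every setting in the sweep."""
    commands = []
    for category, prompt, reasoning in product(CATEGORIES, PROMPTS, [False, True]):
        args = ["--task", task, "--category", category,
                "--model", model, "--prompt", prompt]
        if reasoning:
            # Assumed to be a boolean flag that can be omitted.
            args.append("--reasoning")
        # Run the experiment, then evaluate it with identical arguments.
        commands.append(["python", "run_exps.py", *args])
        commands.append(["python", "evaluation.py", *args])
    return commands


# Hypothetical task and model names, for illustration only.
for cmd in sweep_commands(task="<task>", model="<model name>")[:4]:
    print(shlex.join(cmd))
```

Emitting each evaluation command directly after its run command keeps the two argument lists in sync, which avoids evaluating a configuration that was never run.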
