The code in src-ipynb/data_prep.ipynb contains the data preparation for the training split, including removing bot comments and selecting, for each post, the comment with the highest argument-quality score according to Gretz et al. (2019).
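The selection step can be sketched as follows. This is a minimal, self-contained illustration, not the notebook's actual code; the field names `post_id`, `comment`, and `quality_score` are assumptions:

```python
def select_best_comments(rows):
    """Keep, for each post, the comment with the highest argument-quality
    score (scores as produced by a quality model in the spirit of
    Gretz et al. 2019). `rows` is an iterable of dicts."""
    best = {}
    for row in rows:
        current = best.get(row["post_id"])
        if current is None or row["quality_score"] > current["quality_score"]:
            best[row["post_id"]] = row
    return best

rows = [
    {"post_id": "p1", "comment": "c1", "quality_score": 0.42},
    {"post_id": "p1", "comment": "c2", "quality_score": 0.87},
    {"post_id": "p2", "comment": "c3", "quality_score": 0.65},
]
best = select_best_comments(rows)
```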
Notes:
- The full data can't fit in the zip file; instead, we provide a data sample in the sample-data folder.
- The stance-classification, target-identification, and argument-quality assessment models can be downloaded from this anonymized link: xxxx. Place them under the sample-data folder.
- To train the baseline model, execute the following command under the src-py folder:
CUDA_VISIBLE_DEVICES=0 python training_conclusion_and_ca_generation.py --train_data ../sample-data/preprocessed_train_conclusion_all.pkl --valid_data ../sample-data/conclusion_and_ca_generation/sample_valid_conclusion_all.pkl --output_dir ../sample-data/models/known-conc-model --train_bs=8 --valid_bs=8 --train_epochs=6 --premises_clm post --conclusion_clm title --counter_clm counter --max_source_length 512 --max_target_length 200 --unique_targets
- To train the conclusion generation model, execute the following command under the src-py folder:
CUDA_VISIBLE_DEVICES=0 python training_conclusion_and_ca_generation.py --train_data ../sample-data/preprocessed_train_conclusion_all.pkl --valid_data ../sample-data/sample_valid_conclusion_all.pkl --output_dir ../sample-data/models/conc-gen-model --train_bs=8 --valid_bs=8 --train_epochs=6 --premises_clm post --counter_clm title --max_source_length 512 --max_target_length 200 --unique_targets --masked_conclusion
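As a rough illustration of what `--masked_conclusion` implies, the source sequence can be built from the premises with the conclusion replaced by a mask token when it is withheld. The separator and mask tokens below are assumptions, not the training script's actual tokens:

```python
MASK = "<mask>"  # placeholder token; the script's actual mask token may differ

def build_source(premises, conclusion=None):
    """Prepend the conclusion (or a mask when it is withheld) to the premises."""
    head = conclusion if conclusion is not None else MASK
    return head + " </s> " + premises

# Known-conclusion variant vs. masked-conclusion variant.
known = build_source("Premise one. Premise two.", conclusion="We should act.")
masked = build_source("Premise one. Premise two.")
```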
- To generate conclusions for the posts for the pipeline-based approach, execute the code in ``notebooks/bart-generate-conclusions.ipynb``
- To train the stance classifier, follow the instructions in notebooks/stance-classification.ipynb
- To train the conclusion-target identifier, follow the instructions in notebooks/claim-target-identification.ipynb
- To train the model that generates the conclusion and counter in one sequence, execute the following command under the src-py folder:
CUDA_VISIBLE_DEVICES=0 python training_conclusion_and_ca_generation.py --train_data ../sample-data/preprocessed_train_conclusion_all.pkl --valid_data ../sample-data/sample_valid_conclusion_all.pkl --output_dir ../sample-data/models/pred-conc-model --train_bs=8 --valid_bs=8 --train_epochs=6 --premises_clm post --conclusion_clm title --counter_clm counter --conclusion_and_counter_generation --max_source_length 512 --max_target_length 512 --unique_targets
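With `--conclusion_and_counter_generation`, the target is presumably the conclusion and the counter concatenated into a single sequence (hence the larger `--max_target_length 512`). A hypothetical sketch of such a target; the separator token is an assumption:

```python
SEP = " </s> "  # assumed separator between conclusion and counter

def build_target(conclusion, counter):
    """Join conclusion and counter into one decoding target."""
    return conclusion + SEP + counter

def split_target(target):
    """Recover the two parts from a generated sequence."""
    conclusion, _, counter = target.partition(SEP)
    return conclusion, counter

target = build_target("We should act.", "However, acting now is premature.")
parts = split_target(target)
```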
- To train the joint model with two decoders, run the following command under the joint-model-two-decoders folder:
CUDA_VISIBLE_DEVICES=0 python train.py --output_dir ../sample-data/models/mt-v4.baseline_2/ --eval_steps 1000 --train_batch_size=4 --valid_batch_size=4 --max_input_length 256 --max_argument_target 256 --max_claim_target 32 > logging.log
- For fine-tuning the alpha1 and alpha2 parameters, run the following command:
CUDA_VISIBLE_DEVICES=0 python fine_tune_parameters.py --output_dir ../sample-data/models/mt-v4.fine_tune/ --eval_steps 200 --train_batch_size 8 --train_size 5000 --eval_size 1000 --num_epoch 1
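alpha1 and alpha2 presumably weight the two decoder losses (conclusion and counter) in the joint objective. A minimal sketch of such a weighted combination; the loss names are assumptions, not identifiers from fine_tune_parameters.py:

```python
def joint_loss(conclusion_loss, counter_loss, alpha1, alpha2):
    """Weighted sum of the two decoder losses, the quantity whose
    weights alpha1 and alpha2 the fine-tuning script searches over."""
    return alpha1 * conclusion_loss + alpha2 * counter_loss

# Example: weight the counter decoder more heavily than the conclusion decoder.
loss = joint_loss(2.0, 1.0, alpha1=0.3, alpha2=0.7)
```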
- To generate the baseline predictions, follow the instructions in src-ipynb/baseline_predictions.ipynb
- To generate predictions for the model, follow the instructions in the prompted-conclusion/jointly-prompted-conclusion-generation.ipynb notebook.
- To generate predictions for the pipeline-based model with the stance-based ranking component, follow the instructions in prompted-conclusion/pipelined-prompted-conclusion-generation.ipynb
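The stance-based ranking component can be thought of as scoring each candidate conclusion with the stance classifier and keeping the best-scoring one. A pure-Python sketch under that assumption; `stance_score` is a stand-in for the trained classifier, not the repository's API:

```python
def rank_by_stance(candidates, stance_score):
    """Return candidates sorted by stance score, best first.
    `stance_score` stands in for the trained stance classifier."""
    return sorted(candidates, key=stance_score, reverse=True)

# Toy stand-in scorer for illustration only: longer candidates score higher.
ranked = rank_by_stance(["a claim", "a much longer claim"], stance_score=len)
best = ranked[0]
```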
- To generate the predictions from this model, follow the instructions in the joint-model-two-decoders/predicting_counters.ipynb notebook.
Code for training and evaluating the stance classifier on the Kialo dataset can be found in notebooks/stance-classification.ipynb
To train the target extraction model and extract targets from the conclusions, follow the instructions in notebooks/claim-target-extraction.ipynb