Python 3.7 and PyTorch 1.9.1. The full dependency list is in requirements.txt, though some of the entries are redundant. //TODO
- Run `introduce_bias.py` to create csv files containing the biased dataset
- Run `train.py` with `IF_TUNE_HYPER_PARAM` set to True
- Inspect the output result from it (requires `WRITE_TEST_RESULT_TO_CSV` set to True)
- Test the selected model checkpoint using `train.py` with `IF_TUNE_HYPER_PARAM` and `WRITE_TEST_RESULT_TO_CSV` set to True
I previously ran these scripts on my local machine, so they do not take any command-line arguments. They can be run directly (python introduce_bias.py, python train.py).
Setting IF_TUNE_HYPER_PARAM to True makes the script use the validation data as the test set, so it outputs accuracy on the validation set. Setting it to False makes the script use the test set.
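The flag behaviour described above can be sketched as follows, assuming the flag in question is IF_TUNE_HYPER_PARAM. The function `select_eval_split` and the `splits` dictionary are hypothetical names for illustration, not taken from train.py.

```python
def select_eval_split(if_tune_hyper_param):
    """Return the name of the split that evaluation should run on.

    True  -> evaluate on the validation split (hyperparameter tuning)
    False -> evaluate on the held-out test split
    """
    return "val" if if_tune_hyper_param else "test"


def pick_eval_samples(splits, if_tune_hyper_param):
    """Pick the evaluation samples from a dict keyed by split name.

    `splits` is a hypothetical structure such as
    {"train": [...], "val": [...], "test": [...]}.
    """
    return splits[select_eval_split(if_tune_hyper_param)]


if __name__ == "__main__":
    # Made-up file names, for illustration only.
    splits = {"train": ["a.jpeg"], "val": ["b.jpeg"], "test": ["c.jpeg"]}
    print(pick_eval_samples(splits, if_tune_hyper_param=True))
    print(pick_eval_samples(splits, if_tune_hyper_param=False))
```

Keeping the choice in one small function makes it harder to report validation accuracy as test accuracy by accident.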
Full dataset: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia (its validation split is too small to be used for tuning)
The samples I selected from the original dataset: https://drive.google.com/drive/folders/1axoxRGx0hE61erdbbsvX4Vs8zzkYppIK?usp=sharing (unfortunately I did not keep the random seed, so the selection is not reproducible). After downloading them from Google Drive, or after making your own train-val split from the original dataset, please put the all_images dir under the root dir.
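If you make your own train-val split, fixing the seed keeps it reproducible. The following is a minimal sketch, not the procedure used for the shared samples. The function name `split_train_val`, the 20% validation fraction and the seed value are all assumptions.

```python
import random


def split_train_val(image_names, val_fraction=0.2, seed=0):
    """Split image names into (train, val) lists reproducibly.

    val_fraction and seed are illustrative defaults, not the values
    used in the original experiment (those were not recorded).
    """
    # Sort first so the result does not depend on the input order,
    # e.g. the order in which the file system lists the images.
    names = sorted(image_names)
    rng = random.Random(seed)  # local generator, leaves global state alone
    rng.shuffle(names)
    n_val = int(round(len(names) * val_fraction))
    return names[n_val:], names[:n_val]


if __name__ == "__main__":
    # Made-up file names, for illustration only.
    images = ["img_%02d.jpeg" % i for i in range(10)]
    train, val = split_train_val(images, val_fraction=0.2, seed=0)
    print(len(train), len(val))
```

Writing the seed and the resulting file lists to disk next to the csv files would let others rebuild exactly the same split.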
CXR_0_0_csv contains the names of the images used in the experiment.
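A quick way to check that every listed image is present under all_images is sketched below. The csv layout is an assumption: the sketch expects a header row and a column called `image_name`, which is a hypothetical name, and it treats all_images as a flat directory. Adjust both to match the actual files.

```python
import csv
from pathlib import Path


def read_image_names(lines, column="image_name"):
    """Collect image names from csv text.

    `lines` is any iterable of csv lines, for example an open file.
    The column name "image_name" is a hypothetical placeholder.
    """
    reader = csv.DictReader(lines)
    return [row[column] for row in reader if row.get(column)]


def find_missing(names, image_dir="all_images"):
    """Return the names that have no matching file under image_dir."""
    root = Path(image_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    # Made-up csv content, for illustration only.
    sample = ["image_name,label", "a.jpeg,1", "b.jpeg,0"]
    names = read_image_names(sample)
    print(names)
    print(find_missing(names))
```

To check a real file, pass an open file handle, for example `with open(csv_path, newline="") as f: names = read_image_names(f)`, where `csv_path` points at the csv file.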