StableMedBench

Dataset Access

The datasets used in this benchmark are access-protected and therefore must be downloaded from their respective sources.

Benchmark

For each TASK, run the relevant files under process_data and benchmarks; the dataset will be created under data/{TASK}. Set DATA_DIR to data/{TASK} for each task. A minimal sketch of this step is shown below.
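
The sketch below uses hypothetical script names (the exact files under process_data and benchmarks are task-specific) and assumes DATA_DIR is read from the environment; if it is set as a constant in the code instead, edit it there.

```bash
# Hypothetical file names -- substitute the scripts relevant to your TASK.
python process_data/prepare_{TASK}.py
python benchmarks/build_{TASK}.py

# Point DATA_DIR at the generated dataset for this task.
export DATA_DIR=data/{TASK}
```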

Models

  • For the classical models (XGBoost and Random Forest), run classical/trainer --task {TASK}.

    Additionally, to evaluate stability, run classical/stability --task {TASK} (see the sketch after this list).

  • For the transformer models GPT2, GPT2-AR, and Mamba, run python trainer_binary.py --task {TASK}.

    The steps for reproducing the tokenizer are under tokenizer.

    Optionally, to pre-train the model, run python pretrain/trainer.py, modifying its data loader to load the dataset you want to pre-train on (see the sketch after this list).

  • For LLMs, refer to the README.md in the llm directory. Note that we ran experiments on an Nvidia A100 80GB GPU, and the code is not optimized for other GPUs. PhysioNet's policies for the MIMIC dataset prevent naive use of API providers such as OpenAI or Claude; refer here for details.
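
A minimal sketch for the classical models, assuming the entry points are Python scripts (classical/trainer.py and classical/stability.py) run from the repository root with DATA_DIR exported as above:

```bash
export DATA_DIR=data/{TASK}

# Train XGBoost and Random Forest models for the task.
python classical/trainer.py --task {TASK}

# Additionally evaluate stability of the trained classical models.
python classical/stability.py --task {TASK}
```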
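
A corresponding sketch for the transformer models, assuming trainer_binary.py and pretrain/trainer.py are run from the repository root:

```bash
export DATA_DIR=data/{TASK}

# Optional: pre-train first; edit the loader in pretrain/trainer.py to
# select the dataset you want to pre-train on.
python pretrain/trainer.py

# Train GPT2, GPT2-AR, or Mamba on the binary task.
python trainer_binary.py --task {TASK}
```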
