Positive-Unlabeled (PU) learning benchmark code. This repository contains runnable code for training, sweeping, and collecting results for the paper.
Paper: Accessible, Realistic, and Fair Evaluation of Positive-Unlabeled Learning Algorithms (ICLR 2026)
Related Work: PU-Bench: A Unified Benchmark for Rigorous and Reproducible PU Learning (ICLR 2026) is a concurrent PU learning benchmark covering 19 methods across 9 datasets with a YAML-based configuration system. The two benchmarks are complementary in scope and design. See XiXiphus/PU-Bench.
- Python 3.10
- PyTorch, torchvision
- NumPy, PIL, pandas, scikit-learn
- (Optional for figures) matplotlib, seaborn
- CIFAR10: Set `--data_dir` to the directory containing the CIFAR10 dataset (e.g. `/path/to/CIFAR10/`).
- IMAGENETTE: Set `--data_dir` to the Imagenette data root.
- Letter / USPS: Set `--data_dir` to the directory containing the Letter or USPS data (as used in the paper).
- Creditcard: Place `creditcard.csv` in `data/Creditcard/` (or set `--data_dir` to that directory).
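Before launching a long sweep, it can save time to sanity-check that the data directory matches what the loaders expect. A minimal sketch, assuming only the layout stated above (the `creditcard.csv` name comes from this README; for the other datasets we only check that the directory exists, since the exact file layout is defined by the loaders in `data/`):

```python
from pathlib import Path

def check_data_dir(data_dir: str, dataset: str) -> bool:
    """Return True if data_dir looks usable for the given dataset.

    Illustrative only: mirrors the layout described in the README, not the
    repository's actual validation logic.
    """
    root = Path(data_dir)
    if dataset == "Creditcard":
        # The README states creditcard.csv must sit directly in the data dir.
        return (root / "creditcard.csv").is_file()
    # For CIFAR10 / IMAGENETTE / Letter / USPS, just confirm the root exists;
    # the dataset loaders raise a clearer error on a malformed layout.
    return root.is_dir()
```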
```
python -m train --data_dir /path/to/data/ --dataset CIFAR10 --algorithm uPU \
    --hparams_seed 0 --trial_seed 0 --seed 0 --output_dir ./results/tmp/run1 \
    --holdout_fraction 0.1 --skip_model_save --setting set1_1 --calibration False
```

```
python sweep.py launch --data_dir=./data/Letter/ --command_launcher multi_gpu \
    --n_hparams_from 0 --n_hparams 1 --n_trials_from 0 --n_trials 3 \
    --datasets Letter --algorithms uPU nnPU nnPU_GA VPU Dist_PU PUSB \
    --setting set1_1 set2_1 --output_dir=./results/tmp --skip_model_save --steps 20000
```

Use `--command_launcher local` for a single machine. Results are written under `output_dir` in per-run subdirectories.
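Conceptually, a sweep expands into one `python -m train` invocation per (algorithm, setting, trial) triple. The sketch below illustrates that expansion; the flag names are taken from the commands above, but the output-directory naming and everything else is an assumption, not `sweep.py`'s actual behavior:

```python
import itertools
import shlex

def expand_sweep(algorithms, settings, n_trials, data_dir, output_dir):
    """Build one training command per (algorithm, setting, trial) triple.

    Illustrative sketch of what a sweep launcher conceptually does; see
    sweep.py for the real job construction.
    """
    cmds = []
    for algo, setting, trial in itertools.product(
        algorithms, settings, range(n_trials)
    ):
        cmds.append(
            f"python -m train --data_dir {shlex.quote(data_dir)} "
            f"--algorithm {algo} --setting {setting} --trial_seed {trial} "
            # Hypothetical per-run subdirectory naming, for illustration only.
            f"--output_dir {shlex.quote(output_dir)}/{algo}_{setting}_t{trial}"
        )
    return cmds
```

For example, 6 algorithms × 2 settings × 3 trials yields 36 runs, which is why the multi-GPU launcher is the default in the sweep command above.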
After runs complete, aggregate results (and optionally output LaTeX):
```
python collect_results.py --input_dir ./results/tmp

# With LaTeX output:
python collect_results.py --input_dir ./results/tmp --latex
```

Redirect the latter to a file if you need a `.tex` summary.
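The aggregation step boils down to walking the per-run subdirectories and averaging a metric over trials. A hedged sketch, assuming (hypothetically) that each run directory contains a `results.json` with a scalar `"test_acc"` entry; `collect_results.py` is the canonical implementation and its file format may differ:

```python
import json
import statistics
from pathlib import Path

def aggregate(input_dir: str):
    """Return (mean, std) of test accuracy over all run subdirectories.

    Sketch only: assumes a results.json per run with a "test_acc" field,
    which is an assumption about the on-disk format, not a documented one.
    """
    accs = [
        json.loads(f.read_text())["test_acc"]
        for f in Path(input_dir).glob("*/results.json")
    ]
    if not accs:
        return None  # no completed runs found
    mean = statistics.mean(accs)
    std = statistics.stdev(accs) if len(accs) > 1 else 0.0
    return mean, std
```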
`--setting`: Predefined PU settings. `set1_1`–`set5_1` (and `set1_2`–`set5_2`) use one-sample style; `set6_1`–`set10_1` (and `set6_2`–`set10_2`) use two-sample style. See `train.py` and `lib/misc.py` for the exact definitions.
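To make the one-sample/two-sample distinction concrete, here is a generic textbook construction of both scenarios (this is not the repository's code; the mapping from `set*` names to concrete configurations lives in `train.py` and `lib/misc.py`):

```python
import numpy as np

rng = np.random.default_rng(0)

def one_sample_pu(x, y, label_frac=0.5):
    """One-sample (censoring) scenario: a single draw from p(x, y), where a
    random fraction of the positives keeps its label and everything else
    becomes unlabeled."""
    s = np.zeros(len(y), dtype=bool)  # s=True -> observed as labeled positive
    pos = np.flatnonzero(y == 1)
    keep = rng.random(len(pos)) < label_frac
    s[pos[keep]] = True
    return x, s

def two_sample_pu(x, y, n_p=100, n_u=200):
    """Two-sample (case-control) scenario: a labeled set drawn from p(x|y=1)
    and an unlabeled set drawn independently from the marginal p(x)."""
    pos = np.flatnonzero(y == 1)
    x_p = x[rng.choice(pos, size=n_p)]       # positives only
    x_u = x[rng.choice(len(x), size=n_u)]    # mixture of both classes
    return x_p, x_u
```

The practical difference is that in the two-sample case the labeled positives are not a subset of the unlabeled pool, which changes which risk estimators are unbiased.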
- `train.py`: Single training run.
- `sweep.py`: Launch many jobs (algorithms × settings × trials).
- `collect_results.py`: Aggregate run results from `output_dir`.
- `core/`: Algorithms and hyperparameter registry.
- `data/`: Dataset loaders and transforms.
- `lib/`: Utilities and reporting.