This is the updated source code for the paper Classification with Costly Features as a Sequential Decision-Making Problem by Jaromír Janisch, Tomáš Pevný and Viliam Lisý.
This version is enhanced with multiple options, namely:
- Lagrangian optimization of lambda
- a choice between an average and a hard budget
- support for missing features (NaNs in the training data)
- reweighting of the dataset
For the version accompanying Classification with Costly Features using Deep Reinforcement Learning, go to the master branch.
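To give an intuition for the Lagrangian optimization of lambda listed above, here is a minimal sketch of a projected gradient-ascent update on the multiplier. It is not taken from this repository: `update_lambda`, `toy_avg_cost` and the step size are hypothetical names and values, and the toy cost function only stands in for a trained agent whose spending drops as lambda grows.

```python
def update_lambda(lmbda, avg_cost, budget, step_size=0.01):
    """One projected gradient-ascent step on the Lagrange multiplier.

    lambda grows while the average cost exceeds the budget and shrinks
    otherwise; it is clipped at zero so it stays a valid multiplier.
    """
    return max(0.0, lmbda + step_size * (avg_cost - budget))


def toy_avg_cost(lmbda, max_cost=10.0):
    # Hypothetical stand-in for an agent: the higher the price of features
    # (lambda), the fewer features it buys on average.
    return max_cost / (1.0 + lmbda)


lmbda = 0.0
budget = 4.0
for _ in range(2000):
    lmbda = update_lambda(lmbda, toy_avg_cost(lmbda), budget, step_size=0.05)

# At convergence the toy agent spends (approximately) exactly the budget.
final_cost = toy_avg_cost(lmbda)
```

In the real setting the average cost would be estimated from the agent's episodes rather than from a closed-form function, so the update would be noisy and interleaved with training.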
Prerequisites:
- CUDA-capable hardware
- Ubuntu 16.04
- CUDA 8/9
- Python 3.6 (numpy, pandas, pytorch 0.4)
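The prerequisites above can be checked quickly before running anything. The helper below is a hypothetical convenience script, not part of this repository; it only reports what it finds and uses `torch.cuda.is_available()` when PyTorch is importable.

```python
import importlib
import sys


def check_environment(required=("numpy", "pandas", "torch")):
    """Report the Python version, package versions and CUDA visibility."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in required:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            report[name] = None  # package is not installed
    # Only query CUDA if PyTorch was imported successfully above.
    torch = sys.modules.get("torch") if report.get("torch") else None
    report["cuda_available"] = bool(torch and torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{}: {}".format(key, value))
```

A value of `None` for a package means it could not be imported; `cuda_available: False` means either PyTorch is missing or it does not see a CUDA device.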
Usage:
- use the tools `tools/conv_*.py` to prepare datasets; read the headers of those files; data is expected to be in `../data`
- pretrained HPC models are in `trained_hpc`, or you can use `tools/hpc_svm.py` to recreate them; they are needed in `../data`
- run `python3.6 main.py [dataset] [target]`, choose `dataset` from `config_datasets/`
- set `-target_type` to `lambda` or `cost`, the latter automatically finds a suitable lambda with Lagrangian optimization (see this article)
- set `-hard_budget` for a strict budget per sample (default is average budget)
- run `python3.6 main.py --help` for additional options
- the run will create multiple log files `run*.dat`
- you can use Octave or MATLAB to analyze them with `tools/debug.m`
- you can also evaluate the agent on the test set with `eval.py --dataset [dataset] --flambda [lambda]`
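The difference between the default average budget and the strict per-sample budget selected with `-hard_budget` can be illustrated with a small sketch. This is only an illustration of the two constraints, assuming a list of total feature costs spent per sample; the function names are hypothetical and do not come from this repository.

```python
def within_average_budget(costs, budget):
    """Average budget: the mean cost over all samples must not exceed it."""
    return sum(costs) / len(costs) <= budget


def within_hard_budget(costs, budget):
    """Hard budget: every single sample must stay within the budget."""
    return all(cost <= budget for cost in costs)


# Hypothetical total feature cost spent on each of three samples.
costs = [1.0, 2.0, 9.0]
budget = 4.0

average_ok = within_average_budget(costs, budget)
hard_ok = within_hard_budget(costs, budget)
```

Here the expensive third sample is compensated by the two cheap ones under the average budget, but it violates the hard budget on its own.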