This repository contains the implementation for the paper "Distance Estimation for High-Dimensional Discrete Distributions", presented at AISTATS 2025.
Given two distributions P and Q over a high-dimensional domain
- Clone DistEsti:
    git clone https://github.com/meelgroup/distesti
    cd distesti
- Requirements:
- Python 3.x
- Python libraries: numpy, gmpy2
- External Binaries: The tool relies on several external samplers and solvers. These are expected to be compiled and available in a samplers/ subdirectory or accessible in the system PATH. See the Dependencies section below.
- Install Python Dependencies:
    pip install numpy gmpy2
    # Or using a virtual environment (recommended)
    # python3 -m venv distesti-venv
    # source distesti-venv/bin/activate
    # pip install numpy gmpy2
- Build/Configure External Dependencies:
- You will need to obtain and compile the external samplers/solvers listed below (e.g., UniGen, QuickSampler, CMSGen).
- (Optional: Create a script similar to Manthan's configure_dependencies.sh to automate this process.) Ensure the compiled binaries are placed in a samplers/ directory or are otherwise accessible.
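Before running the tool, it can help to confirm that the Python libraries and sampler binaries are actually reachable. The following is a minimal sketch of such a check, not part of the repository. The executable names in `ASSUMED_BINARIES` and the helper names `find_binary` and `missing_python_libs` are hypothetical; adjust them to match how each sampler was built and named on your system.

```python
import importlib.util
import shutil
from pathlib import Path

# Hypothetical executable names (assumption); rename to match your builds.
ASSUMED_BINARIES = ["unigen", "quicksampler", "cmsgen", "spur"]
PYTHON_LIBS = ["numpy", "gmpy2"]


def find_binary(name, samplers_dir="samplers"):
    """Return a path to `name` inside samplers_dir or on PATH, else None."""
    candidate = Path(samplers_dir) / name
    if candidate.is_file():
        return str(candidate)
    # Fall back to a PATH lookup.
    return shutil.which(name)


def missing_python_libs(libs=PYTHON_LIBS):
    """Return the libraries that cannot be found in this environment."""
    return [lib for lib in libs if importlib.util.find_spec(lib) is None]


if __name__ == "__main__":
    for lib in missing_python_libs():
        print(f"missing Python library: {lib}")
    for name in ASSUMED_BINARIES:
        print(f"{name}: {find_binary(name) or 'NOT FOUND'}")
```

The check looks in the samplers/ directory first and then falls back to the system PATH, mirroring the two locations mentioned above.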
DistEsti depends on:
- Python Libraries: numpy, gmpy2
- External Samplers/Solvers: (Assumed to be placed in samplers/)
- UniGen/AppMC3
- QuickSampler (and its dependency z3)
- STS
- CMSGen
- SPUR
- (Optionally, for related functionalities or specific configurations mentioned in internal code: sharpSAT, d4/Dsharp_PCompile)
(Optional: Activate virtual environment if used: source distesti-venv/bin/activate)
Sampler IDs:
- 1: UniGen3
- 2: QuickSampler
- 3: STS
- 4: CMSGen
- 5: AppMC3
- 6: SPUR
- 7: WCMS
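For scripting, the ID table above can be mirrored in a small lookup. This is an illustrative sketch only: the dictionary restates the list above, while `SAMPLER_IDS` and `sampler_id` are hypothetical names that do not come from the DistEsti code.

```python
# Sampler IDs exactly as listed in this README.
SAMPLER_IDS = {
    1: "UniGen3",
    2: "QuickSampler",
    3: "STS",
    4: "CMSGen",
    5: "AppMC3",
    6: "SPUR",
    7: "WCMS",
}


def sampler_id(name):
    """Return the numeric ID for a sampler name (case-insensitive)."""
    for sid, sname in SAMPLER_IDS.items():
        if sname.lower() == name.lower():
            return sid
    raise ValueError(f"unknown sampler: {name!r}")
```

For instance, `sampler_id("cmsgen")` gives the value to pass as `--sampler1` or `--sampler2` when selecting CMSGen.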
Example Usage:
- Using protoclon.py:
    python protoclon.py --sampler1 <ID> --sampler2 <ID> --epsilon <float> --delta <float> --seed <int> <input1.cnf> [<input2.cnf>]
  Example:
    python protoclon.py --sampler1 4 --sampler2 4 --epsilon 0.5 --delta 0.4 --seed 123 tests/20_0_0.cnf tests/20_0_1.cnf
See python <script_name>.py --help for more options.
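When running many sampler pairs or benchmarks, the invocation above can be assembled programmatically. The sketch below only uses the flags shown in the usage line (`--sampler1`, `--sampler2`, `--epsilon`, `--delta`, `--seed`). The helpers `build_command` and `run` are hypothetical wrappers, and the ID range check assumes the seven sampler IDs listed earlier.

```python
import subprocess

VALID_SAMPLER_IDS = range(1, 8)  # IDs 1-7 from the sampler list (assumption)


def build_command(sampler1, sampler2, epsilon, delta, seed, inputs,
                  script="protoclon.py", python="python"):
    """Build the argv list for one run; `inputs` holds one or two CNF paths."""
    if sampler1 not in VALID_SAMPLER_IDS or sampler2 not in VALID_SAMPLER_IDS:
        raise ValueError("sampler IDs must be between 1 and 7")
    if not 1 <= len(inputs) <= 2:
        raise ValueError("expected one or two input CNF files")
    return [
        python, script,
        "--sampler1", str(sampler1),
        "--sampler2", str(sampler2),
        "--epsilon", str(epsilon),
        "--delta", str(delta),
        "--seed", str(seed),
        *inputs,
    ]


def run(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(cmd, check=True)


# Same arguments as the example invocation above.
cmd = build_command(4, 4, 0.5, 0.4, 123,
                    ["tests/20_0_0.cnf", "tests/20_0_1.cnf"])
print(" ".join(cmd))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file paths. Call `run(cmd)` from the repository root once the samplers are in place.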
Sample benchmarks and generation scripts can be found in the tests/ directory.