
Compact Decomposition of Irregular Tensors for Data Compression: From Sparse to Dense to High-Order Tensors

This repository is the official implementation of Compact Decomposition of Irregular Tensors for Data Compression: From Sparse to Dense to High-Order Tensors by Taehyung Kwon, Jihoon Ko, Jinhong Jung, Jun-Gi Jang, and Kijung Shin, published at KDD 2024.

Requirements

Please see requirements.txt:

numpy==1.21.6
scipy==1.7.3
torch==1.13.1
tqdm==4.51.0

Input formats

Please download and check the datasets below for more details.

Sparse irregular tensors

The input should be a pickle file (.pickle) containing a dictionary, where 'idx' stores the indices of the non-zero entries and 'val' stores their values.
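
As a minimal sketch (not part of the repository), the snippet below shows how such a file could be created. The 'idx'/'val' keys follow the description above, while the array shapes (one row of mode indices per non-zero entry), the random data, and the file name are assumptions for illustration.

  import pickle
  import numpy as np

  num_nonzeros = 1000
  order = 3  # number of tensor modes (placeholder)

  # 'idx': one row of mode indices per non-zero entry (shape assumed),
  # 'val': the corresponding non-zero values.
  tensor = {
      "idx": np.random.randint(0, 100, size=(num_nonzeros, order)),
      "val": np.random.rand(num_nonzeros),
  }

  with open("my_sparse_tensor.pickle", "wb") as f:  # hypothetical file name
      pickle.dump(tensor, f)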

Dense irregular tensors

The input should be a numpy array (.npy) in which each entry contains one slice of the irregular tensor.
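
The following sketch illustrates one way such a file could be built, assuming a numpy object array whose entries are 2-D slices with differing (irregular) first-mode lengths; the slice count, column dimension, and file name are placeholders, not values used by the repository.

  import numpy as np

  num_slices = 10   # number of slices (placeholder)
  num_cols = 88     # shared column dimension across slices (assumption)

  # Each entry holds one 2-D slice; slice heights differ, making the tensor irregular.
  slices = np.empty(num_slices, dtype=object)
  for i in range(num_slices):
      rows = np.random.randint(5, 20)
      slices[i] = np.random.rand(rows, num_cols)

  np.save("my_dense_tensor.npy", slices, allow_pickle=True)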

Running Light-IT and Light-IT++

The training processes of Light-IT and Light-IT++ are implemented in main.py.

Positional arguments

  • action: train_cp to run only Light-IT; train to run both Light-IT and Light-IT++.
  • -tp, --tensor_path: file path of an irregular tensor. The file should be a pickle file ('.pickle') for sparse tensors or a numpy file ('.npy') for dense tensors.
  • -op, --output_path: output path for saving the parameters and fitness.
  • -r, --rank: rank of the model.
  • -d, --is_dense: 'True' when the input tensor is dense, 'False' when the input tensor is sparse.
  • -e, --epoch: Number of epochs for Light-IT.
  • -lr, --lr: Learning rate for Light-IT.
  • -s, --seed: Seed of execution.

Optional arguments (common)

  • -de, --device: GPU id for execution.
  • -b, --batch: Batch size for computation in Light-IT.
  • -bnz, --batch_nz: Batch size for computing the loss (corresponding to the non-zero entries) of Light-IT. Please reduce the batch sizes (-b, -bnz) when an out-of-memory (OOM) error occurs on the GPU!

Optional arguments (Light-IT++)

  • -ea, --epoch_als: Number of epochs for Light-IT++.

Example command

  # Run Light-IT only
  python 23-Irregular-Tensor/main.py train_cp -tp ../input/23-Irregular-Tensor/usstock.npy -op output/usstock_r4_s0_lr0.01 -r 4 -d True -de 0 -lr 0.01 -e 500 -s 0

  # Run Light-IT and Light-IT++
  python 23-Irregular-Tensor/main.py train -tp ../input/23-Irregular-Tensor/usstock.npy -op output/usstock_r4_s0_lr0.01 -r 4 -d True -de 0 -lr 0.01 -e 500 -s 0 

Example output

  • usstock_r4_s0_lr0.01.txt: saves the running time and fitness.
  • usstock_r4_s0_lr0.01_cp.pt: saves the parameters of Light-IT.
  • usstock_r4_s0_lr0.01.pt: saves the parameters of Light-IT++.
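
If you want to inspect the saved parameters, a minimal sketch along the following lines may help. It assumes the '.pt' files are ordinary PyTorch checkpoints; the exact structure of the stored object depends on main.py.

  import torch

  # Load the Light-IT parameters saved above (path from the example output).
  ckpt = torch.load("output/usstock_r4_s0_lr0.01_cp.pt", map_location="cpu")

  print(type(ckpt))
  if isinstance(ckpt, dict):
      # Print each stored entry and its shape, when it is a tensor.
      for key, value in ckpt.items():
          print(key, getattr(value, "shape", type(value)))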

Checking the compressed size of the parameters

Checking the compressed sizes of the Light-IT and Light-IT++ parameters is implemented in huffman.py.

Positional arguments

  • -tp, -r, -d, -de, -bz, -bnz: same as when running Light-IT and Light-IT++.
  • -rp, --result_path: path for the '.pt' file.
  • -cp, --is_cp: "True" when using the output of Light-IT, "False" when using the output of Light-IT++.

Example command

python huffman.py -tp ../data/23-Irregular-Tensor/cms.pickle -rp results/cms-lr0.01-rank5.pt -cp False -r 5 -de 0 -d False

Real-world datasets we used

| Name | N_max | N_avg | Size (except the 1st mode) | Order | Density | Source | Download Link |
|------|-------|-------|----------------------------|-------|---------|--------|---------------|
| CMS | 175 | 35.4 | 284 x 91,586 | 3 | 0.00501 | US government | Link |
| MIMIC-III | 280 | 12.3 | 1,000 x 37,163 | 3 | 0.00733 | MIMIC-III Clinical Database | Link |
| Korea-stock | 5,270 | 3696.5 | 88 x 1,000 | 3 | 0.998 | DPar2 | Link |
| US-stock | 7,883 | 3912.6 | 88 x 1,000 | 3 | 1 | DPar2 | Link |
| Enron | 554 | 80.6 | 1,000 x 1,000 x 939 | 4 | 0.0000693 | FROSTT | Link |
| Delicious | 312 | 16.4 | 1,000 x 1,000 x 31,311 | 4 | 0.00000397 | FROSTT | Link |
