Sandwiched Compression

Copyright 2024 Google LLC

Google Illuminate Overview Podcast (3 min)

Color over Grayscale Codec

Grayscale Codec

Hires over Lowres Codec

Lowres Codec

MSE-Codec Optimized for LPIPS Loss

Lpips loss

Overview

Sandwiched compression augments a standards-based codec with pre-processor and post-processor neural networks. The primary goal is to adapt the codec to data and use cases that are outside of the codec’s design targets. Examples include:

  • Transporting high-resolution images/video over codecs that can only transport low-resolution.
  • Transporting high-bit-depth (10, 12-bit) data over codecs that can only transport 8-bit.
  • Catering to applications where the data will be judged by a sophisticated metric different from the codec’s native metric (typically PSNR), such as LPIPS, VMAF, or SSIM.
  • Transporting texture maps that will be used to render graphical models with view/lighting dependent metrics imposed.
  • Compressing data that will feed further computations, for example ARCore performing SLAM with features it calculates on compressed images from a wearable, cases of calculating depth from stereo, …
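For reference, the PSNR mentioned in the list above can be computed from the mean squared error as 10·log10(peak² / MSE). The following is a minimal sketch in plain Python, not code from this repository; the function names `mse` and `psnr` and the flat-list image representation are assumptions made for illustration.

```python
import math


def mse(reference, decoded):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is the maximum pixel value."""
    err = mse(reference, decoded)
    if err == 0:
        return math.inf  # Identical inputs: distortion is zero.
    return 10.0 * math.log10(peak ** 2 / err)


# Hypothetical 8-bit pixel values, each off by 2.
original = [10, 20, 30, 40]
reconstructed = [12, 18, 32, 38]
print(f"MSE = {mse(original, reconstructed):.1f}, PSNR = {psnr(original, reconstructed):.2f} dB")
```

Metrics such as LPIPS, VMAF, or SSIM do not reduce to a per-pixel formula like this, which is one reason a codec tuned for PSNR can benefit from learned wrappers.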

The code uses a differentiable proxy for the standard codec, which lets gradients pass through the codec during training. The pre- and post-processors are standard U-Nets. They are trained jointly and typically implement a message passing strategy between them.
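To make the idea of a codec proxy concrete, here is a toy sketch in plain Python for the high-bit-depth scenario. It is not the repository's `EncodeDecodeIntra` or `JpegProxy`: the "codec" is reduced to uniform scalar quantization, the training-time proxy is additive uniform noise (one common differentiable stand-in for rounding, assumed here for illustration), and the pre- and post-processors are fixed scalings instead of trained U-Nets. All function names are hypothetical.

```python
import math
import random


def hard_quantize(x, step):
    """Uniform scalar quantization, as a real codec would apply it (not differentiable)."""
    return math.floor(x / step + 0.5) * step


def noisy_quantize_proxy(x, step, rng):
    """Training-time stand-in: additive uniform noise in [-step/2, step/2].

    Unlike rounding, this has derivative 1 with respect to x.
    """
    return x + rng.uniform(-step / 2.0, step / 2.0)


def sandwich_forward(pixels, pre, post, step, rng=None):
    """Run pre-processor -> codec (or proxy) -> post-processor on a flat pixel list.

    With rng=None the hard quantizer is used; otherwise the noise proxy is used.
    Returns the bottleneck (what the codec transports) and the final output.
    """
    bottleneck = [pre(p) for p in pixels]
    if rng is None:
        decoded = [hard_quantize(b, step) for b in bottleneck]
    else:
        decoded = [noisy_quantize_proxy(b, step, rng) for b in bottleneck]
    return bottleneck, [post(d) for d in decoded]


# Hypothetical stand-ins for the trained networks: map 10-bit data into the
# 8-bit range before the codec and back again (with clipping) after it.
pre = lambda p: p / 4.0
post = lambda d: min(1023.0, d * 4.0)

pixels_10bit = [0, 1, 513, 1023]
bottleneck, reconstructed = sandwich_forward(pixels_10bit, pre, post, step=1.0)
print("bottleneck:", bottleneck)
print("reconstructed:", reconstructed)
```

In the actual models the fixed scalings are replaced by networks that learn what to place in the bottleneck, which is where the message passing between pre- and post-processor comes from.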

In image/video compression scenarios, a useful property of this approach is that the pre-processor must output images/video, i.e., visual data that the standard codec can transport. We can therefore inspect these intermediate images/video (termed bottlenecks) to get an idea of what the networks are trying to accomplish.

Release

The full sandwich image and video compression models are included in this release.

Manifest

  • distortion/distortion_fns.py: Distortion functions to use in distortion-rate optimization.
  • image_compression/encode_decode_intra_lib.py: Includes class EncodeDecodeIntra which contains the differentiable image proxy.
  • image_compression/jpeg_proxy.py: Includes class JpegProxy which supports EncodeDecodeIntra.
  • pre_post_models/unet.py: Simple unet model for pre-post-processors.
  • utilities/serialization.py: Checkpoint management routines.
  • compress_intra_model.py: Sandwich model for image compression.
  • compress_video_model.py: Sandwich model for video compression.
  • datasets.py: Basic loaders for tensorflow datasets.
  • image_codec_proxy.ipynb: Colab for basic image codec proxy usage.
  • sandwich_image_compression_grayscale_codec.ipynb: Colab for the image grayscale codec scenario with example training and results.
  • sandwich_image_compression_lowres_codec.ipynb: Colab for the image lowres codec scenario with example training and results.
  • sandwich_video_compression_grayscale_codec.ipynb: Colab for the video grayscale codec scenario with example training and results.

Usage for Image Compression

Please see sandwich_image_compression_lowres_codec.ipynb and sandwich_image_compression_grayscale_codec.ipynb for two of the many scenarios discussed in the paper. A third colab, image_codec_proxy.ipynb, shows the usage of the image codec proxy, which you can try in your own sandwich implementations.

Colab: Sandwich Image Compression Lowres Codec

Colab: Sandwich Image Compression Grayscale Codec

Colab: Sandwich Image Compression Image Codec Proxy

The links above will open and run the colabs on the community server, which may be too slow for realistic training. Please consider using your own colab setup instead. Beyond the included software, that setup needs tensorflow-datasets (with the 'clic' dataset downloaded) and mediapy. Your tensorflow installation may already include tensorflow-datasets, in which case 'clic' should download automatically the first time you run the colabs. Please see the colabs for needed installations and links.
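Before running the colabs in your own environment, it can help to confirm that the extra dependencies are importable. The following is a small sketch using only the Python standard library; the helper name `missing_modules` is hypothetical, and `tensorflow_datasets` and `mediapy` are the import names of the two packages mentioned above.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["tensorflow", "tensorflow_datasets", "mediapy"]
missing = missing_modules(required)
if missing:
    print("Please install:", ", ".join(missing))
else:
    print("All required modules are available.")
```

The check only confirms that the modules can be located. It does not download the 'clic' dataset, which still happens on the first run of the colabs.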

Usage for Video Compression

Please see sandwich_video_compression_lowres_codec.ipynb and sandwich_video_compression_grayscale_codec.ipynb for two of the many scenarios discussed in the paper. You will need to download the example dataset. This is a limited dataset compiled from legacy video sequences for standards-based compression. Please consider extending it significantly for training production models.

Example Video Dataset

Colab: Sandwich Video Compression Lowres Codec

Colab: Sandwich Video Compression Grayscale Codec

Usage for Example Video Dataset

import tensorflow as tf

# datasets.py from this repository; adjust the import path to your setup.
import datasets

dataset_path = '<path to downloaded dataset>/*'


def dataset_fn(
    batch_size: int, is_training: bool, take_count: int = 100
) -> tf.data.Dataset:
  """Loads the example video dataset, limited to take_count batches."""
  return datasets.load_video_dataset(
      path=dataset_path,
      batch_size=batch_size,
      is_training=is_training,
  ).take(take_count)

# Samples are in Dict[str, tf.Tensor] format. Please see datasets._video_data for keys.
# Please see video colabs for use with the sandwich models.
train_batch_size = 3
train_dataset = dataset_fn(train_batch_size, True)  # Pull from train split.
eval_batch_size = 1
eval_dataset = dataset_fn(eval_batch_size, False)  # Pull from eval split.
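Since samples arrive as dictionaries of tensors, a quick way to see what a batch contains is to list each key with its shape and dtype. The helper below is a sketch, not part of this repository: the name `describe_sample` is hypothetical, it relies only on the values exposing `.shape` and `.dtype` (as `tf.Tensor` does), and the key `"video"` and the shape in the stand-in sample are made up for illustration.

```python
from types import SimpleNamespace


def describe_sample(sample):
    """Map each key of a dict-style sample to (shape tuple, dtype name)."""
    return {
        key: (tuple(value.shape), getattr(value.dtype, "name", str(value.dtype)))
        for key, value in sorted(sample.items())
    }


# Stand-in for one batch, so the sketch runs without TensorFlow or the dataset.
fake_sample = {
    "video": SimpleNamespace(
        shape=(3, 5, 64, 64, 3), dtype=SimpleNamespace(name="float32")
    )
}
print(describe_sample(fake_sample))
```

With the real dataset, the same helper could be applied to the first batch, for example `for sample in train_dataset.take(1): print(describe_sample(sample))`.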

References

Image and Video Compression

@article{guleryuz2024sandwiched,
  title={Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers},
  author={Onur G. Guleryuz and Philip A. Chou and Berivan Isik and Hugues Hoppe and Danhang Tang and Ruofei Du and Jonathan Taylor and Philip Davidson and Sean Fanello},
  journal={arXiv preprint arXiv:2402.05887},
  year={2024}
}
