ChromaDistill: Colorizing Monochrome Radiance Fields with Knowledge Distillation

Vision and AI Lab, Indian Institute of Science · Samsung R&D Institute India - Bangalore

WACV 2025

(a) Overview of our method. Given multi-view grey-scale input images, the proposed approach "ChromaDistill" generates 3D-consistent colorized views. Two colorized novel views, (b) and (e), produced by I. an image-colorization baseline, II. a video-colorization baseline, and III. our approach on the "playground" scene from the LLFF [20] dataset. State-of-the-art colorization baselines generate 3D-inconsistent novel views, as shown in the zoomed-in regions in (c) and (d).

Abstract

Neural radiance field (NeRF) and Gaussian-Splatting based methods enable high-quality novel-view synthesis for multi-view images. The question arises: Can these representations generate colorized novel views when provided with monochromatic (grey-scale) inputs? Beyond its aesthetic significance in portraying the world, color plays a pivotal role in downstream applications such as open-set scene decomposition. This work presents a method for synthesizing colorized novel views from grey-scale multi-view input images. Applying image- or video-based colorization techniques to generated grey-scale novel views produces artifacts arising from inconsistencies across views. Even training a radiance field network on colorized grey-scale image sequences fails to resolve the 3D consistency issue. We propose a distillation-based approach that leverages knowledge from colorization networks trained on natural images and transfers it to the chosen 3D representation. Specifically, our method uses a radiance field network as the 3D representation and transfers knowledge from existing 2D colorization methods. This strategy introduces no additional weights or computational overhead to the original representation during inference. The experimental results demonstrate the superiority of our proposed method in generating high-quality colorized novel views for indoor and outdoor scenes, showing notable advantages in cross-view consistency over baseline approaches. Additionally, we illustrate the seamless extension of our method to a Gaussian-Splatting representation.
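To make the distillation idea concrete, below is a minimal sketch (not the authors' exact training code) of how a frozen 2D colorization network can supervise the colour output of a 3D representation: the 3D model is rendered at a training view, the 2D colorizer produces a pseudo colour target from the grey-scale image, and only the 3D representation's parameters are optimized. The `radiance_field.render` and `colorizer` interfaces, the luminance proxy, and the loss weighting are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn.functional as F


def chroma_distill_step(radiance_field,   # 3D representation (NeRF / 3DGS) with an RGB output head (assumed API)
                        colorizer,        # frozen 2D colorization network: grey image -> RGB image (assumed API)
                        camera,           # camera parameters of the training view
                        grey_gt,          # (H, W, 1) ground-truth grey-scale image
                        optimizer,
                        lambda_color=1.0):
    """One hypothetical optimization step combining a grey-scale reconstruction
    loss with a colour-distillation loss from the frozen 2D teacher."""
    optimizer.zero_grad()

    # Render an RGB image from the current 3D representation for this camera.
    rgb_pred = radiance_field.render(camera)              # (H, W, 3)

    # The rendering's luminance should match the grey-scale input (reconstruction term).
    lum_pred = rgb_pred.mean(dim=-1, keepdim=True)        # simple luminance proxy
    loss_lum = F.l1_loss(lum_pred, grey_gt)

    # The frozen 2D colorizer provides a pseudo colour target; no gradients flow into it.
    with torch.no_grad():
        rgb_teacher = colorizer(grey_gt)                   # (H, W, 3)

    # Distillation term: pull the rendered colours towards the teacher's colours.
    loss_color = F.l1_loss(rgb_pred, rgb_teacher)

    loss = loss_lum + lambda_color * loss_color
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the teacher is only queried during training, the 3D representation keeps its original size and rendering cost at inference time, consistent with the "no additional weights or computational overhead" claim above.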

Further, we validate the efficacy of our approach in diverse applications, notably the colorization of radiance field networks trained from two distinct sources: (1) infrared (IR) multi-view images and (2) legacy grey-scale multi-view image sequences.

Video

Video results are shown on the Leaves, Cake, Truck, Pasta, M60, and Playground scenes.