
Ph.D. Student
Department of Computer Science
National Yang Ming Chiao Tung University, Taiwan

Software Engineer, Pixel Camera Team, Google

Email: [email protected]

CV / Google Scholar / LinkedIn / GitHub / Personal Blog

About Me

Hi! I’m Jie-Ying (Jay) Lee. I am a Ph.D. student in Computer Science at National Yang Ming Chiao Tung University, advised by Prof. Yu-Lun Liu. I am also a Software Engineer on Google’s Pixel Camera Team, where I develop on-device camera algorithms.

I received my B.S. in Computer Science from National Yang Ming Chiao Tung University. During my undergraduate studies, I was an exchange student at ETH Zurich.

In Summer 2024, I interned with Google’s Pixel Camera Team, where I integrated the Segment Anything Model (SAM) for mobile devices, hosted by Yu-Lin Chang and Chung-Kai Hsieh. My industry experience also includes positions as an R&D Intern at Microsoft and a Backend Engineer Intern at Appier.

I am actively seeking research collaborations.

Outside of work and research, I enjoy badminton, dance, and photography.

Research Interests

News

Publications

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
Creating large-scale, photorealistic 3D urban scenes traditionally requires expensive 3D scanning and manual annotation. We present Skyfall-GS, a novel framework that synthesizes city-block-scale environments by combining satellite imagery with diffusion models, enabling real-time exploration without costly 3D annotations.
LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal
LightsOut is a diffusion-based outpainting framework that enhances single-image flare removal (SIFR) by reconstructing off-frame light sources, leveraging a multitask regression module and a LoRA fine-tuned diffusion model to produce realistic and physically consistent outpainting results.
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation
This work presents See, Point, Fly (SPF), a training-free aerial vision-and-language navigation (AVLN) framework built atop vision-language models (VLMs), which casts action prediction for AVLN as a 2D spatial grounding task.
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting
AuraFusion360 introduces (1) depth-aware unseen mask generation for accurate occlusion identification, (2) Adaptive Guided Depth Diffusion, a zero-shot method for accurate initial point placement without requiring additional training, and (3) SDEdit-based detail enhancement for multi-view coherence.
SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes
SpectroMotion is a novel approach that combines 3D Gaussian Splatting (3DGS) with physically-based rendering (PBR) and deformation fields to reconstruct dynamic specular scenes, and it is the only existing 3DGS method capable of synthesizing photorealistic real-world dynamic specular scenes.
BoostMVSNeRFs: Boosting MVS-based NeRFs to Generalizable View Synthesis in Large-scale Scenes
BoostMVSNeRFs enhances the rendering quality of MVS-based NeRFs in large-scale scenes, addressing limitations of existing MVS-based NeRF methods such as restricted viewport coverage and artifacts caused by limited input views.