Videet — AI Engineer · CV Researcher

[Figure: Input | Predicted]

78.4 mIoU · 62 FPS · 12M Params

Abstract

We propose a lightweight architecture for semantic segmentation in extremely low-light conditions. The model uses a two-branch design — a noise-aware backbone and a structure-preserving decoder — trained jointly on a curated dataset of paired low-light / well-lit scenes.

Approach

The key insight is that low-light scenes are not simply "dark" — they have a different statistical structure. Photon noise dominates, and the usual assumptions about local smoothness break down. We model this explicitly with a noise-aware feature extractor that learns to disentangle scene structure from sensor artifacts.
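As a rough illustration of why photon noise dominates in the dark, the sketch below simulates photon counting on a flat patch under a pure Poisson (shot-noise) model. That model is an assumption made for illustration, not the sensor model used in this work, and the helper names `poisson_sample` and `empirical_snr` and the photon levels are hypothetical. For Poisson counts the signal-to-noise ratio grows as the square root of the mean photon count, so a dim patch is far noisier relative to its signal than a bright one.

```python
import math
import random
import statistics


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def empirical_snr(rng, mean_photons, num_pixels):
    """Mean / standard deviation of simulated counts on a flat patch."""
    counts = [poisson_sample(rng, mean_photons) for _ in range(num_pixels)]
    return statistics.fmean(counts) / statistics.pstdev(counts)


rng = random.Random(0)
# Hypothetical photon levels: a very dark patch and a well-lit one.
for mean_photons in (4.0, 256.0):
    snr = empirical_snr(rng, mean_photons, 5000)
    print(f"mean photons {mean_photons:6.1f}: "
          f"SNR ~ {snr:.2f} (theory {math.sqrt(mean_photons):.2f})")
```

Because the noise variance here equals the signal itself, neighbouring pixels of the same surface can differ by a large fraction of their value at low photon counts, which is one way to see why local-smoothness assumptions become unreliable.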

Results

Our model achieves 78.4 mIoU on the LL-Cityscapes benchmark at 62 FPS on consumer hardware, with only 12M parameters. That's a 4.2-point mIoU improvement over the previous state of the art, at one-third the parameter count.
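For readers unfamiliar with the metric, here is a minimal sketch of mean intersection-over-union (mIoU) on flat lists of class labels. It uses the standard definition and skips classes that appear in neither the prediction nor the ground truth. The benchmark's exact evaluation protocol (ignored labels, class list) is not stated here, so treat `mean_iou` and the toy labels as illustrative assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Mean per-class IoU over classes present in pred or target."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Mismatch: the pixel enlarges the union of both classes.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no classes present in pred or target")
    return sum(ious) / len(ious)


# Toy example with three classes and six pixels (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(f"mIoU = {mean_iou(pred, target, 3):.3f}")

# Per-frame time budget implied by the reported 62 FPS.
print(f"frame budget = {1000 / 62:.1f} ms")
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. Reported scores such as 78.4 are this quantity expressed as a percentage.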

Low-Light Semantic Segmentation
