Abstract
We propose a lightweight architecture for semantic segmentation in extremely low-light conditions. The model uses a two-branch design — a noise-aware backbone and a structure-preserving decoder — trained jointly on a curated dataset of paired low-light / well-lit scenes.
Approach
The key insight is that low-light scenes are not simply "dark" — they have a different statistical structure. Photon noise dominates, and the usual assumptions about local smoothness break down. We model this explicitly with a noise-aware feature extractor that learns to disentangle scene structure from sensor artifacts.
Results
Our model achieves 78.4 mIoU on the LL-Cityscapes benchmark at 62 FPS on consumer hardware, with only 12M parameters. That's a 4.2 point improvement over the previous SOTA at one-third the parameter count.