Robust Thermal Image Object Detection via Appearance-Guided Mixture of Experts
Published in WACVW, 2026
Thermal object detection must remain reliable as object and background appearance drifts across time of day, weather, and season. We tackle this challenge with an appearance-guided Mixture of Experts (MoE) that learns to route each image to a subset of specialized backbones. A self-supervised appearance encoder produces embeddings that drive a lightweight router; experts are pretrained on clusters of these embeddings to encourage specialization, and all experts share a single detection head to avoid the linear growth in parameters typical of ensembles. At inference, we adopt a tuning-free, compute-aware policy that activates the fewest experts whose cumulative routing probability exceeds a fixed threshold. Training is stabilized with complementary batch- and sample-level load-balancing losses that prevent expert collapse and promote diverse routing. On LTDv2 (natural long-term drift) and FLIR ADAS (simulated drift), our MoE achieves the highest peak accuracy and superior month-to-month ranking consistency, demonstrating that appearance-guided routing provides more reliable performance across diverse thermal conditions than monolithic scaling. The result is a practical and scalable detector that remains accurate under distribution shift and adapts its compute at test time.
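The inference policy described in the abstract (activate the fewest experts whose cumulative routing probability exceeds a fixed threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_experts`, the default threshold of 0.7, and the assumption that the router outputs a normalized probability vector are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def select_experts(probs: Sequence[float], threshold: float = 0.7) -> List[int]:
    """Return indices of the fewest experts whose cumulative routing
    probability exceeds `threshold` (hypothetical helper, illustrative only).

    `probs` is assumed to be the router's per-expert probabilities for one image.
    """
    if not probs:
        raise ValueError("probs must contain at least one expert probability")

    # Rank experts from most to least probable; ties keep their original order.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    chosen: List[int] = []
    cumulative = 0.0
    for idx in order:
        chosen.append(idx)
        cumulative += probs[idx]
        # Stop as soon as the selected experts' total mass exceeds the threshold.
        if cumulative > threshold:
            break

    # If the threshold is never exceeded, every expert ends up selected.
    return chosen


if __name__ == "__main__":
    # A confident router activates one expert; an uncertain one activates more.
    print(select_experts([0.10, 0.85, 0.05]))
    print(select_experts([0.50, 0.30, 0.20]))
```

Under this sketch, compute adapts per image: a sharply peaked routing distribution triggers a single backbone, while a flat distribution triggers several. How the selected experts' features are combined before the shared detection head is not specified in the abstract and is left out here.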
Recommended citation: Andreas Aakerberg, Kamal Nasrollahi, Thomas B. Moeslund (2026). "Robust Thermal Image Object Detection via Appearance-Guided Mixture of Experts", The IEEE/CVF Winter Conference on Applications of Computer Vision 2026: Real World Surveillance: Applications and Challenges, 6th
