LeTFuser for Autonomous Driving with Multi-Task Learning
P. Agand, et al. (2023). “LeTFuser for Autonomous Driving with Multi-Task Learning.” CVPR Workshop on Autonomous Vehicles.
End-to-end autonomous driving architecture using transformer-based late fusion of camera and LiDAR modalities with multi-task supervision for robust urban navigation.
LeTFuser (Late-fusion Transformer Fuser) is an end-to-end autonomous driving model presented at the CVPR 2023 Workshop on Vision-Centric Autonomous Driving. It extends transformer-based sensor fusion with multi-task learning objectives to improve robustness in urban driving scenarios.
Motivation
Early sensor fusion — merging camera and LiDAR features at the representation level — can lose modality-specific structure. Late fusion preserves this structure by processing each modality with its own encoder before merging. LeTFuser explores whether transformer-based late fusion, combined with multi-task supervision, can close the performance gap with early fusion while being more robust to sensor dropout.
Architecture
Each modality (RGB camera, LiDAR BEV projection) passes through a dedicated transformer encoder. The resulting feature sequences are concatenated and processed by a shared cross-attention decoder that learns to weight each modality based on context.
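The fusion step described above can be illustrated with a minimal sketch. It is not the LeTFuser implementation: it uses plain Python instead of a deep learning framework, a single attention head, and identity query/key/value projections where a real decoder would use learned ones. The function name `late_fusion_attention` and the `camera_ok`/`lidar_ok` flags are hypothetical, introduced only to show how attention over concatenated per-modality tokens yields a context-dependent weighting, and how the weighting shifts when one sensor is missing.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_fusion_attention(query, camera_tokens, lidar_tokens,
                          camera_ok=True, lidar_ok=True):
    """Attend from one decoder query over concatenated modality tokens.

    Returns (fused_vector, attention_mass_per_modality).
    Assumption: tokens are already the outputs of separate encoders.
    """
    # Late fusion: the sequences are joined only after per-modality encoding.
    tagged = []
    if camera_ok:
        tagged += [("camera", t) for t in camera_tokens]
    if lidar_ok:
        tagged += [("lidar", t) for t in lidar_tokens]
    if not tagged:
        raise ValueError("at least one modality must be available")

    # Scaled dot-product attention with identity projections (a simplification).
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, tok) / scale for _, tok in tagged])

    # The fused feature is the attention-weighted sum of all tokens.
    fused = [0.0] * len(query)
    mass = {"camera": 0.0, "lidar": 0.0}
    for weight, (name, tok) in zip(weights, tagged):
        mass[name] += weight
        for i, value in enumerate(tok):
            fused[i] += weight * value
    return fused, mass
```

Calling `late_fusion_attention([2.0, 0.0], [[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0]])` puts most of the attention mass on the camera tokens because they align with the query. Passing `camera_ok=False` moves all of the mass to LiDAR, which is a toy version of the sensor-dropout case.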
Multi-task supervision includes waypoint regression for route following, semantic segmentation for scene understanding, and object detection for collision avoidance. The multi-task objective regularizes the shared representation, reducing overfitting to any single task.
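One common way to combine such objectives is a weighted sum of per-task losses. The sketch below assumes that form; the source does not state the actual loss functions or weights used by LeTFuser. The L1 waypoint loss, the per-pixel cross-entropy, the task names, and the weights in the usage note are all illustrative assumptions, and the detection loss is passed in as a precomputed number for brevity.

```python
import math


def l1_waypoint_loss(pred, target):
    """Mean absolute error over all coordinates of the (x, y) waypoints."""
    diffs = [abs(p - t)
             for pred_wp, target_wp in zip(pred, target)
             for p, t in zip(pred_wp, target_wp)]
    return sum(diffs) / len(diffs)


def segmentation_loss(pixel_probs, pixel_labels):
    """Mean cross-entropy over pixels, given per-pixel class probabilities."""
    losses = [-math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)
              for probs, label in zip(pixel_probs, pixel_labels)]
    return sum(losses) / len(losses)


def multi_task_loss(losses, weights):
    """Weighted sum of per-task losses (the weights are assumed hyperparameters)."""
    return sum(weights[name] * value for name, value in losses.items())
```

For example, with hypothetical weights `{"waypoint": 1.0, "segmentation": 0.5, "detection": 0.5}`, the three task losses collapse into one scalar that is backpropagated through the shared representation. Every task's gradient therefore shapes the shared features, which is the mechanism behind the regularization effect.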
Key Finding
Late fusion with transformer attention is competitive with early fusion approaches in clear-weather scenarios, and more robust under simulated sensor degradation (rain, partial occlusion, sensor noise). The multi-task objective provides meaningful regularization: single-task variants of LeTFuser show higher variance on held-out routes.
The codebase builds on the pagand/e2etransfuser repository, which contains both LeTFuser and DMFuser implementations.