Publication

DMODE: Differential Monocular Object Distance Estimation Without Class-Specific Information

13th International Workshop on Robot Motion and Control (RoMoCo) 2024

P. Agand, M. Chang, M. Chen (2024). “DMODE: Differential Monocular Object Distance Estimation Without Class-Specific Information.” 13th International Workshop on Robot Motion and Control (RoMoCo).

IEEE ↗ | GitHub ↗

computer-vision, robotics, monocular-depth, object-detection

A class-agnostic monocular distance estimation module that enables robots to infer object depth from a single camera without relying on object class labels.

Accurate distance estimation from a single monocular camera is a fundamental challenge in mobile robotics - particularly when objects of interest are novel or lack class-specific training data. Most existing methods rely on class priors (known object dimensions per category) to resolve the scale ambiguity inherent to monocular depth estimation.

DMODE (Differential Monocular Object Distance Estimation) addresses this by estimating object distance without class-specific information, using differential analysis of depth map features instead.

Approach

Rather than learning a mapping from object class to expected size, DMODE exploits the relationship between a detected bounding box and the corresponding region in a monocular depth map. By analyzing how depth values vary within and around the bounding box - rather than their absolute values - the module produces a distance estimate that generalizes across object categories.
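To make the idea of comparing depth inside a bounding box with depth around it concrete, here is a minimal sketch. It is not the published implementation: the function name differential_depth_feature, the margin parameter, the use of medians, and the toy depth map are all assumptions chosen for illustration.

```python
from statistics import median


def differential_depth_feature(depth, box, margin=2):
    """Compare depth inside a bounding box with depth in a ring around it.

    depth  : 2D list of relative depth values (rows of floats)
    box    : (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive
    margin : ring width in pixels (hypothetical parameter)
    Returns (inside_median, ring_median, ring_median - inside_median).
    """
    height, width = len(depth), len(depth[0])
    x0, y0, x1, y1 = box
    # Enlarge the box by `margin` and clamp it to the image bounds.
    ox0, oy0 = max(x0 - margin, 0), max(y0 - margin, 0)
    ox1, oy1 = min(x1 + margin, width), min(y1 + margin, height)

    inside, ring = [], []
    for y in range(oy0, oy1):
        for x in range(ox0, ox1):
            if x0 <= x < x1 and y0 <= y < y1:
                inside.append(depth[y][x])
            else:
                ring.append(depth[y][x])
    if not inside or not ring:
        raise ValueError("box must be non-empty and leave room for a ring")

    inside_med = median(inside)
    ring_med = median(ring)
    return inside_med, ring_med, ring_med - inside_med


# Toy 6x6 depth map (assumed data): background at 10.0, a 2x2 object at 4.0.
depth_map = [[10.0] * 6 for _ in range(6)]
for y in range(2, 4):
    for x in range(2, 4):
        depth_map[y][x] = 4.0

inside_med, ring_med, diff = differential_depth_feature(
    depth_map, (2, 2, 4, 4), margin=1
)
print(inside_med, ring_med, diff)
```

The sketch stops at a hand-crafted feature. How such features are turned into a metric distance is not described on this page, so that step is left out.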

This differential approach decouples distance estimation from the object taxonomy: the module works equally well on objects seen during training and objects never encountered before.

Key Properties

Class-agnostic - no object category labels required at inference time
Single-camera - works with standard monocular RGB input; no stereo or depth sensor required
Lightweight - designed as a plug-in module compatible with existing detection pipelines
Robust to scale ambiguity - differential analysis reduces sensitivity to absolute depth map calibration errors
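The last property can be illustrated with a small numeric check. This is a sketch under stated assumptions, not the paper's analysis: the sample values and the two statistics below (a difference and a ratio of medians) are hypothetical. It shows a general fact: a difference between two depth readings is unchanged by a constant offset in the depth map, while a ratio is unchanged by a constant scale factor.

```python
from math import isclose
from statistics import median

# Hypothetical relative-depth samples inside a box and in the area around it.
inside = [4.0, 4.2, 3.9, 4.1]
around = [10.0, 9.8, 10.1, 10.3]


def depth_difference(a, b):
    """Median of b minus median of a."""
    return median(b) - median(a)


def depth_ratio(a, b):
    """Median of b divided by median of a."""
    return median(b) / median(a)


# Simulate two kinds of depth-map miscalibration.
offset = 1.5  # constant additive bias
scale = 2.0   # constant multiplicative error
inside_off = [d + offset for d in inside]
around_off = [d + offset for d in around]
inside_scaled = [d * scale for d in inside]
around_scaled = [d * scale for d in around]

# The difference ignores the offset; the ratio ignores the scale.
diff_survives_offset = isclose(
    depth_difference(inside, around), depth_difference(inside_off, around_off)
)
ratio_survives_scale = isclose(
    depth_ratio(inside, around), depth_ratio(inside_scaled, around_scaled)
)
# Neither statistic is immune to the other kind of error.
diff_survives_scale = isclose(
    depth_difference(inside, around), depth_difference(inside_scaled, around_scaled)
)
ratio_survives_offset = isclose(
    depth_ratio(inside, around), depth_ratio(inside_off, around_off)
)

print(diff_survives_offset, ratio_survives_scale)
print(diff_survives_scale, ratio_survives_offset)
```

Which statistic is appropriate depends on how a given depth estimator's errors behave, and that is not specified here.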

Application Context

This work targets mobile robot navigation in unstructured environments where the set of relevant objects is open-ended. A delivery robot, inspection drone, or assistive device cannot anticipate every object it will encounter - and should not be constrained by a fixed object taxonomy.

The module was evaluated on standard robotics benchmarks and presented at RoMoCo 2024.
