Adapt Your Body: Mitigating Proprioception Shifts in Imitation Learning

¹Tsinghua University, ²Shanghai Qi Zhi Institute, ³Shanghai Artificial Intelligence Laboratory


Left: Including proprioceptive observations (BC-Full) in imitation learning can cause a severe performance drop compared with RGB-only observations (BC-RGB). Our method, NADA, achieves better performance while still using proprioceptive observations. Right: Measuring the distributional shift over time shows that including proprioceptive observations leads to a larger shift in proprioception, which NADA mitigates.

Overview Video


Real-Robot Results

Pick and Place

Open Locker

Abstract

Imitation learning models for robotic tasks typically rely on multi-modal inputs, such as RGB images, language, and proprioceptive states. While proprioception is intuitively important for decision-making and obstacle avoidance, simply incorporating all proprioceptive states leads to a surprising degradation in imitation learning performance. In this work, we identify the underlying issue as the proprioception shift problem, where the distributions of proprioceptive states diverge significantly between training and deployment.

To address this challenge, we propose a domain adaptation framework that bridges the gap by utilizing rollout data collected during deployment. Using Wasserstein distance, we quantify the discrepancy between expert and rollout proprioceptive states and minimize this gap by adding noise to both sets of states, proportional to the Wasserstein distance. This strategy enhances robustness against proprioception shifts by aligning the training and deployment distributions.

Experiments on robotic manipulation tasks demonstrate the efficacy of our method, enabling the imitation policy to leverage proprioception while mitigating its adverse effects. Our approach outperforms both the naive solution of discarding proprioception and other baselines designed to address distributional shifts.
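
The noise-injection idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it treats each proprioceptive dimension independently, approximates the 1-D Wasserstein-1 distance by comparing empirical quantiles, uses zero-mean Gaussian noise, and introduces a hypothetical proportionality constant `alpha`. All function names are illustrative.

```python
import random


def wasserstein_1d(xs, ys, n_quantiles=200):
    """Approximate the 1-D Wasserstein-1 distance between two samples
    by averaging the absolute gap between their empirical quantiles."""
    xs, ys = sorted(xs), sorted(ys)

    def quantile(vals, q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(vals) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(vals) - 1)
        frac = pos - lo
        return vals[lo] * (1.0 - frac) + vals[hi] * frac

    qs = [(i + 0.5) / n_quantiles for i in range(n_quantiles)]
    return sum(abs(quantile(xs, q) - quantile(ys, q)) for q in qs) / n_quantiles


def noise_scales(expert, rollout, alpha=1.0):
    """Per-dimension noise std, proportional to the per-dimension distance.
    `alpha` is a hypothetical scaling constant (assumption)."""
    dims = len(expert[0])
    return [
        alpha * wasserstein_1d([s[d] for s in expert], [s[d] for s in rollout])
        for d in range(dims)
    ]


def add_noise(states, sigmas, rng):
    """Add zero-mean Gaussian noise (assumed noise family) to every state."""
    return [[x + rng.gauss(0.0, s) for x, s in zip(state, sigmas)] for state in states]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 2-D proprioceptive states: only the first dimension is shifted.
    expert = [[rng.gauss(0.0, 1.0), 0.5] for _ in range(500)]
    rollout = [[rng.gauss(0.8, 1.0), 0.5] for _ in range(500)]
    sigmas = noise_scales(expert, rollout, alpha=1.0)
    noisy_expert = add_noise(expert, sigmas, rng)
    noisy_rollout = add_noise(rollout, sigmas, rng)
    print("per-dimension noise std:", sigmas)
```

In this sketch, dimensions whose expert and rollout distributions already agree receive little or no noise, while strongly shifted dimensions are blurred more heavily, which is one way to read "noise proportional to the Wasserstein distance".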

Noise-Augmented Distribution Alignment


Strategies for incorporating proprioceptive observations (\(o_p\)) in robotic imitation learning. (a) NADA (Ours): A two-pass approach where proprioception shift (\(W_T\)) between initial training/rollout data guides optimized noise (\(\sigma^*\)) injection into the training set for robust policy learning. Compared Baselines: (b) Full Observations: Using all sensory inputs directly. (c) Pure RGB: Discarding proprioception entirely. (d) Dropout Proprioception: Randomly zeroing out proprioceptive inputs during training. Our method provides a systematic framework that not only mitigates the proprioception shift but also preserves valuable information in proprioceptive observations.
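
For comparison, the Dropout Proprioception baseline in (d) might look like the sketch below. The drop probability and the choice to zero the whole proprioceptive vector at once (rather than individual entries) are assumptions made for illustration; the page does not specify either.

```python
import random


def dropout_proprio(o_p, drop_prob, rng):
    """With probability `drop_prob`, replace the whole proprioceptive
    vector with zeros; otherwise return an unchanged copy.
    Whole-vector dropout is an assumption, not a confirmed detail."""
    if rng.random() < drop_prob:
        return [0.0] * len(o_p)
    return list(o_p)


if __name__ == "__main__":
    rng = random.Random(0)
    o_p = [0.31, -1.20, 0.05]  # hypothetical joint readings
    for _ in range(3):
        print(dropout_proprio(o_p, drop_prob=0.5, rng=rng))
```

Unlike noise injection scaled by a measured shift, this baseline removes proprioceptive information at a fixed rate regardless of how much each dimension actually shifts between training and deployment.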