Real-Time Operator Takeover for Visuomotor Diffusion Policy Training

Anonymous authors

We present a Real-Time Operator Takeover (RTOT) paradigm that enables operators to seamlessly take control of a live visuomotor diffusion policy, guiding the system back to desirable states or providing targeted corrective demonstrations. Within this framework, the operator can intervene to correct the robot’s motion, after which control is smoothly returned to the policy until further intervention is needed. We evaluate the takeover framework on three tasks spanning rigid, deformable, and granular objects, and show that incorporating targeted takeover demonstrations significantly improves policy performance compared with training on an equivalent number of initial demonstrations alone. Additionally, we provide an in-depth analysis of the Mahalanobis distance as a signal for automatically identifying undesirable or out-of-distribution states during execution. Supporting materials, including videos of the initial and takeover demonstrations and all experiments, are available on the project website.
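To illustrate how the Mahalanobis distance can flag out-of-distribution states, here is a minimal sketch and not the authors' implementation. It assumes each observation is summarized by a fixed-length embedding vector and that a Gaussian is fit to embeddings from the demonstration data. The function names, the synthetic data, and the 99th-percentile threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_gaussian(embeddings, eps=1e-6):
    """Estimate the mean and inverse covariance of (N, D) reference embeddings.

    A small ridge term (eps) keeps the covariance invertible.
    """
    mean = embeddings.mean(axis=0)
    cov = np.atleast_2d(np.cov(embeddings, rowvar=False))
    cov = cov + eps * np.eye(embeddings.shape[1])
    return mean, np.linalg.inv(cov)


def mahalanobis(x, mean, cov_inv):
    """Mahalanobis distance of a single embedding x from the fitted Gaussian."""
    diff = x - mean
    return float(np.sqrt(diff @ cov_inv @ diff))


# Synthetic stand-in for embeddings of demonstration states (assumption).
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 4))
mean, cov_inv = fit_gaussian(train)

# Hypothetical threshold: 99th percentile of distances on the reference data.
train_dists = [mahalanobis(e, mean, cov_inv) for e in train]
threshold = float(np.quantile(train_dists, 0.99))

in_dist = mahalanobis(np.zeros(4), mean, cov_inv)       # near the data mean
out_dist = mahalanobis(np.full(4, 8.0), mean, cov_inv)  # far from the data

flag_for_takeover = out_dist > threshold
```

In a deployment along these lines, a distance above the threshold could prompt the operator to consider a takeover. The right threshold would depend on the embedding and the task.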

Real-Time Operator Takeover

Expert Demonstrations

Example of initial demonstrations and takeover demonstrations on the Rice Scooping task.

Initial rice scooping demos

Takeover rice scooping demos #1

Takeover rice scooping demos #2

Experiments

We evaluated policies trained on initial demonstrations only and policies trained with additional takeover demonstrations.

Rice Scooping Task

Pen-in-box Task

Trousers Folding Task

All experiments:

Contact

  • Anonymized during submission