Sketch-MoMa: Teleoperation for Mobile Manipulator
via Interpretation of Hand-Drawn Sketches

1 Frontier Research Center, Toyota Motor Corporation, 2 Aichi Institute of Technology

(Posted in December 2024)

We propose Sketch-MoMa, an easy and intuitive teleoperation system for mobile manipulators that interprets hand-drawn sketches by combining VLM-based task planning with motion planning.

Abstract


To use assistive robots in everyday life, a remote control system built on common devices, such as 2D devices, helps users control the robots anytime and anywhere as intended. Hand-drawn sketches are one intuitive way to control robots with 2D devices. However, since similar sketches carry different intentions from scene to scene, existing work needs additional modalities to set the sketches' semantics. This requires complex operations from users and reduces usability. In this paper, we propose Sketch-MoMa, a teleoperation system that uses user-given hand-drawn sketches as instructions to control a robot. We use Vision-Language Models (VLMs) to understand the user-given sketches superimposed on an observation image and to infer the drawn shapes and the robot's low-level tasks. We use the sketches and the generated shapes for recognition and motion planning of the generated low-level tasks, enabling precise and intuitive operation. We validate our approach using state-of-the-art VLMs with 7 tasks and 5 sketch shapes. We also demonstrate that our approach effectively specifies detailed motions, such as how to grasp and how much to rotate. Moreover, we show the competitive usability of our approach compared with the existing 2D interface through a user experiment with 14 participants.

Motivation

What are the challenges of teleoperating robots with hand-drawn sketches?



Method


Easy and intuitive teleoperation with simple and task-intended sketches.


Task Reliability


We validated our proposal in a real-world setting.

Task           Pick   Place   Move   Pull   Push   Pick & Place
Task Success   7/10   7/10    9/10   4/10   0/10   7/10
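
As a reading aid, the per-task counts above can be converted into success rates. The short Python sketch below is illustrative only: the counts are copied from the table, while the pooled rate across all tasks is simple arithmetic on those counts and is not a figure reported on this page.

```python
from fractions import Fraction

# (successes, trials) per task, copied from the table above.
results = {
    "Pick": (7, 10),
    "Place": (7, 10),
    "Move": (9, 10),
    "Pull": (4, 10),
    "Push": (0, 10),
    "Pick & Place": (7, 10),
}

# Per-task success rate as an exact fraction.
rates = {task: Fraction(s, n) for task, (s, n) in results.items()}

# Pooled rate over all trials (derived here, not reported by the authors).
total_successes = sum(s for s, _ in results.values())
total_trials = sum(n for _, n in results.values())
overall = Fraction(total_successes, total_trials)

for task, rate in rates.items():
    print(f"{task:<14}{float(rate):.0%}")
print(f"{'All tasks':<14}{float(overall):.1%}")
```

Pooling treats every trial equally, so tasks with low counts such as Pull and Push pull the aggregate well below the typical per-task rate.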

We also validated tasks with preferences.

Task reliability with preferences

User Experiment


Our approach achieves high usability on some metrics.


Additional Website


Citation


@article{frc-sketch-moma,
    title={Sketch-MoMa: Teleoperation for Mobile Manipulator via Interpretation of Hand-Drawn Sketches}, 
    author={Kosei Tanada and Yuka Iwanaga and Masayoshi Tsuchinaga and Yuji Nakamura and Takemitsu Mori and Remi Sakai and Takashi Yamamoto},
    year={2024},
    journal={arXiv},
}

Notification


This project page was developed and published solely as part of the publication titled "Sketch-MoMa: Teleoperation for Mobile Manipulator via Interpretation of Hand-Drawn Sketches" to visualize its content. We do not guarantee future maintenance or monitoring of this page.

Content may be updated or deleted without notice following updates to the original manuscript or changes in policy.

This webpage template was adapted from DiffusionNOCS and SG-Init. We thank Takuya Ikeda and Takayuki Kanai for their additional support and for making their source available.