To use assistive robots in everyday life, a remote control system that runs on common devices, such as 2D devices, helps users control the robots as intended, anytime and anywhere. Hand-drawn sketches are one intuitive way to control robots with 2D devices. However, since similar sketches carry different intentions from scene to scene, existing work needs additional modalities to specify the sketches' semantics. This demands complex operations from users and reduces usability. In this paper, we propose Sketch-MoMa, a teleoperation system that uses user-given hand-drawn sketches as instructions to control a robot. We use Vision-Language Models (VLMs) to understand the user-given sketches superimposed on an observation image and to infer the drawn shapes and the robot's low-level tasks. We then use the sketches and the generated shapes for recognition and motion planning of the generated low-level tasks, enabling precise and intuitive operation. We validate our approach using state-of-the-art VLMs with 7 tasks and 5 sketch shapes. We also demonstrate that our approach effectively specifies detailed motions, such as how to grasp and how much to rotate. Moreover, we show that the usability of our approach is competitive with the existing 2D interface through a user experiment with 14 participants.
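As a rough illustration of the interpretation step described above, the following Python sketch superimposes sketch strokes on an observation image and validates a VLM reply that names a shape and a low-level task. It is not the authors' implementation: the function names, the JSON reply format, and the example labels `"line"` and `"move"` are all assumptions made for illustration, and the call to an actual VLM is omitted.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchInterpretation:
    """Hypothetical container for what the VLM infers from a sketch."""
    shape: str  # inferred sketch shape (label names here are placeholders)
    task: str   # inferred low-level robot task (placeholder label)


def superimpose(image, strokes, value=255):
    """Return a copy of `image` with sketch strokes drawn on top.

    `image` is a list of rows of pixel intensities; each stroke is a list
    of (x, y) points joined by straight segments.
    """
    out = [row[:] for row in image]  # copy so the observation stays intact
    height, width = len(out), len(out[0])
    for stroke in strokes:
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for i in range(steps + 1):
                x = round(x0 + (x1 - x0) * i / steps)
                y = round(y0 + (y1 - y0) * i / steps)
                if 0 <= x < width and 0 <= y < height:
                    out[y][x] = value
    return out


def parse_vlm_reply(reply, allowed_shapes, allowed_tasks):
    """Parse an assumed JSON reply such as {"shape": ..., "task": ...}."""
    data = json.loads(reply)
    shape, task = data["shape"], data["task"]
    if shape not in allowed_shapes or task not in allowed_tasks:
        raise ValueError(f"unexpected VLM output: {data!r}")
    return SketchInterpretation(shape=shape, task=task)


if __name__ == "__main__":
    observation = [[0] * 4 for _ in range(3)]
    overlay = superimpose(observation, [[(0, 1), (3, 1)]])
    # In a real system the overlay image would be sent to a VLM; the reply
    # below is a hand-written stand-in for that response.
    reply = '{"shape": "line", "task": "move"}'
    print(parse_vlm_reply(reply, {"line"}, {"move"}))
```

Checking the reply against a fixed set of shapes and tasks is one simple way to reject malformed model output before it reaches recognition and motion planning.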
Easy and intuitive teleoperation with simple and task-intended sketches.
We validated our proposal in a real-world setting.
We also validated tasks that involve user preferences, such as how to grasp and how much to rotate.
Our approach achieves high usability on some metrics.
@article{frc-sketch-moma,
title={Sketch-MoMa: Teleoperation for Mobile Manipulator via Interpretation of Hand-Drawn Sketches},
author={Kosei Tanada and Yuka Iwanaga and Masayoshi Tsuchinaga and Yuji Nakamura and Takemitsu Mori and Remi Sakai and Takashi Yamamoto},
year={2024},
journal={arXiv},
}
This project page was developed and published solely as part of the publication titled "Sketch-MoMa: Teleoperation for Mobile Manipulator via Interpretation of Hand-Drawn Sketches", for its visualization. We do not guarantee future maintenance or monitoring of this page.
Contents may be updated or deleted without notice following updates to the original manuscript or policy changes.
This webpage template was adapted from DiffusionNOCS and SG-Init. We thank Takuya Ikeda and Takayuki Kanai for additional support and for making their source available.