Recent advancements in robotics have underscored the need for effective collaboration between humans and robots. Traditional interfaces often struggle to balance robot autonomy with human oversight, limiting their practical application in complex tasks like mobile manipulation. This study aims to develop an intuitive interface that enables a mobile manipulator to autonomously interpret user-provided sketches, enhancing user experience while minimizing burden. We implemented a web-based application utilizing machine learning algorithms to process sketches, making the interface accessible on mobile devices for use anytime, anywhere, by anyone. In the first validation, we examined natural sketches drawn by users for 27 selected manipulation and navigation tasks, gaining insights into trends related to sketch instructions. The second validation involved comparative experiments with five grasping tasks, showing that the sketch interface reduces workload and enhances intuitiveness compared to conventional axis control interfaces. These findings suggest that the proposed sketch interface improves the efficiency of mobile manipulators and opens new avenues for integrating intuitive human-robot collaboration in various applications.
Our ultimate goal is to create an interface that allows anyone, anywhere, at any time to operate a mobile manipulator without the need for special devices. We aim to facilitate intuitive communication of movement and grasping intentions, such as where to grasp an object or from which direction to grasp it.
Our solution is a sketch-based interface. We aim to develop an interface that effectively conveys the user's intentions for movement and grasping through simple sketches. We conducted two experiments: the first assessed users' tendencies when giving sketch instructions, while the second was a user study comparing our interface with a traditional click-based interface across five grasping tasks.
We defined 27 mobile manipulation tasks for our experiments, listed in the table below. The 18 manipulation tasks were derived from grasp taxonomy research and reinterpreted for a single-arm mobile manipulator with a gripper. The remaining nine tasks covered movement instructions, view control, and complex sequenced tasks.
No. | Annotation (Task) | Taxonomy No. | Expt.1 | Expt.2 |
---|---|---|---|---|
1 | Squeeze an empty soda can | 1 | ✓ | ✓ |
2 | Hold the bags | 2 | ✓ | ✓ |
3 | Hold a bunch of flowers | 3 | ✓ | ✓ |
4 | Hold a disk | 12 | ✓ | ✓ |
5 | Plug in a plug | 13 | ✓ | ✓ |
6 | Hold a light bag | 15 | ✓ | × |
7 | Stick a plate into a dishwasher | 18 | ✓ | × |
8 | Hold a bowl | 19 | ✓ | × |
9 | Press the dish washer bottle | 54 | ✓ | × |
10 | Hold an open book | 71 | ✓ | × |
11 | Put in/take out the forks from a dishwasher | 2 | ✓ | × |
12 | Lift a pan | 2 | ✓ | × |
13 | Hold a cell phone | 12 | ✓ | × |
14 | Cut with scissors | 19 | ✓ | × |
15 | Press the drying machine button | 55 | ✓ | × |
16 | Press a screen button | 56 | ✓ | × |
17 | Lift up the switch | 65 | ✓ | × |
18 | Pick up card | 69 | ✓ | × |
19 | Move near an orange drawer | - | ✓ | × |
20 | Move and face a gray tray | - | ✓ | × |
21 | Look down to the right | - | ✓ | × |
22 | Place a snack on a shelf | - | ✓ | × |
23 | Place a snack on a white table | - | ✓ | × |
24 | Put a yellow ball into a blue box | - | ✓ | × |
25 | Pull a drawer approximately 10 cm | - | ✓ | × |
26 | Hand a snack to a person | - | ✓ | × |
27 | Stack cups in this order from top to bottom: blue, yellow, red | - | ✓ | × |
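To reuse the task set in both experiments, it can be encoded as simple structured data. The snippet below is an illustrative sketch of our own, not the authors' implementation; the field names (`taxonomy_no`, `in_expt1`, `in_expt2`) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    """One entry of the 27-task set (illustrative field names)."""
    no: int                     # task number (1-27)
    annotation: str             # natural-language task description
    taxonomy_no: Optional[int]  # grasp taxonomy number; None for navigation/other tasks
    in_expt1: bool              # used in Experiment 1
    in_expt2: bool              # used in Experiment 2

# A few rows of the table above, encoded as data
TASKS = [
    Task(1, "Squeeze an empty soda can", 1, True, True),
    Task(2, "Hold the bags", 2, True, True),
    Task(19, "Move near an orange drawer", None, True, False),
]

# Example: select the grasping tasks reused in Experiment 2
expt2_tasks = [t for t in TASKS if t.in_expt2]
```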
Investigate how individuals instruct a robot to perform the 27 tasks listed above, covering both manipulation and navigation. This experiment was approved by our company's research ethics review board.
Participants interpreted the task using the instruction text and the figure on the left, and provided sketch instructions within the figure on the right, under a time constraint.
For grasping tasks, 71% of users drew a C-shaped symbol resembling a gripper.
For movement tasks, 86% of sketches used lines or arrows, effectively representing paths.
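These trends suggest that the two dominant sketch types can already be separated by simple geometric features. The following is a minimal heuristic of our own, not the analysis used in the paper: an open, curved stroke whose endpoints stay close together relative to its length is treated as a C-shaped gripper symbol, while an elongated stroke is treated as a path-like line or arrow; the 0.5 threshold is an arbitrary assumption.

```python
import math

def classify_stroke(points):
    """Rough heuristic: distinguish a C-shaped gripper symbol from a
    path-like line/arrow based on stroke geometry.
    `points` is a list of (x, y) tuples sampled along the stroke."""
    # Total arc length of the stroke
    arc_len = sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))
    # Straight-line distance between the first and last sample
    end_dist = math.dist(points[0], points[-1])
    if arc_len == 0:
        return "unknown"
    # A C-shape curls back on itself, so its endpoints are close
    # compared with the drawn length; a line/arrow is nearly straight.
    closure_ratio = end_dist / arc_len
    return "c_shape" if closure_ratio < 0.5 else "line_or_arrow"

# Example usage with two toy strokes
c_like = [(0, 0), (10, -10), (20, 0), (10, 10), (2, 4)]
line_like = [(0, 0), (10, 1), (20, 2), (30, 3)]
print(classify_stroke(c_like))     # -> "c_shape"
print(classify_stroke(line_like))  # -> "line_or_arrow"
```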
Compare our sketch-based interface with a click-based interface across the five grasping tasks (tasks 1–5 in the table above). This experiment was approved by our company's research ethics review board.
Baseline (click-based UI): (1) View mode, (2) Navigation mode, (3) Arm mode, and (4) Gripper mode.
Users switch between control modes according to the target axis and issue velocity commands along each axis. This interface was inspired by prior research on conventional axis-control teleoperation.
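As an illustration of this baseline, the sketch below shows how mode switching and per-axis velocity commands could be wired up. The mode names and the `send_velocity` callback are our own assumptions, not the actual implementation.

```python
from enum import Enum

class Mode(Enum):
    VIEW = "view"          # pan/tilt the camera
    NAVIGATION = "nav"     # base translation/rotation
    ARM = "arm"            # end-effector axes
    GRIPPER = "gripper"    # open/close

class AxisTeleop:
    """Minimal model of the click-based baseline: the user first selects
    a mode, then issues a velocity command along one axis of that mode."""

    def __init__(self, send_velocity):
        # send_velocity(mode, axis, value) delivers the command to the robot
        # (e.g., a velocity publisher in a real system); assumed interface.
        self.send_velocity = send_velocity
        self.mode = Mode.VIEW

    def switch_mode(self, mode: Mode):
        self.mode = mode

    def on_button(self, axis: str, direction: int):
        """Clicking a +/- button sends a fixed-magnitude velocity
        on one axis of the currently selected mode."""
        speed = 0.1 * direction  # fixed step; sign gives the direction
        self.send_velocity(self.mode, axis, speed)

# Example: move the base forward, then close the gripper
teleop = AxisTeleop(lambda m, a, v: print(m.value, a, v))
teleop.switch_mode(Mode.NAVIGATION)
teleop.on_button("x", +1)
teleop.switch_mode(Mode.GRIPPER)
teleop.on_button("close", -1)
```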
Proposed (sketch-based UI): (1) View mode, (2) Navigation mode, (3)–(6) Manipulation mode.
In View mode, the screen can be scrolled in any direction to change the robot's viewpoint.
In Navigation mode, the robot moves along the line sketched on the ground in the camera view.
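One plausible way to realize this, shown below as an illustrative sketch rather than the paper's actual implementation, is to project the sketched image points onto the ground plane using the camera pose and send the result as a waypoint path.

```python
import numpy as np

def sketch_to_ground_waypoints(pixels, K, T_cam_to_base, ground_z=0.0):
    """Project sketched image points onto the ground plane (z = ground_z
    in the robot base frame) by intersecting camera rays with that plane.

    pixels        : list of (u, v) points along the sketched line
    K             : 3x3 camera intrinsic matrix
    T_cam_to_base : 4x4 pose of the camera in the robot base frame
    """
    K_inv = np.linalg.inv(K)
    R = T_cam_to_base[:3, :3]
    cam_origin = T_cam_to_base[:3, 3]
    waypoints = []
    for u, v in pixels:
        ray_cam = K_inv @ np.array([u, v, 1.0])   # ray in camera frame
        ray_base = R @ ray_cam                    # ray in base frame
        if abs(ray_base[2]) < 1e-6:
            continue                              # ray parallel to ground
        t = (ground_z - cam_origin[2]) / ray_base[2]
        if t <= 0:
            continue                              # intersection behind camera
        p = cam_origin + t * ray_base
        waypoints.append((p[0], p[1]))            # (x, y) waypoints on the ground
    return waypoints

# Example: camera 1 m above the ground, looking straight down
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
T = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, -1.0, 1.0],
              [0.0, 0.0, 0.0, 1.0]])
print(sketch_to_ground_waypoints([(320, 240), (320, 300)], K, T))
```

The waypoints are expressed in the robot base frame, so they could be handed directly to a navigation stack as a path to follow.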
In Manipulation mode, the user first clicks on the target object (3). Then, on the enlarged screen where the segmentation result is highlighted (4), the user sketches how to grasp the object using the C-shaped symbol (5). Finally, the user confirms the inferred grasping position and orientation in the 3D viewer, makes adjustments as necessary (6), and executes the robot's action commands.
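To illustrate step (5), the snippet below gives one simple geometric reading of the C-shaped symbol; it is our own sketch under stated assumptions, not the paper's inference method. A real system would lift this 2D result to a 3D grasp pose using the depth image and the segmentation mask, consistent with the 3D-viewer confirmation step described above.

```python
import numpy as np

def grasp_from_c_sketch(stroke):
    """Illustrative: derive a 2D grasp from a C-shaped sketch.
    `stroke` is a list of (x, y) image points along the drawn C.

    Returns:
      position    - centroid of the stroke (where to grasp)
      finger_axis - unit vector between the stroke endpoints
                    (direction in which the fingers close)
      approach    - unit vector from the centroid toward the opening of
                    the C, taken here as the gripper's forward
                    (palm-to-fingertip) direction in the image plane
    """
    pts = np.asarray(stroke, dtype=float)
    position = pts.mean(axis=0)

    finger_axis = pts[-1] - pts[0]
    finger_axis /= np.linalg.norm(finger_axis)

    gap_mid = (pts[0] + pts[-1]) / 2.0   # midpoint of the C's opening
    approach = gap_mid - position
    approach /= np.linalg.norm(approach)
    return position, finger_axis, approach

# Example: a C that opens toward +x (the right side of the image)
c_stroke = [(10, 0), (0, 5), (-3, 10), (0, 15), (10, 20)]
pos, fingers, approach = grasp_from_c_sketch(c_stroke)
print(pos, fingers, approach)   # approach is approximately (1, 0)
```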
Average results of NASA-TLX
Results of the questionnaire
These questions were rated on a 5-point Likert scale (-2: Strongly disagree, -1: Disagree, 0: Undecided, 1: Agree, 2: Strongly agree)
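For reference, the snippet below shows how such scores are typically aggregated: a raw (unweighted) NASA-TLX average over the six subscales and the Likert mapping above. We assume the unweighted raw-TLX variant here, which may differ from the paper's exact procedure.

```python
# Six NASA-TLX subscales, each rated 0-100 by a participant
TLX_SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def raw_tlx(ratings: dict) -> float:
    """Unweighted (raw) NASA-TLX: mean of the six subscale ratings."""
    return sum(ratings[s] for s in TLX_SUBSCALES) / len(TLX_SUBSCALES)

# 5-point Likert mapping used in the questionnaire
LIKERT = {"Strongly disagree": -2, "Disagree": -1, "Undecided": 0, "Agree": 1, "Strongly agree": 2}

# Example: one participant's workload ratings and one questionnaire answer
example = {"mental": 30, "physical": 20, "temporal": 25, "performance": 15, "effort": 35, "frustration": 10}
print(raw_tlx(example))           # -> 22.5
print(LIKERT["Strongly agree"])   # -> 2
```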
@inproceedings{10.5555/3721488.3721516,
  author    = {Iwanaga, Yuka and Tsuchinaga, Masayoshi and Tanada, Kosei and Nakamura, Yuji and Mori, Takemitsu and Yamamoto, Takashi},
  title     = {Sketch Interface for Teleoperation of Mobile Manipulator to Enable Intuitive and Intended Operation: A Proof of Concept},
  year      = {2025},
  booktitle = {Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction},
  pages     = {193--202},
  location  = {Melbourne, Australia},
  series    = {HRI '25}
}
This project page was developed and published solely to accompany the publication "Sketch Interface for Teleoperation of Mobile Manipulator to Enable Intuitive and Intended Operation: A Proof of Concept" for visualization purposes. We do not guarantee future maintenance or monitoring of this page.
Contents may be updated or deleted without notice following updates to the original manuscript or policy changes.
This webpage template was adapted from DiffusionNOCS; we thank Takuya Ikeda for additional support and for making the source available.