AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
Published in International Conference on Learning Representations (ICLR), 2025
Recommended citation: Duan, J., Pumacay, W., Kumar, N., Wang, Y. R., Tian, S., Yuan, W., Krishna, R., Fox, D., Mandlekar, A., & Guo, Y. (2025). AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation. International Conference on Learning Representations (ICLR). https://arxiv.org/abs/2410.00371
AHA is a vision-language model that detects and reasons over failures in robotic manipulation tasks.
Authors: Jiafei Duan, Wilbert Pumacay, Nishanth Kumar, Yi Ru Wang, Shulin Tian, Wentao Yuan, Ranjay Krishna, Dieter Fox, Ajay Mandlekar, Yijie Guo