RoboPlayground: Democratizing Robotic Evaluation through Structured Physical Domains

Published as an arXiv preprint, 2026

Recommended citation: Wang, Y. R., Ung, C., Gubarev, E., Tan, C., Srinivasa, S., & Fox, D. (2026). RoboPlayground: Democratizing Robotic Evaluation through Structured Physical Domains. arXiv preprint. https://arxiv.org/abs/2604.05226

RoboPlayground is a language-driven evaluation framework that lets users author executable robotic manipulation tasks in natural language, which are then compiled into reproducible task specifications. Evaluation on language-defined task families reveals generalization failures that are not evident in fixed benchmarks, and task diversity scales through crowd contributions.

Authors: Yi Ru Wang, Carter Ung, Evan Gubarev, Christopher Tan, Siddhartha Srinivasa, Dieter Fox

Download paper here

Project Website