Microsoft has released an experimental framework that lets artificial intelligence models such as ChatGPT interact with the real world. The framework uses ChatGPT’s language capabilities to control robots and drones.
The work is described in a research paper titled “ChatGPT for Robotics: Design Principles and Model Abilities,” written by Sai Vemprala, Rogerio Bonatti, Arthur Bucker, and Ashish Kapoor of Microsoft’s Autonomous Systems and Robotics Group.
The paper describes a set of design principles that can be used to help language models solve robotics tasks. These include special prompting structures, high-level APIs, and human feedback via text.
Normally, an engineer writes special code that controls robot movements, then reviews the results and makes any necessary adjustments until the task is completed successfully. With ChatGPT, the code is instead generated from natural language commands. According to Microsoft, this unlocks a new robotics paradigm by allowing a (potentially non-technical) user to sit in the loop, providing high-level feedback to the large language model (LLM) while monitoring the robot’s performance.
Microsoft claims that ChatGPT can generate code for robotics scenarios when its set of design principles is followed. According to the researchers, the approach uses the LLM’s knowledge to control different robot form factors for a variety of tasks without any fine-tuning.
According to Microsoft, the process works in stages. First, the researchers define a set of high-level robot APIs or a function library. Next, they write a text prompt for ChatGPT that describes the task goal and explicitly states which high-level library functions are available. The code ChatGPT produces is then evaluated, and finally it is deployed.
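To make that workflow concrete, here is a minimal Python sketch of the stages described above. It is an illustration under stated assumptions, not Microsoft’s implementation: the `SimulatedDrone` class, its `takeoff`, `fly_to`, and `land` functions, and the helper names are all hypothetical, and the "generated" code is a hard-coded string standing in for a real model response. The evaluation step here is only a simple allow-list check, whereas the paper describes review by a human or in simulation.

```python
import ast


class SimulatedDrone:
    """Hypothetical stand-in for a high-level robot function library."""

    def __init__(self):
        self.position = [0.0, 0.0, 0.0]
        self.log = []

    def takeoff(self, altitude):
        """Climb straight up to the given altitude in meters."""
        self.position[2] = float(altitude)
        self.log.append(("takeoff", altitude))

    def fly_to(self, x, y, z):
        """Fly to the point (x, y, z) in meters."""
        self.position = [float(x), float(y), float(z)]
        self.log.append(("fly_to", x, y, z))

    def land(self):
        """Descend to the ground at the current location."""
        self.position[2] = 0.0
        self.log.append(("land",))


# Step 1: the function library exposed to the language model (assumed names).
ALLOWED = ("takeoff", "fly_to", "land")


def build_prompt(task, robot):
    """Step 2: describe the task and list the available library functions."""
    lines = ["You control a drone. Use only these functions:"]
    for name in ALLOWED:
        lines.append(f"- {name}: {getattr(robot, name).__doc__}")
    lines.append(f"Task: {task}")
    return "\n".join(lines)


def check_generated_code(source):
    """Step 3: list problems found in the generated code (empty if none).

    This is an allow-list check for illustration, not a security sandbox.
    """
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("imports are not allowed")
        elif isinstance(node, ast.Call):
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in ALLOWED):
                problems.append("call to a function outside the library")
    return problems


def run_generated_code(source, robot):
    """Step 4: deploy the code only if the check found no problems."""
    problems = check_generated_code(source)
    if problems:
        raise ValueError("; ".join(problems))
    namespace = {"__builtins__": {}}
    namespace.update({name: getattr(robot, name) for name in ALLOWED})
    exec(source, namespace)


drone = SimulatedDrone()
print(build_prompt("Take off to 10 m, fly to (5, 5, 10), then land.", drone))

# Placeholder for the model's reply; a real system would obtain it from the LLM.
generated = "takeoff(10)\nfly_to(5, 5, 10)\nland()\n"
run_generated_code(generated, drone)
print(drone.log)
```

In this sketch, a rejected program would be the point at which a user gives the model text feedback and asks for a revised version, which is the human-in-the-loop role the article describes.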
The sources for this piece include an article in Ars Technica.