Description: SynergAI
Recently, large language models (LLMs) have shown strong potential in facilitating human-robot interaction and collaboration. However, existing LLM-based systems often overlook the misalignment between human and robot perceptions, which hinders effective communication and real-world robot deployment. To address this issue, we introduce SYNERGAI, a unified system designed to achieve both perceptual alignment and human-robot collaboration. At its core, SYNERGAI employs a 3D Scene Graph (3DSG) as its explicit and innate representation.
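To make the representation concrete, below is a minimal sketch of a 3DSG in Python, assuming each node stores an object label, a 3D position, and free-form attributes, and each edge stores a spatial relation between two objects. The class and field names are illustrative, not SYNERGAI's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectNode:
    node_id: int
    label: str                                  # e.g. "book"; may later be corrected
    position: tuple[float, float, float]        # object center in world coordinates
    attributes: dict[str, str] = field(default_factory=dict)

@dataclass
class Relation:
    subject: int                                # node_id of the related object
    predicate: str                              # e.g. "on", "next to"
    reference: int                              # node_id of the reference object

@dataclass
class SceneGraph:
    nodes: dict[int, ObjectNode] = field(default_factory=dict)
    relations: list[Relation] = field(default_factory=list)

    def objects_related_to(self, predicate: str, reference_id: int) -> list[ObjectNode]:
        """Resolve references such as 'the object on the blue box'."""
        return [self.nodes[r.subject] for r in self.relations
                if r.predicate == predicate and r.reference == reference_id]
```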
Leveraging the 3DSG as its representation, SYNERGAI uses LLMs to decompose complex tasks and invokes our designed tools as actions in intermediate steps. It interacts with humans through natural language and non-verbal mouse clicks to disambiguate object references, and it supports human-robot collaboration and perceptual alignment by automatically modifying the data stored in the 3DSG.
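As a rough illustration of perceptual alignment, the sketch below assumes the user clicks an object and supplies the correct label, and the system overwrites the robot's mistaken label in the 3DSG. `align_label` is a hypothetical helper operating on the illustrative `SceneGraph` type from the sketch above, not one of SYNERGAI's actual tools.

```python
def align_label(graph: "SceneGraph", clicked_node_id: int, user_label: str) -> None:
    """Overwrite the robot's (possibly wrong) label with the human's correction,
    so the 3DSG matches human perception from then on."""
    node = graph.nodes[clicked_node_id]
    if node.label != user_label:
        node.attributes["previous_label"] = node.label  # keep a trace of the mismatch
        node.label = user_label
```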
SYNERGAI represents the 3D scene with a 3DSG and leverages LLMs to respond to user inputs. It is first prompted to generate a plan, which decomposes the input task into sub-tasks to be solved sequentially. At each step, SYNERGAI selects a tool as its action based on the observation, which contains the results of the previous actions. In this example, the system identifies the correct object for the relationship "on the blue box" but incorrectly recognizes it as a book, where perception misalignment occurs.
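This plan-then-act loop can be sketched as follows, assuming a hypothetical `llm` callable that returns text and a `TOOLS` registry mapping tool names to functions over the scene graph. The prompt wording and the `<tool>|<json args>` reply convention are illustrative stand-ins, not SYNERGAI's actual prompting or tool interface.

```python
import json
from typing import Callable

# Hypothetical tool registry: tool name -> function acting on the scene graph.
TOOLS: dict[str, Callable] = {}

def run_task(llm: Callable[[str], str], task: str, graph: "SceneGraph") -> str:
    """Decompose a task with the LLM, then act tool-by-tool, feeding each
    action's result back in as the next step's observation."""
    plan = llm(f"Decompose this task into sub-tasks, one per line: {task}").splitlines()
    observation = ""
    for sub_task in plan:                        # sub-tasks solved sequentially
        # The LLM selects the next action based on the previous actions' results.
        reply = llm(f"Sub-task: {sub_task}\nObservation: {observation}\n"
                    f"Reply as <tool>|<json args>. Available tools: {list(TOOLS)}")
        name, args = reply.split("|", 1)
        observation = TOOLS[name](graph, **json.loads(args))
    return observation
```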