Moving towards AI agents that can navigate virtual environments and answer natural language questions.
Embodied Question Answering is a new AI task in which an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?"). To answer, the agent must first navigate intelligently to explore the environment, gather information through first-person (egocentric) vision, and then answer the question ("orange").
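To make the two-phase structure of an episode concrete, here is a minimal sketch of the navigate-then-answer loop. All names (EqaEnv, navigate_and_answer, the random action picker, the hard-coded answer) are illustrative stand-ins, not the authors' actual API: in the real system the environment is House3D and both the navigation and answering modules are learned PyTorch models.

```python
"""Minimal sketch of one Embodied Question Answering episode.

EqaEnv is a toy stand-in for a 3D environment; the navigation policy
and answering module below are placeholders for learned models.
"""

import random


class EqaEnv:
    """Toy environment: spawns an agent, poses a question, renders frames."""

    ACTIONS = ("forward", "turn-left", "turn-right", "stop")

    def reset(self):
        # Spawn the agent at a random location and pose a question.
        self.steps = 0
        question = "What color is the car?"
        return self._render(), question

    def step(self, action):
        # Apply a navigation action and return the next egocentric frame.
        self.steps += 1
        return self._render()

    def _render(self):
        # Placeholder for an egocentric RGB frame (H x W x 3 pixels).
        return [[(0, 0, 0)] * 4 for _ in range(4)]


def navigate_and_answer(env, max_steps=50):
    frame, question = env.reset()
    # Phase 1 -- navigation: explore until the policy emits "stop"
    # or the step budget runs out.
    for _ in range(max_steps):
        action = random.choice(EqaEnv.ACTIONS)  # stand-in for a learned planner
        if action == "stop":
            break
        frame = env.step(action)
    # Phase 2 -- answering: predict an answer from the gathered observations.
    answer = "orange"  # stand-in for a learned VQA-style answering module
    return question, answer


if __name__ == "__main__":
    q, a = navigate_and_answer(EqaEnv())
    print(f"Q: {q}\nA: {a}")
```

The key design point the sketch illustrates is that perception, navigation, and answering are separate stages of a single episode: the agent receives no floor plan or map, only the question and a stream of first-person frames it must actively collect before answering.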
We are grateful to the developers of PyTorch for building an excellent framework. We thank Yuxin Wu for help with the House3D environment. This work was funded in part by NSF CAREER awards to DB and DP, ONR YIP awards to DP and DB, ONR Grant N00014-14-1-0679 to DB, ONR Grant N00014-16-1-2713 to DP, an Allen Distinguished Investigator award to DP from the Paul G. Allen Family Foundation, Google Faculty Research Awards to DP and DB, Amazon Academic Research Awards to DP and DB, and an AWS in Education Research grant.