
Microsoft Research has unveiled GroundedPlanBench, a new benchmark for evaluating vision-language models on robot manipulation tasks. The benchmark is intended to help improve how accurately models plan actions and spatially ground them in complex environments.

This article describes how evaluations for Deep Agents are built, covering data sourcing, metric definition, and targeted experiments to improve agent performance.

Coca-Cola CEO James Quincey and former Walmart CEO Doug McMillon attribute their departures to the need for new leadership that can navigate the AI transition. Both emphasize that understanding AI is important for future growth.