Leverage Turing Intelligence capabilities to integrate AI into your operations, enhance automation, and optimize cloud migration for scalable impact.
Advance foundation model research and improve LLM reasoning, coding, and multimodal capabilities with Turing AGI Advancement.
Access a global network of elite AI professionals through Turing Jobs—vetted experts ready to accelerate your AI initiatives.
Graphical user interfaces (GUIs) play an important role in how we engage with software, yet intelligent agents are rarely evaluated on them in depth. That's where UI-Vision comes in. Developed by ServiceNow researchers and collaborators, with Turing's support, this desktop-focused benchmark allows for thorough evaluation of AI agents as they navigate diverse and dynamic software environments.
This collaboration reflects Turing’s commitment to enhancing AI infrastructure and paving the way for practical AGI applications, which depend on improving how models understand, interact with, and learn from software's visual workflows.
GUIs are visual interfaces—toolbars, icons, dropdowns—that make modern software intuitive. Unlike web-based environments with structured HTML or mobile screens optimized for touch, desktop GUIs are complex, inconsistent, and more challenging for agents to interpret.
Today’s agents often struggle with visual grounding, spatial reasoning, and dynamic actions like drag-and-drop. These limitations have slowed progress in building fully autonomous systems that can navigate software like humans.
UI-Vision's tasks help assess an agent’s perception, reasoning, and action capabilities, laying the foundation for more capable, adaptive AI.
Turing has been essential in crafting the high-quality training and evaluation data that powers UI-Vision. Our global annotation team, based primarily in India and Latin America, skillfully managed the most intricate aspects of developing this benchmark.
Key Contributions:
This human-in-the-loop effort reflects Turing’s belief that human intelligence is a core differentiator in building real-world AI systems.
UI-Vision enables structured, measurable evaluation of GUI agents across three levels of understanding: identifying elements, interpreting layouts, and predicting interactions.
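To make the three levels concrete, here is a minimal Python sketch of how each one could be scored offline. It is an illustration only, not UI-Vision's actual evaluation code: the `Box` format, the function names, and the metric choices (point-in-box for element identification, intersection-over-union for layout regions, exact match on action type for interaction prediction) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical annotation format)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def area(self) -> float:
        # Degenerate boxes (right < left or bottom < top) have zero area.
        return max(0.0, self.right - self.left) * max(0.0, self.bottom - self.top)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes; 0.0 when they do not overlap."""
    inter = Box(max(a.left, b.left), max(a.top, b.top),
                min(a.right, b.right), min(a.bottom, b.bottom)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def element_accuracy(points: Sequence[Tuple[float, float]],
                     targets: Sequence[Box]) -> float:
    """Assumed metric: a predicted click counts if it lands inside the target box."""
    if not targets:
        return 0.0
    hits = sum(box.contains(x, y) for (x, y), box in zip(points, targets))
    return hits / len(targets)


def mean_layout_iou(predicted: Sequence[Box], targets: Sequence[Box]) -> float:
    """Assumed metric: average IoU between predicted and annotated regions."""
    if not targets:
        return 0.0
    return sum(iou(p, t) for p, t in zip(predicted, targets)) / len(targets)


def action_type_accuracy(predicted: Sequence[str], targets: Sequence[str]) -> float:
    """Assumed metric: exact match on the action label (e.g. "click", "drag")."""
    if not targets:
        return 0.0
    return sum(p == t for p, t in zip(predicted, targets)) / len(targets)


# Toy example with made-up data: one of two predicted clicks hits its target.
boxes = [Box(10, 10, 30, 30), Box(100, 100, 140, 120)]
clicks = [(15, 15), (200, 50)]
element_score = element_accuracy(clicks, boxes)
```

Keeping one function per level means the scores can be reported separately, so a model that locates elements well but mispredicts interactions is easy to spot.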
Potential Impacts:
Limitations:
UI-Vision is currently an offline benchmark: it does not evaluate real-time interactions, and it does not yet explore multi-agent collaboration. Future extensions could address these gaps.
UI-Vision is crucial in assessing and enhancing intelligent agents in complex desktop environments. Turing’s role in powering the annotation process with rigorous human oversight highlights our commitment to improving AI infrastructure, enabling practical applications, and boosting real-world model performance.
This demonstrates how Turing AGI Advancement and Turing Intelligence contribute to advancing AI, linking foundational research with applied enterprise impact.