AGI Advance: Weekly AI & AGI Insights (July 22, 2025)

Turing Staff
23 Jul 2025 · 3 min read

Welcome to AGI Advance, Turing’s weekly briefing on AI breakthroughs, AGI research, and industry trends.

This week, we explore how browser-native agents are redefining automation by reasoning over live web state and learning from interaction, not scripts. We also revisit the four foundational capabilities needed to reach ASI, spotlight a new safety evaluation benchmark, and highlight fresh research on communication-efficient language models and proactive LLM collaboration frameworks.

What we're thinking

This week, we’ve been exploring browser-native agents—LLM-powered systems designed to operate inside real-world web interfaces, not sandboxed environments.

Here’s what’s emerging:

  • Static selectors no longer cut it: Brittle, selector-based automations are giving way to agents that reason over live browser state and act on semantic intent.
  • Training data comes from interaction, not scripts: Instead of relying on hand-scripted examples, agents are increasingly learning from real user behavior, including clicks, scrolls, and UI transitions, to build feedback loops that reflect real-world usage.
  • The web is now a programmable interface: Multi-step task planning, verification, and memory persistence are turning the web into a dynamic substrate for agent training—bridging perception, reasoning, and action.

As real-world environments go agentic, the modern browser may become the most powerful training ground for goal-driven intelligence.
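To make that loop concrete, here is a minimal observe-plan-act sketch. All names are illustrative: a real browser-native agent would call an LLM for planning and a browser driver such as Playwright for actions, and here a rule-based stub stands in for the model.

```python
# Minimal sketch of a browser-native agent loop (illustrative names only;
# a real agent would use an LLM planner and a browser driver).

def plan_action(goal, page_state):
    """Stand-in for an LLM call: choose the next action from semantic
    intent, not from a hard-coded selector."""
    for element in page_state["elements"]:
        if goal.lower() in element["label"].lower():
            return {"type": "click", "target": element["label"]}
    return {"type": "done"}

def run_agent(goal, page_state, max_steps=5):
    """Observe live state -> plan -> act, logging every interaction so
    the trace can feed a training feedback loop."""
    trace = []
    for _ in range(max_steps):
        action = plan_action(goal, page_state)
        trace.append(action)
        if action["type"] == "done":
            break
        # Acting mutates live state; here we simulate the UI transition.
        page_state["elements"] = [
            e for e in page_state["elements"] if e["label"] != action["target"]
        ]
    return trace

state = {"elements": [{"label": "Subscribe"}, {"label": "Settings"}]}
print(run_agent("subscribe", state))
# -> [{'type': 'click', 'target': 'Subscribe'}, {'type': 'done'}]
```

The key design point is that the planner sees the current page state on every step, so the same loop keeps working when the UI changes underneath it.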

What we're saying

🗣️ Jonathan Siddharth, Founder & CEO:

“The road to ASI runs through four pillars: multimodality, reasoning, tool use, and coding.”

In a recent post, Jonathan outlined the foundational capabilities frontier models must master to reach artificial superintelligence. From multimodal understanding to planning, from tool invocation to self-improvement via code—these aren’t nice-to-haves, they’re necessities.

At Turing, we’re focused on helping leading labs advance along all four dimensions—because solving ASI is key to solving the world’s hardest problems.

What we're reading

  • ROSE: Toward Reality-Oriented Safety Evaluation of Large Language Models
    This paper introduces ROSE, a new safety evaluation framework for LLMs that generates adversarial prompts using multi-objective reinforcement learning. Unlike static benchmarks or prior RFT-based methods, ROSE emphasizes topic-level diversity and contextual realism, leading to more varied and effective attacks. It outperforms existing approaches on key metrics such as topic diversity (topic-D%) and F1%, achieving a +30% improvement in integrated safety scores across state-of-the-art models like GPT-4o, Gemini-2, and Qwen-Turbo. The framework also powers the new ROSEset dataset—36,000+ prompts—designed for high-coverage, reality-aligned adversarial testing.
  • Federated Learning-Enabled Hybrid Language Models for Communication-Efficient Token Transmission
    This paper introduces FedHLM, a federated learning framework that makes hybrid language models (HLMs) more communication-efficient and reduces LLM offloading by training personalized uncertainty thresholds and enabling peer-to-peer token reuse. It achieves a 95%+ reduction in LLM transmissions, with 94% of tokens resolved locally and 93.2% inference accuracy, approaching centralized baselines with far lower communication overhead.
  • COLLABLLM: From Passive Responders to Active Collaborators
    This paper introduces COLLABLLM, a training framework that equips LLMs with multiturn awareness, shifting them from reactive responders to proactive collaborators. By simulating conversations and optimizing for multiturn-aware rewards, COLLABLLM improves task success, conversational efficiency, and interactivity. Across document editing, coding, and math tasks, it outperforms baselines with an 18.5% improvement in task performance, 13.3% boost in efficiency, and 46.3% higher interactivity. In a user study with 201 participants, it also increased user satisfaction by 17.6% while cutting interaction time by 10.4%.
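As a rough illustration of FedHLM's core idea, the sketch below routes each token by comparing the small model's confidence against an uncertainty threshold, falling back to peer token reuse and only then to LLM offload. Function and field names here are ours, not the paper's API.

```python
# Illustrative sketch of threshold-based token routing in a hybrid LM
# (simplified; names are ours, not FedHLM's actual interface).

def route_token(confidence, threshold, peer_cache, context):
    """Resolve a token locally when the small model is confident enough,
    try peer-to-peer reuse next, and offload to the LLM as a last resort."""
    if confidence >= threshold:
        return "local"
    if context in peer_cache:  # a peer already resolved this context
        return "peer"
    return "llm_offload"

cache = {"the quick brown": "fox"}
print(route_token(0.92, 0.8, cache, "hello"))            # -> local
print(route_token(0.40, 0.8, cache, "the quick brown"))  # -> peer
print(route_token(0.40, 0.8, cache, "unseen context"))   # -> llm_offload
```

In the paper the threshold itself is learned per client via federated training, which is what keeps offloads rare without sacrificing accuracy.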
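To illustrate the intuition behind COLLABLLM's multiturn-aware rewards, this toy calculation scores a candidate reply by the discounted value of simulated future turns rather than the immediate turn alone. The discounting scheme is a generic stand-in, not the paper's exact objective.

```python
# Toy sketch of a multiturn-aware reward: value a reply by where it
# leads the conversation, not only by its immediate score.

def multiturn_aware_reward(immediate_reward, simulated_future_rewards, gamma=0.9):
    """Immediate reward plus the discounted rewards of simulated
    follow-up turns."""
    future = sum(gamma ** (t + 1) * r
                 for t, r in enumerate(simulated_future_rewards))
    return immediate_reward + future

# A reply that scores lower now but sets up better later turns can win.
greedy = multiturn_aware_reward(0.9, [0.2, 0.1])
collaborative = multiturn_aware_reward(0.6, [0.8, 0.7])
print(round(greedy, 3), round(collaborative, 3))  # -> 1.161 1.887
```

This is the shift the paper describes: optimizing over the simulated conversation makes the model prefer clarifying, collaborative moves over locally optimal one-shot answers.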

Where we’ll be

Turing will be at two major AI conferences in the coming months—join us to discuss the future of AGI:

  • KDD 2025 [Toronto, ON, Canada | Aug 3 – 7]
    The ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) focuses on innovative research in data mining, knowledge discovery, and large-scale data analytics.
  • COLM 2025 [Montreal, Canada | Oct 7 – 10]
    The Conference on Language Modeling (COLM) aims to create a community of researchers with expertise in different disciplines, focused on understanding, improving, and critiquing the development of LM technology.

If you’re attending, reach out—we’d love to connect and exchange insights!

Stay ahead with AGI Advance

Turing is leading the charge in bridging AI research with real-world applications. Subscribe to AGI Advance for weekly insights into breakthroughs, research, and industry shifts that matter.

[Subscribe & Read More]

Want to accelerate your business with AI?

Talk to one of our solutions architects and start innovating with AI-powered talent.

Get Started