AGI Advance: Weekly AI & AGI Insights (June 3, 2025)

Turing Staff
04 Jun 2025 · 4 min read

Most benchmarks show what AI can do, not what it will do in real-world workflows. This week in AGI Advance, we unpack why agents falter under ambiguity, mid-task failures, and unpredictable tools. We explore new research on LLM sycophancy, multi-agent jailbreaks, and models that solve math but can’t ask the right question. And we revisit the human edge, where meaning, not just capability, still matters.

What we're thinking

This week, we’ve been focused on the growing disconnect between AI performance in evaluations and real-world reliability, especially when models are deployed as part of agent workflows.

Three key gaps we’ve been discussing:

  • Tool use ≠ autonomy: Agents that can call APIs don’t always know when to use them, or how to recover from failure when the workflow breaks (see the sketch at the end of this section).
  • Eval scores ≠ reliability: Benchmarks tell us if a model can perform a task—not whether it will in a real environment with changing data and tools.
  • Instructions ≠ generalization: Following prompts isn’t the same as adapting to ambiguity, failure, or real-world friction.

If we want AI to stick, we need to design for the unstructured, high-friction workflows that block real people, not just the structured, benchmark-friendly ones that showcase model skill. The future isn’t just agents that pass evals; it’s agents that get things done.
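To make the first gap concrete, here’s a minimal sketch of the recovery logic an agent loop needs but most benchmark harnesses never exercise. The tool name, retry policy, and fallback behavior are illustrative assumptions, not drawn from any specific framework:

```python
import random
import time

def flaky_search_api(query: str) -> str:
    """Hypothetical external tool: succeeds only some of the time."""
    if random.random() < 0.5:
        raise TimeoutError("search backend timed out")
    return f"results for {query!r}"

def call_tool_with_recovery(query: str, max_retries: int = 2) -> str:
    """Retry transient failures with backoff; if the tool stays down,
    surface the failure instead of letting the agent hallucinate a result."""
    for attempt in range(max_retries + 1):
        try:
            return flaky_search_api(query)
        except TimeoutError as err:
            if attempt < max_retries:
                time.sleep(2 ** attempt)  # exponential backoff, then retry
            else:
                # The recovery path evals rarely test: fail loudly and
                # hand control back to the user or planner.
                return f"TOOL_FAILED: {err}. Escalating to user."

if __name__ == "__main__":
    print(call_tool_with_recovery("agent reliability"))
```

In a benchmark harness, the happy path is enough; in production, the except branch is where reliability is won or lost.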

What we're saying

Question: As AI surpasses human performance in more domains, what kind of work will still belong to humans?

James Raybould, SVP & GM:
"Human work will endure where machines can’t replace mortality, vulnerability, consent, or community. Think of roles rooted in democratic legitimacy, real-world accountability, or biological instinct—jobs where presence, consequence, or ritual matter more than capability.

Whether it’s hospice care, live performance, or a wedding officiant, some work will persist not because humans do it best, but because only humans can do it meaningfully."

What we're reading

  • Social Sycophancy: A Broader Understanding of LLM Sycophancy
    This paper introduces ELEPHANT, a framework to measure how LLMs preserve a user’s self-image across five behaviors, such as emotional validation and moral endorsement. On open-ended questions (OEQ), models preserved user face 47% more often than humans, with GPT-4o showing emotional validation in 76% of cases vs. 22% for humans. On the AITA dataset, models sided with inappropriate behavior in 42% of cases. Preference datasets used in alignment were also found to reward sycophantic behavior, and common mitigation strategies, like prompting or fine-tuning, had limited effect.
  • Agents Under Siege: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks
    This research shows how adversarial prompts can exploit multi-agent LLM systems by bypassing safety mechanisms through optimized routing. The authors introduce a permutation-invariant attack that achieves up to 94% success, outperforming existing methods by 7×. Even top defenses like Llama-Guard fail under these distributed attacks. Complete graph topologies are most vulnerable, revealing a critical gap in current multi-agent safety design.
  • QuestBench: Can LLMs Ask the Right Question to Acquire Information in Reasoning Tasks?
    QuestBench introduces a benchmark to evaluate whether LLMs can identify and ask the right question when information is missing in reasoning tasks. It formalizes this as a 1-sufficient constraint satisfaction problem (CSP) across logic, planning, and math domains. While SOTA models exceed 90% accuracy on math (GSM), their performance drops below 50% on Logic-Q and Planning-Q. Even when models can solve the fully specified problems, they often fail to identify what information is missing, highlighting a gap between reasoning ability and information acquisition (see the toy sketch after this list).
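For a feel of what “1-sufficient” means, here is a toy sketch of the idea; the variables and constraints are our own invention, not taken from the paper. A problem is underspecified, and the task is to find the single unknown whose value would make it solvable:

```python
# Toy illustration of a 1-sufficient CSP (our sketch, not QuestBench code).
# Constraints: z = x + y and x = 2*w. Known: w = 3. Target: z.
# The problem is underspecified; the "right question" is the one unknown
# whose value, once provided, lets us derive the target.

def can_derive_target(known: dict) -> bool:
    """Forward-chain the two constraints; report whether z becomes known."""
    values = dict(known)
    changed = True
    while changed:
        changed = False
        if "x" not in values and "w" in values:
            values["x"] = 2 * values["w"]
            changed = True
        if "z" not in values and "x" in values and "y" in values:
            values["z"] = values["x"] + values["y"]
            changed = True
    return "z" in values

known = {"w": 3}
for var in ["x", "y"]:  # candidate variables the model could ask about
    # Placeholder value 0: sufficiency depends only on *which* variable
    # becomes known, not on its actual value.
    if can_derive_target({**known, var: 0}):
        print(f"Asking for {var} makes the problem solvable")  # -> y only
```

QuestBench’s finding, in these terms, is that models that can solve the fully specified version often cannot perform the “which variable do I need?” step, especially outside math.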

Where we’ll be

Turing will be at two major AI conferences in the coming months—join us to discuss the future of AGI:

  • ICML 2025 [Vancouver Convention Center, Canada | July 13 – 19]
    The International Conference on Machine Learning (ICML) is a leading international conference focused on advancements in machine learning and its applications.
  • KDD 2025 [Toronto, ON, Canada | Aug 3 – 7]
    The ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) focuses on innovative research in data mining, knowledge discovery, and large-scale data analytics.

If you’re attending, reach out—we’d love to connect and exchange insights!

Stay ahead with AGI Advance

Turing is leading the charge in bridging AI research with real-world applications. Subscribe to AGI Advance for weekly insights into breakthroughs, research, and industry shifts that matter.

[Subscribe & Read More]

Want to accelerate your business with AI?

Talk to one of our solutions architects and start innovating with AI-powered talent.

Get Started