AGI Advance: Weekly AI & AGI Insights (Mar 24, 2026)

Turing Staff
24 Mar 20263 mins read
LLM training and enhancement
AGI_Advance_Newsletter

This week we’re introducing Project Lazarus, a way for founders and engineering leaders to share private codebases and the operational artifacts around them (Jira tickets, PRDs, architecture docs, support threads) to help train frontier AI models on real-world software workflows, while earning payouts for eligible data. We’re also celebrating our contribution to Enterprise Ops Gym, ServiceNow’s new benchmark for evaluating enterprise agents across end-to-end, multi-system operations.

What we're doing

This week, we’re introducing Project Lazarus, a way for founders and engineering leaders to share private codebases and the operational artifacts around them (Jira tickets, PRDs, architecture docs, support threads) to help train frontier AI models in coding and real-world software workflows.

What you can share (and earn for):

  • Legacy codebases (COBOL, .NET, Java, etc.): typically $10K–$100K+ depending on complexity
  • Modern codebases: $1,000 per PR, up to $10K per codebase
  • Org artifacts (tickets, specs, docs, support threads): $50 per artifact when bundled with a codebase

Get started

What we're celebrating

🎉 Turing × ServiceNow

We’re excited to share that Turing contributed to EnterpriseOps-Gym, ServiceNow’s new enterprise agent benchmark. EnterpriseOps-Gym moves beyond short-horizon tool calls to evaluate end-to-end enterprise operations across realistic, multi-system workflows. 

Turing’s contributions included: 

  • 1,000+ enterprise prompts across 8 operational scenarios
  • 7–30 step execution paths to test long-horizon planning
  • Expert reference trajectories with structured reasoning, tool traces, and deterministic verification harnesses for correctness, compliance, and side effect isolation

Read the paper

What we're reading

  • Transformers are Bayesian Networks
    This paper argues that sigmoid-based transformers are inherently Bayesian networks, formally proving that each forward pass performs belief propagation (BP) on an implicit factor graph defined by the model’s weights. It shows that attention acts as AND (gathering evidence) and the FFN as OR (combining evidence), matching classical probabilistic inference. With specific weights, transformers can perform exact Bayesian inference, and this behavior is theoretically unique to the architecture.
    The paper further claims that hallucinations arise from lack of grounding, since ungrounded models lack a finite concept space required for verifiable inference, framing hallucination as a structural limitation rather than a scaling issue.
  • Measuring Progress Toward AGI: A Cognitive Framework
    Google DeepMind presents a cognitive framework for evaluating AGI, addressing the lack of clear, standardized metrics for measuring progress. It introduces a taxonomy of 10 cognitive faculties (including reasoning, memory, learning, attention, and problem solving) to decompose general intelligence into measurable components. The authors propose a three-step evaluation protocol: assess models on targeted, held-out cognitive tasks, compare performance to human baselines, and construct “cognitive profiles” that map strengths and weaknesses across faculties.
  • IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs
    OpenAI introduces IH-Challenge, an RL training dataset designed to improve how LLMs resolve conflicting instructions across roles (system, developer, user, tool). It focuses on adversarial scenarios like jailbreaks and prompt injection, where models must prioritize higher-trust instructions. Fine-tuning GPT-5-Mini on IH-Challenge improves robustness by +10% (84.1% → 94.1%), reduces unsafe behavior from 6.6% to 0.7%, and generalizes to unseen attacks and human red-teaming. The dataset uses programmatically verifiable tasks + adversarial generation to avoid reward hacking and shortcut learning.

Where we’ll be

🔹 ICLR 2026
📍  Rio de Janeiro, Brazil | 🗓️ April 23 - 27

ICLR focuses on cutting-edge research in deep learning, highlighting advancements in representation learning, optimization, and AI theory.

Stay ahead with AGI Advance

Turing is leading the charge in bridging AI research with real-world applications. Subscribe to AGI Advance for weekly insights into breakthroughs, research, and industry shifts that matter.

[Subscribe & Read More]

Ready to Optimize Your Model for Real-World Needs?

Partner with Turing to fine-tune, validate, and deploy models that learn continuously.

Optimize Continuously