This week’s edition explores a new frontier in training data. With the launch of Project Lazarus, we’re preserving the full operational history of defunct startups (codebases, design docs, testing artifacts, and infrastructure traces) so agents can train on the messy, high-stakes reality of building real products. We also discuss how Mistral’s OCR 3 redefines document parsing for agents, a skill-learning RL agent framework unveiled by AWS and UW, and new research warning of homogenized LLM outputs even in open-ended tasks.
Last week, we launched Project Lazarus, a groundbreaking initiative to preserve the complete operational history of defunct companies as training data for frontier models. While today’s datasets are optimized for publication, instruction, or synthetic control, they fail to capture the texture of real work: incomplete specs, deadline-driven tradeoffs, and human judgment under pressure.
Here’s what we’re doing:
💡 The world doesn’t need more synthetic examples, it needs data grounded in reality. Project Lazarus is how we preserve it.
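To make the idea concrete, the kinds of artifacts described above could be captured as structured records and serialized into training examples. This is a minimal, hypothetical sketch: the schema, field names, and example values are illustrative assumptions, not Project Lazarus’s actual format.

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical record for one preserved artifact from a defunct company's
# operational history. Field names are illustrative, not an actual schema.
@dataclass
class ArtifactRecord:
    company: str
    artifact_type: str   # e.g. "codebase", "design_doc", "test_log", "infra_trace"
    path: str            # location within the preserved archive
    content: str         # raw text of the artifact
    timestamp: str       # ISO 8601 date, if recoverable

def to_jsonl(records):
    """Serialize artifact records to JSONL, one training example per line."""
    return "\n".join(json.dumps(asdict(r)) for r in records)

# Illustrative example only; not real company data.
records = [
    ArtifactRecord("example-startup", "design_doc", "docs/spec-v2.md",
                   "Draft spec, incomplete: auth flow TBD", "2019-04-02"),
]
print(to_jsonl(records))
```

The JSONL shape keeps each artifact self-describing, so downstream pipelines can filter by artifact type or company without re-parsing the archive.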
🎉 Databricks × Turing: OfficeQA for Grounded Enterprise Reasoning
Databricks released OfficeQA, a benchmark for long-context, cross-document reasoning on real-world PDFs, built with question contributions from Turing. The dataset spans 246 QA pairs over 89,000+ pages of U.S. Treasury Bulletins, testing AI agents on analytical depth, retrieval precision, and answer grounding. It sets a new bar for enterprise-relevant evaluation.
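For a sense of how a QA benchmark like this is scored, here is a minimal evaluation sketch using lenient exact-match over normalized answer strings. The JSONL field names (`id`, `answer`) and the example data are assumptions for illustration, not the released OfficeQA schema or metric.

```python
import re

# Minimal scoring sketch for a long-context QA benchmark.
# Field names and example data are illustrative assumptions.
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace for lenient string comparison."""
    return re.sub(r"\s+", " ", text.strip().lower())

def exact_match_score(predictions: dict, qa_pairs: list) -> float:
    """Fraction of questions whose predicted answer matches the reference."""
    hits = sum(
        normalize(predictions.get(qa["id"], "")) == normalize(qa["answer"])
        for qa in qa_pairs
    )
    return hits / len(qa_pairs)

# Toy data: one correct prediction, one incorrect.
qa_pairs = [
    {"id": "q1", "answer": "example answer A"},
    {"id": "q2", "answer": "example answer B"},
]
preds = {"q1": "Example  Answer A", "q2": "wrong answer"}
print(exact_match_score(preds, qa_pairs))  # 0.5
```

Real benchmarks of this kind typically add grounding checks (did the cited pages actually support the answer?) on top of answer accuracy; that layer is omitted here for brevity.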
Turing is leading the charge in bridging AI research with real-world applications. Subscribe to AGI Advance for weekly insights into breakthroughs, research, and industry shifts that matter.
Partner with Turing to fine-tune, validate, and deploy models that learn continuously.