Turing delivered a dataset of more than 200 Python notebooks generated from natural language prompts. Each notebook followed a standardized structure covering data loading, exploration, modeling, and artifact generation. The notebooks were designed for traceability, reproducibility, and CUJ (customer user journey) coverage.

The client needed a high-quality dataset of real-world, structured Python notebooks. The goal was to standardize multi-step analytical workflows across variable tasks while ensuring the output was correct, modular, and visually documented.
Turing implemented a rigorous notebook generation and review protocol with built-in traceability.
The protocol covered four areas: task framing, notebook structure, artifact management, and the reviewer protocol.
This dataset directly powered the launch of a new natural-language-to-notebook generation feature in the client’s platform. Users can now enter a single prompt into the client’s notebook interface and receive a complete, multi-step data science workflow in response.
This capability is enabled by the prompt-response structures and labeled notebooks delivered by Turing.
Request a dataset of labeled Python notebooks built from real prompts and datasets with modular steps, artifacts, and CUJ mappings.
Each notebook starts from a natural language task prompt and includes a complete, step-labeled ML workflow from data loading and exploration to modeling and evaluation.
Each step includes a markdown label, the prompt it answers, the libraries used, and a series of code cells.
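A step with a markdown label, its prompt, its libraries, and a series of code cells can be represented as a simple schema. The sketch below is illustrative only: the `NotebookStep` class, field names, and simplified dict-based cell format are assumptions (modeled loosely on the notebook cell format), not the delivered schema.

```python
from dataclasses import dataclass, field
import json

@dataclass
class NotebookStep:
    """Illustrative schema for one labeled step of a generated notebook."""
    markdown_label: str                   # human-readable step heading
    prompt: str                           # the sub-prompt this step answers
    libraries: list                       # libraries the step's code relies on
    code_cells: list = field(default_factory=list)  # source of each code cell

    def to_cells(self):
        """Flatten the step into notebook-style cells: one markdown cell, then code."""
        cells = [{"cell_type": "markdown",
                  "source": f"## {self.markdown_label}\n\n*Prompt:* {self.prompt}"}]
        cells += [{"cell_type": "code", "source": src} for src in self.code_cells]
        return cells

# Hypothetical example step
step = NotebookStep(
    markdown_label="Data loading",
    prompt="Load the sales table and preview the first rows.",
    libraries=["pandas"],
    code_cells=["import pandas as pd\ndf = pd.read_parquet('sales.parquet')\ndf.head()"],
)
print(json.dumps(step.to_cells(), indent=2))
```

Keeping the label, prompt, and libraries attached to each step is what makes a notebook traceable back to the natural language task that produced it.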
The dataset spans classification, regression, clustering, time series forecasting, geospatial analysis, and statistical testing, mapped to real-world BigQuery tables across multiple domains.
Each step produces saved outputs such as .png charts, .json visualizations, and .parquet tables. All artifacts are verified and aligned to the step that generated them.
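Verifying artifacts and aligning them to the step that generated them can be sketched as a manifest keyed by step label, with a checksum per file so reviewers can confirm nothing drifted. The `record_artifact` helper, file names, and manifest layout below are hypothetical, a minimal sketch of the idea rather than the delivered tooling.

```python
import hashlib
import json
from pathlib import Path

def record_artifact(manifest: dict, step_label: str, path: Path) -> None:
    """Register an artifact under the step that produced it, with a SHA-256
    checksum so the file can be verified against the manifest later."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    manifest.setdefault(step_label, []).append(
        {"file": path.name, "sha256": digest, "bytes": path.stat().st_size}
    )

# Hypothetical example: a modeling step saves a chart and a table,
# then records both under its label. Placeholder bytes stand in for
# real .png / .parquet contents.
manifest: dict = {}
out = Path("artifacts")
out.mkdir(exist_ok=True)
chart = out / "churn_by_segment.png"
chart.write_bytes(b"\x89PNG...")
table = out / "model_scores.parquet"
table.write_bytes(b"PAR1...")
record_artifact(manifest, "Modeling", chart)
record_artifact(manifest, "Modeling", table)
(out / "manifest.json").write_text(json.dumps(manifest, indent=2))
```

A manifest of this shape gives reviewers a single place to check that every saved output exists, is unmodified, and belongs to the step that claims it.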
This dataset is well suited for training or benchmarking agents that translate natural language into multi-step data science workflows with modular logic.
All notebooks were authored and reviewed by human data scientists.
Access requires a standard mutual NDA; Turing provides the countersigned agreement within one business day.
The dataset is delivered within three business days after NDA execution.