Pasta-1T captures billions of tokens generated from agentic interactions across diverse websites in several languages and categories. This comprehensive dataset provides a rich source of machine-validated web trajectories, reflecting heterogeneous, multicultural browsing behavior. It aims to be the largest public dataset dedicated specifically to web trajectories.
The evolution of autonomous agents represents a fundamental shift in how we think about automation. While current pre-training datasets treat the web as a sequence of tokens, the agentic web is driven by behavior emerging from the interaction between the agent and its environment. This behavioral information captures the consequences of agentic actions rather than providing a static description of the world. Pasta-1T addresses this gap by providing a comprehensive dataset of validated web interaction trajectories, capturing the full context and behavioral patterns needed for training reliable agents. By making this resource available to the community, we hope to support the emergence of more trustworthy and reliable open web agents.
Today's agents typically boast near-80% accuracy on academic benchmarks, which are not representative of real-world applications. Just as self-driving vehicles must meet a much higher standard than human-driven cars, autonomous agents must achieve a similarly high level of trustworthiness (99.9% or better) to be truly hands-off.
By releasing and supporting Pasta-1T, we aim to help the community bridge the usefulness gap so that agents can graduate from simple benchmarks to reliably solving complex tasks autonomously.
Specifically, this release includes:
All data processing scripts are open source and available on our repository. The dataset will be available through the open standard Minari. This is just the first of what we hope will be many contributions to the open community focused on improving AI agents.
For now, users must complete a form, as we need to track usage and address potential ethical and copyright issues. After peer review, the dataset will be made openly accessible through the proper academic channels.
The dataset consists of web trajectories generated by web agents interacting with websites. These interactions are collected through Silverstream AI's autonomous web agents in a setup that uses a language model-driven exploration policy to perform a self-defined curriculum of tasks within a browser environment. Each data point includes a self-defined task, its expected consequences, and the set of states and actions to achieve it. We ensure that the task is feasible and valuable and that the agent successfully completes it. This approach helps maintain consistency across tasks while capturing diverse interaction patterns.
We apply some filtering to ensure the dataset consists of high-quality and meaningful interactions. We discard trajectories that terminate too quickly without reaching a meaningful number of steps or state changes. This ensures that only trajectories demonstrating substantial interaction and complexity are retained to inform learning signals such as rewards.
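The length-based filter described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names and both thresholds (`min_steps`, `min_state_changes`) are assumptions, since the post does not state the cut-offs, and states are compared by simple equality.

```python
from typing import Any, Sequence


def count_state_changes(states: Sequence[Any]) -> int:
    """Count consecutive pairs of states that differ from each other."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def is_substantial(
    states: Sequence[Any],
    min_steps: int = 3,          # hypothetical threshold
    min_state_changes: int = 2,  # hypothetical threshold
) -> bool:
    """Return True if a trajectory is long enough and actually changes state."""
    # A trajectory with N recorded states contains N - 1 steps.
    num_steps = max(len(states) - 1, 0)
    return (
        num_steps >= min_steps
        and count_state_changes(states) >= min_state_changes
    )
```

A trajectory that ends after one step, or that repeats the same page state throughout, would be discarded under this sketch.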
Each trajectory is evaluated using an LLM as a judge, an approach inspired by Generalized Value Functions. The LLM judge assesses the trajectory against:
Trajectories are scored on a 1–5 scale:
Only trajectories scoring 4 or 5 are included in the dataset. To ensure non-triviality, tasks must require at least three meaningful and independent steps. The differential world model analyzes changes in the browser state at each step to confirm that interactions involve distinct and significant actions, avoiding superficial or repetitive tasks.
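The acceptance rule above (judge score of 4 or 5, plus at least three meaningful and independent steps) might look like the sketch below. It assumes the per-step output of the differential world model is already available as a set of changed state keys; that representation, the key names, and the function names are hypothetical. "Independent" is approximated here as "produces a non-empty diff not seen earlier in the trajectory".

```python
from typing import FrozenSet, Sequence

MIN_SCORE = 4           # from the text: only scores of 4 or 5 are kept
MIN_DISTINCT_STEPS = 3  # from the text: at least three meaningful steps


def count_distinct_steps(step_diffs: Sequence[FrozenSet[str]]) -> int:
    """Count steps whose state diff is non-empty and not a repeat."""
    seen = set()
    for diff in step_diffs:
        # Empty diffs are no-ops; repeated diffs are treated as repetitive.
        if diff and diff not in seen:
            seen.add(diff)
    return len(seen)


def passes_quality_gate(
    judge_score: int, step_diffs: Sequence[FrozenSet[str]]
) -> bool:
    """Apply the score threshold and the non-triviality check together."""
    return (
        judge_score >= MIN_SCORE
        and count_distinct_steps(step_diffs) >= MIN_DISTINCT_STEPS
    )
```

For example, a trajectory scored 5 that only clicks the same button twice would still be rejected, because it contains a single distinct state change.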
The dataset is a collection of episodes:
Each episode in the dataset includes the following components:
1. Title of the task (e.g., "Buying socks")
2. Task goal: The goal driving the interaction, in user story format (As a..., I..., so that...)
3. Success definition: The definition of a successful episode, in acceptance test criteria format (Given..., When..., Then...)
4. Trajectory: A sequence of transitions detailing the agent's interaction process.
A trajectory represents a step-by-step interaction between the agent and the webpage. A trajectory is a sequence of transitions, with each transition consisting of:
1. Observation:
2. Action:
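As a rough mental model, the episode layout described above could be represented with the following types. This is only a sketch: the class and field names are assumptions rather than the released schema, and the observation and action payloads are kept opaque because their contents are not detailed here.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Transition:
    observation: Any  # payload format not specified here; kept opaque
    action: Any       # payload format not specified here; kept opaque


@dataclass
class Episode:
    title: str               # e.g. "Buying socks"
    task_goal: str           # user story: As a..., I..., so that...
    success_definition: str  # acceptance test: Given..., When..., Then...
    trajectory: List[Transition] = field(default_factory=list)


# Placeholder values only; they do not come from the dataset.
example = Episode(
    title="Buying socks",
    task_goal="As a ..., I ..., so that ...",
    success_definition="Given ..., When ..., Then ...",
    trajectory=[Transition(observation=None, action=None)],
)
```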
Want to explore the Pasta-1T dataset and join the Silverstream AI community? Fill out this form to request access. For collaboration inquiries or more information, feel free to contact us.