OpenAI Asks Contractors to Upload Real Past Work Samples to Benchmark AI Agents

Sebastian Hills
Image Credit: Photo-Illustration by WIRED Staff; Getty Images

OpenAI is asking third-party contractors to upload real examples of professional work they performed in current or previous jobs so the company can use those deliverables to evaluate and benchmark the performance of its next-generation AI agents, according to documents obtained by WIRED and first published on January 9, 2026.

The initiative, facilitated in part through the training data company Handshake AI, requires contractors to provide two components for each task:

  • The original task request (e.g., instructions from a manager or client)
  • The actual “experienced human deliverable” (the concrete output they produced, such as a Word document, PDF, PowerPoint, Excel file, image, or code repository)

OpenAI’s instructions emphasize that submissions must reflect “real, on-the-job work” that the contractor has “actually done,” with examples drawn from long-term or complex assignments (hours to days of effort). Contractors are directed to remove or anonymize any confidential, proprietary, personally identifiable, or material nonpublic information before uploading, and the company provides a ChatGPT-based “Superstar Scrubbing” tool to assist with redaction.


The goal is to create realistic, high-quality human baselines against which future AI agents can be measured for progress toward automating economically valuable office tasks, a key milestone on OpenAI’s path to AGI. The documents describe hiring contractors “across occupations” specifically to collect these real-world tasks modeled after their full-time roles.

Legal experts quoted in the WIRED report, including intellectual property attorney Evan Brown of Neal & McDevitt, warned that the approach carries significant risk. AI labs relying on contractors to self-redact sensitive material could face trade secret misappropriation claims. At the same time, contractors themselves might violate prior nondisclosure agreements or expose confidential information from former employers, even after scrubbing.

An OpenAI spokesperson declined to comment on the report when contacted by WIRED. No public statement from the company has contradicted the reporting as of January 11, 2026.

The revelation adds to ongoing scrutiny of how frontier AI labs source high-quality, real-world data for agent evaluation and improvement. Similar practices have been used by other companies, but OpenAI’s scale and emphasis on “actually done” professional outputs have raised particular concerns about confidentiality and liability.

The full WIRED investigation, based on internal OpenAI documents and Handshake AI records, remains the primary source. Coverage has since appeared in TechCrunch, DNYUZ, and various tech aggregators, all aligning with the original January 9 report. No retractions, corrections, or official denials have surfaced.
