Nuvepro - Task Intelligence for the Enterprise
OpenAI · Data Science · San Francisco

Data Scientist, Preparedness

Comp: $347K – $400K

Classified Tasks (24)

Automate 0% · Augment 88% · Human-Only 13%

Augment (21)

AI assists, human decides

Identify catastrophic risks related to frontier AI models.

analytical

Track catastrophic risks related to frontier AI models.

operational

Monitor evolving capabilities of frontier AI systems with attention to misuse risks.

technical

Predict evolving capabilities of frontier AI systems relevant to misuse risks.

analytical

Build mitigations that prevent extreme harms from AI systems.

technical

Evaluate mitigation systems, including classifiers and detection pipelines across domains (e.g., biosecurity, cybersecurity, and emerging risk areas).

analytical

Continuously improve mitigation systems to reduce harms.

technical

Structure rigorous analyses from ambiguous problem statements.

analytical

Translate analytical findings into actionable product and policy changes.

operational

Create mitigation intelligence and monitoring systems to detect issues early.

technical

Measure mitigation effectiveness over time.

analytical

Reduce over-blocking and under-blocking in mitigation systems.

technical

Diagnose false positives and false negatives through deep error analysis and root cause investigation.

analytical

Provide clear recommendations for mitigation adjustments.

communication

Build monitoring and measurement frameworks to track mitigation effectiveness across user segments and use cases.

technical

Identify trends in over-blocking versus under-blocking.

analytical

Quantify customer impact of mitigation decisions and failures.

analytical

Propose prioritized interventions to address mitigation issues.

analytical

Develop insights from customer feedback, complaints, and usage patterns to detect shifts in adversarial behavior and system failure modes.

analytical

Expand risk monitoring into new areas, including cybersecurity threats and model loss-of-control or sabotage scenarios.

operational

Communicate results to technical and executive stakeholders with decision-ready metrics and clear tradeoffs.

communication

Human-Only (3)

Requires human judgment

Prepare for catastrophic risks related to frontier AI models.

operational

Ensure concrete procedures, infrastructure, and partnerships exist to mitigate risks and safely handle development of powerful AI systems.

leadership

Partner with domain experts to expand and implement risk monitoring.

leadership

Job description

**About the Team**

The Preparedness team is an important part of the Safety Systems org at OpenAI, and is guided by OpenAI’s Preparedness Framework.

Frontier AI models have the potential to benefit all of humanity, but also pose increasingly severe risks. To ensure that AI promotes positive change, the Preparedness team helps us prepare for the development of increasingly capable frontier AI models. This team is tasked with identifying, tracking, and preparing for catastrophic risks related to frontier AI models.

The mission of the Preparedness team is to:

1. Closely monitor and predict the evolving capabilities of frontier AI systems, with an eye towards misuse risks whose impact could be catastrophic to our society.
2. Ensure we have concrete procedures, infrastructure, and partnerships to mitigate these risks and to safely handle the development of powerful AI systems.

Preparedness tightly connects capability assessment, evaluations, internal red teaming, and mitigations for frontier models, as well as overall coordination on AGI preparedness. This is fast-paced, exciting work that has far-reaching importance for the company and for society.

**About the Role**

We’re hiring a Data Scientist to help build, evaluate, and continuously improve mitigations that prevent extreme harms from AI systems. This role is for an experienced, highly autonomous individual contributor who can take ambiguous problem statements, structure rigorous analyses, and translate findings into actionable product and policy changes.

This position goes beyond “running evals.” You’ll help create mitigation intelligence and monitoring systems that enable OpenAI to detect issues early, measure effectiveness over time, and reduce both over-blocking (unnecessary friction) and under-blocking (missed harm).

### **What You’ll Do**

* Evaluate and improve mitigation systems, including classifiers and detection pipelines across domains (e.g., biosecurity, cybersecurity, and emerging risk areas).
* Diagnose false positives and false negatives with deep error analysis, root cause investigation, and clear recommendations for mitigation adjustments.
* Build monitoring and measurement frameworks to track mitigation effectiveness over time and across user segments and use cases.
* Identify trends in over-blocking vs. under-blocking, quantify customer impact, and propose prioritized interventions.
* Develop insights from customer feedback, complaints, and usage patterns to detect shifts in adversarial behavior and system failure modes.
* Expand risk monitoring into new areas, including cybersecurity threats and model loss-of-control or sabotage scenarios, in partnership with domain experts.
* Communicate results to technical and executive stakeholders with crisp narratives, decision-ready metrics, and clear tradeoffs.

**You might thrive in this role if you are:**

* An autonomous operator: you can take a problem statement and independently structure the analysis end-to-end.
* Strong at executive-ready communication: concise, clear, and outcome-oriented.
* Skilled in turning analysis into productable changes: you’re comfortable influencing across functions to drive mitigation improvements.

### **Qualifications**

* Significant experience in data science or applied analytics in high-stakes domains (e.g., security, trust & safety, abuse prevention, fraud, platform integrity, or reliability).
* Strong foundations in experimentation, causal thinking, and/or observational inference; ability to design robust measurement under imperfect data.
* Fluency in SQL and Python (or equivalent) for analysis, modeling, and building monitoring workflows.
* Experience building metrics, dashboards, and operational monitoring that meaningfully changes outcomes (not just reporting).
Source: OpenAI careers · scraped 2026-05-22