Technical Program Manager, Safeguards (Infrastructure & Evals) at Anthropic — task breakdown

Technical Program Manager, Safeguards (Infrastructure & Evals)

Classified Tasks (22)

Automate 0% · Augment 64% · Human-Only 36%

Augment (14)

AI assists, human decides

Own operational health and forward momentum of the Safeguards Infrastructure and Evals stack

operational

Execute and manage the post-mortem process for incidents

administrative

Track incidents across the organization, including those owned by partner teams such as Inference

operational

Ensure post-mortems are written and documented for all incidents

administrative

Close the loop on post-mortem action items by tracking and ensuring completion

administrative

Build and maintain tracking and reporting systems to monitor SLO compliance

technical

Surface SLO breaches and reliability issues to relevant stakeholders

communication

Maintain and update runbooks to keep them accurate and actionable

technical

Surface recent incidents and failures during ops reviews

communication

Analyze and bring visibility to reliability trends

analytical

Own program management for platform migrations and larger infrastructure projects

operational

Coordinate eval-platform improvement initiatives and related cross-team dependencies

operational

Track progress, identify blockers, and remove impediments for platform and infrastructure projects

operational

Partner with engineering leads to review and improve runbooks and incident procedures

operational

Human-Only (8)

Requires human judgment

Drive reliability by owning the incident-response process

operational

Define service-level objectives (SLOs) for safety-critical pipelines in partnership with Safeguards, Inference, and Cloud Inference teams

technical

Ensure incident ownership is unambiguous for critical areas (e.g., account-banning false positives, CSAM detection)

operational

Drive recurring Safeguards Engineering ops review cadence to keep the team informed and coordinated

leadership

Convene appropriate stakeholders and ensure the right people are in decision-making meetings

leadership

Triage operational incidents to assess severity and safety-criticality and prioritize response

operational

Judge and prioritize which issues require immediate action versus which can be deferred

leadership

Facilitate technical discussions with engineers to inform triage, prioritization, and reliability decisions

communication

Job description

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the Role

Safeguards Engineering builds and operates the infrastructure that keeps Anthropic's AI systems safe in production: the classifiers, detection pipelines, evaluation platforms, and monitoring systems that sit between our models and the real world. That infrastructure needs to be not just correct, but reliable: when a safety-critical pipeline goes down or degrades, the consequences can be serious, and they can be invisible until someone looks closely.

As a Technical Program Manager for Safeguards Infrastructure and Evals, you'll own the operational health and forward momentum of this stack. Your primary responsibility is driving reliability: owning the incident-response and post-mortem process, ensuring SLOs are defined and met in partnership with various teams, and making sure that when things go wrong, the right people know, the right actions get taken, and those actions actually get closed out. Alongside that ongoing operational rhythm, you'll coordinate the larger platform investments: migrations, eval-platform improvements, and the cross-team dependencies that connect them.

This role sits at the intersection of operations and program management. It requires genuine technical depth: you need to understand how these systems work well enough to triage effectively, judge what's actually safety-critical versus what can wait, and have informed conversations with the engineers building and maintaining them. But the core of the job is keeping the machine running well and the work moving.

What You'll Do

Own the Safeguards Engineering ops review - Drive the recurring cadence that keeps the team informed and coordinated: surfacing recent incidents and failures, bringing visibility to reliability trends, and making sure the right people are in the room when decisions need to be made. This is the heartbeat of how Safeguards Eng stays ahead of operational risk.

Drive incident tracking and post-mortem execution - When incidents happen (and in this space, they happen regularly), you'll make sure they get followed through properly. That means tracking incidents across the organization (including those owned by partner teams like Inference), ensuring post-mortems get written, and most critically, making sure the action items that come out of them actually get done. Closing the loop on post-mortem actions is one of the highest-leverage things this role does.

Establish and maintain SLOs with partner teams - Work with Safeguards Engineering teams and key partners, particularly Inference and Cloud Inference, to define service-level objectives for safety-critical pipelines. Then build the tracking and reporting that makes it possible to tell whether those SLOs are being met, and surface it when they're not.

Maintain runbook quality and incident-ownership clarity - Safety-critical systems need clear playbooks for when things go wrong. Partner with engineering leads to keep runbooks accurate, actionable, and up to date, and ensure that ownership of incidents (including for areas like account-banning false positives and CSAM detection) is unambiguous so that nothing falls through the cracks during an active incident.

Drive platform migrations and infrastructure projects - Own the program management for the larger

Source: Anthropic careers · scraped 2026-05-22
