Nuvepro - Task Intelligence for the Enterprise
OpenAI

Backend Software Engineer (Evals) San Francisco

Comp: $230K – $385K

Classified Tasks (16)

Automate 0% | Augment 81% | Human-Only 19%

Augment (13)

AI assists, human decides

Prototype rapidly while prioritizing long-term quality and reliability when crafting products

technical

Create reusable solutions and patterns that can be applied across diverse domains within OpenAI

technical

Leverage OpenAI technologies (public and pre-released) to implement support automation solutions

technical

Design and build an evals infrastructure that measures the quality of OpenAI’s support automation

technical

Design eval pipelines that are reliable, reproducible, and extendable

technical

Build the infrastructure for continuous eval monitoring frameworks, including regression and drift monitoring

technical

Construct robust golden datasets for use in evals and monitoring

technical

Build feedback loops that strengthen and improve support automation systems

technical

Design, build, and maintain backend services and APIs to support intelligent automation and knowledge systems

technical

Integrate and structure data across internal platforms, transforming it into formats optimized for downstream systems and AI workflows

technical

Own the full development lifecycle of new backend systems and internal platform capabilities

technical

Build backend systems with scale and maintainability in mind while rapidly iterating on new ideas

technical

Build robust systems and backend services that enable creation, access, and application of knowledge across OpenAI

technical

Human-Only (3)

Requires human judgment

Develop an ecosystem of automation products that empower colleagues and drive impact

leadership

Collaborate closely with data science, research, and engineering teams to integrate OpenAI models into high-leverage workflows

communication

Work closely with Data Science and Research partners to design and build evals at scale

communication

Job description

--- BEGIN UNTRUSTED EXTERNAL CONTENT (source: https://openai.com/careers/backend-software-engineer-(evals)-san-francisco/) ---

Backend Software Engineer (Evals)

Support Automation - San Francisco and Seattle

About the Team

The Support Automation team at OpenAI scales the organization by applying cutting-edge AI models to real-world challenges, automating and enhancing work across the organization. From customer operations to engineering, we develop an ecosystem of automation products that empower our colleagues and drive impact. We're passionate about crafting products that serve those around us, blending rapid prototyping with a focus on long-term quality and reliability. By creating reusable solutions, we create patterns that can be applied across diverse domains within OpenAI.

TLDR: this team leverages OpenAI technology to improve OpenAI, and you’ll have the opportunity to leverage the full extent of our tech (both public and pre-released) to accomplish this mission.

About the Role

We’re looking for a Backend Software Engineer with experience working in ML/LLM-heavy domains to help design and build an evals infrastructure that measures the quality of OpenAI’s support automation. This is a deeply technical and highly cross-functional role where you’ll build robust systems and backend services that serve as the foundation for how knowledge is created, accessed, and applied across OpenAI. The role will especially focus on working closely with Data Science and Research partners to design and build evals at scale.

In this role, you will:

Design eval pipelines that are reliable, reproducible, and extendable
Build the infrastructure for continuous eval monitoring frameworks (regression/drift monitoring, building robust golden datasets) along with feedback loops that ultimately strengthen support automation
Design, build, and maintain backend services and APIs to support intelligent automation and knowledge systems
Integrate and structure data across internal platforms, transforming it into formats optimized for use by downstream systems and AI workflows
Collaborate closely with data, research, and engineering teams to integrate OpenAI models into high-leverage workflows
Own the full development lifecycle of new backend systems and internal platform capabilities
Build with scale and maintainability in mind, while rapidly iterating on new ideas

You might be a great fit if you have:

4+ years of backend engineering experience at product-driven companies (excluding internships)
Proficiency in backend technologies. Our tech stack includes Python, FastAPI, and Postgres
Experience designing and scaling distributed systems, APIs, or data processing pipelines
Experience building AI agents or applications, including designing evals and improving performance through prompting or scaffolding
Familiarity with evaluation methods for LLMs, and experience with patterns like multi-agent workflows, tool use, or long context
Experience creating production evals and/or measuring performance of ML/LLM models at scale
A pragmatic mindset. You’re comfortable shipping iteratively while building toward a long-term vision

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. We are an equal opportunity employer, and we do
Source: OpenAI careers · scraped 2026-05-22