Member of Technical Staff - Post-Training and RL at xAI — task breakdown

Comp: $180,000 – $600,000

Classified Tasks (7)

Automate 0%, Augment 71%, Human-Only 29%

Augment (5)

AI assists, human decides

Work on post-training machine learning challenges, including developing and improving reward modeling systems

technical

Develop and implement preference optimization systems such as RLHF and DPO

technical

Apply reinforcement learning methods to improve models' reasoning, truthfulness, and real-world capabilities

technical

Perform hands-on engineering and research tasks to develop, train, and evaluate AI models

technical

Communicate technical knowledge and results to teammates concisely and accurately

communication

Human-Only (2)

Requires human judgment

Experiment with and push the boundaries of reinforcement learning and alignment methods

technical

Prioritize tasks to focus on highest-impact post-training and reinforcement learning projects

leadership

Job description

ABOUT xAI

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

ABOUT THE ROLE

You will work on the most critical post-training and reinforcement learning challenges at any given time, including reward modeling, preference optimization (RLHF/DPO), and RL for improving reasoning, truthfulness, and real-world capabilities. You will get clarity on your first project before an offer.

BASIC QUALIFICATIONS

You believe truth-seeking AI is the most important and challenging problem.

You are obsessed with building incredibly useful models through post-training and RL techniques.

You are a power user of AI models and eager to push the boundaries of what’s possible with reinforcement learning and alignment methods.

If you have previously worked on post-training or RLHF, or have trained models used by millions of people, that is a big plus, but relevant experience is not required.

You take pride in your work and thrive in meritocratic environments.

COMPENSATION AND BENEFITS

$180,000 - $600,000 USD

Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer.

For details on data processing, view our Recruitment Privacy Notice.

Source: xAI careers · scraped 2026-05-22
