Nuvepro - Task Intelligence for the Enterprise
Mistral · Research · Paris

Research Engineer, Data Infrastructure

Classified Tasks (19)

Automate 0% · Augment 84% · Human-Only 16%

Augment (16)

AI assists, human decides

Build specialized compute fabrics to power model development and training workloads

technical

Build specialized data fabrics to support large-scale model training and fine-tuning

technical

Design and build data lakes and metadata systems aimed at exabyte-scale architecture

technical

Develop a high-performance training platform for on-premise and cloud-native Kubernetes environments

technical

Migrate legacy scheduling systems to modern orchestration frameworks

technical

Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions

technical

Implement cloud-bursting capabilities to utilize global resources across clusters and regions

technical

Provision and configure distributed compute clusters to provide seamless researcher access to compute resources

operational

Design and implement decoupled control and data plane architectures for scalable systems

technical

Scale distributed compute and storage systems to meet operational capacity and performance goals

operational

Develop and maintain the internal training platform to enable model training and fine-tuning across Kubernetes and SLURM environments

technical

Implement production-grade data and training pipelines for model development workflows

technical

Implement and manage metadata and lineage systems to provide visibility and traceability across data and model pipelines

technical

Design and operate modern deployment workflows for cloud-native deployments to ensure platform scalability, reliability, and efficiency

operational

Enforce secure and governed data access controls for MLOps and research use cases

operational

Operate and manage large distributed compute fleets in production

operational

Human-Only (3)

Requires human judgment

Architect the backbone of the model training and fine-tuning infrastructure for frontier AI development

technical

Architect the transition to modern storage formats to handle large fine-tuning datasets and anticipated exabyte growth

technical

Participate in on-call rotations to support and troubleshoot critical training jobs

operational

Job description

About Mistral

At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life. We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise as well as personal needs. Our offerings include Le Chat, La Plateforme, Mistral Code and Mistral Compute - a suite that brings frontier intelligence to end-users.

We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed between France, USA, UK, Germany and Singapore. We are creative, low-ego and team-spirited. Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on https://mistral.ai/careers.

Mistral AI participates in the E-Verify program. By applying, you agree to our Applicant Privacy Policy.

Role Summary

Research Engineer, Data Infrastructure

The Data Infrastructure team at Mistral AI is architecting the backbone of our frontier model training and fine-tuning ecosystem. We are building the specialized compute and data fabrics required to power the development of world-class AI. Our vision is to operate some of the largest compute fleets in production and build data lakes and metadata systems with a roadmap toward exabyte-scale architecture.

We are currently in the process of building a high-performance training platform designed for massive scale across both on-premise and cloud-native Kubernetes environments. We are leading a strategic transition from legacy scheduling to modern orchestration.

With numerous clusters distributed across various regions, we are focussed on implementing sophisticated multi-cluster orchestration and cloud-bursting capabilities to better utilize our global resources and ensure our researchers have seamless access to compute wherever it resides. Our mission is to evolve our current systems into a platform that is as durable as it is flexible.

Location: Paris / London (hybrid) or remote EU/UK with one hub day per month.

About the Role

This role focuses on building and operating the next generation of data infrastructure at Mistral AI. You will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems designed for high performance and scalability. You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research. You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs.

In this role, you will:

Build & Scale: Help us reach our goal of operating massive distributed compute and storage systems.

Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions.

Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth.

Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes- and SLURM-based environments.

Metadata & Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity.

Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient.

You might thrive in this role if you:

Have 4
Source: Mistral careers · scraped 2026-05-22