Research, Post-Training Data

$350k - $475k • Remote • San Francisco

Posted 1mo ago

Job Location

San Francisco

Tech Stack

OpenAI, Mistral, Python, Go, PyTorch, TensorFlow, RLHF, JAX, LLMs, RLAIF, Preference Modeling, Reward Learning, Human-AI Collaboration

Remote Work Policy

Fully remote

About the job

Thinking Machines Lab is seeking researchers to bridge the gap between raw AI intelligence and useful, safe, and collaborative systems. This role focuses on post-training data research, combining human insight and machine learning techniques to capture and steer model behavior based on human preferences. You will be responsible for translating research ideas into actionable data through labeling and collection campaigns, understanding the science of data quality, and developing metrics to measure the impact of data and training interventions. The position also involves exploring new paradigms for human-AI interaction and scalable oversight, blending research, data operations, and technical implementation to advance human-centered AI systems. The role requires both fundamental research and practical engineering, making it ideal for individuals who enjoy deep theoretical exploration and hands-on experimentation.

Responsibilities

Design and execute data collection and synthesis strategies for post-training using human feedback, preference data, and synthetic examples.
Develop pipelines and frameworks for scalable, high-quality human labeling, model-assisted labeling, and synthetic data generation.
Research and model human preferences and behavior to improve model reasoning, truthfulness, and helpfulness.
Define, optimize, and interpret evaluations to measure the effectiveness of post-training interventions.
Design and evaluate metrics and benchmarks for data quality, alignment, and real-world impact.
Scale existing methodologies and develop new approaches for post-training.
Publish and present research, sharing code, datasets, and insights to advance the AI community.

Requirements

Strong engineering skills with the ability to contribute code and debug complex codebases.
Experience with data curation, human feedback, or synthetic data generation for large language models or similar systems.
Ability to design, run, and interpret experiments with scientific rigor.
Proficiency in Python and familiarity with at least one deep learning framework (e.g., PyTorch, TensorFlow, JAX).
Comfort with debugging distributed training and writing scalable code.
Bachelor’s degree or equivalent experience in Computer Science, Machine Learning, Physics, Mathematics, or a related discipline.
Clarity in written communication of complex technical concepts.
Strong grasp of probability, statistics, and ML fundamentals.
Experience managing or analyzing human data collection campaigns or large-scale annotation workflows.
Research or engineering contributions in alignment, data-centric AI, or human-AI collaboration.
Familiarity with synthetic data pipelines, active learning, or model-assisted labeling.
