Find your next LLM, RAG & AI Agent engineering role
316+ open roles · 10+ companies hiring
316 open positions
Senior LLM Engineer
Featured · OpenAI · San Francisco, CA
We are looking for a Senior LLM Engineer to fine-tune and deploy GPT-4 class models. You will work with PyTorch, Python, and Hugging Face to build training pipelines, and collaborate with the research team on prompt engineering and fine-tuning. Salary range: $180,000 - $260,000. This role is fully remote.
$180k - $260k
2d ago
RAG Engineer
Featured · Anthropic · Remote - US
Build retrieval augmented generation (RAG) systems powering Claude-based products. Experience with LangChain, LlamaIndex, Pinecone, and vector databases required. Strong Python and TypeScript skills a plus. $160,000 - $220,000. Work from anywhere.
$160k - $220k
3d ago
Generative AI Engineer
Featured · Perplexity · San Francisco, CA
Build next-generation search experiences powered by RAG and LLMs. Tech stack includes Python, TypeScript, OpenAI, Weaviate, and GraphQL APIs. $165,000 - $230,000. Hybrid - 3 days in office.
$165k - $230k
6d ago
Founding AI Engineer
Featured · Perplexity · Remote
Join as a founding engineer to build our core AI agent and RAG infrastructure from the ground up. LangChain, LlamaIndex, OpenAI, Claude, and TypeScript experience preferred. Fully remote. $190,000 - $270,000 + significant equity.
$190k - $270k
12d ago
Research Engineer, RL Engineering
Anthropic · San Francisco, CA | New York City, NY | Seattle, WA
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the role:
You want to build the cutting-edge systems that train AI models like Claude. You're excited to work at the frontier of machine learning, implementing and improving advanced techniques to create ever more capable, reliable and steerable AI. As an ML Systems Engineer on our Reinforcement Learning Engineering team, you'll be responsible for the critical algorithms and infrastructure that our researchers depend on to train models. Your work will directly enable breakthroughs in AI capabilities and safety. You'll focus obsessively on improving the performance, robustness, and usability of these systems so our research can progress as quickly as possible. You're energized by the challenge of supporting and empowering our research team in the mission to build beneficial AI systems.
Our finetuning researchers train our production Claude models, and internal research models, using RLHF and other related methods. Your job will be to build, maintain, and improve the algorithms and systems that these researchers use to train models. You’ll be responsible for improving the speed, reliability, and ease-of-use of these systems.
You may be a good fit if you:
- Have 4+ years of software engineering experience
- Like working on systems and tools that make other people more productive
- Are results-oriented, with a bias towards flexibility and impact
- Pick up slack, even if it goes outside your job description
- Enjoy pair programming (we love to pair!)
- Want to learn more about machine learning research
- Care about the societal impacts of your work
Strong candidates may also have experience with:
- High performance, large scale distributed systems
- Large scale LLM training
- Python
- Implementing LLM finetuning algorithms, such as RLHF
Representative projects:
- Profiling our reinforcement learning pipeline to find opportunities for improvement
- Building a system that regularly launches training jobs in a test environment so that we can quickly detect problems in the training pipeline
- Making changes to our finetuning systems so they work on new model architectures
- Building instrumentation to detect and eliminate Python GIL contention in our training code
- Diagnosing why training runs have started slowing down after some number of steps, and fixing it
- Implementing a stable, fast version of a new training algorithm proposed by a researcher
Deadline to apply: None. Applications will be reviewed on a rolling basis.
The annual compensation range for this role is listed below. For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.
Annual Salary: $500,000 — $850,000 USD
Logistics
- Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience
- Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience
- Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position
- Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
- Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.
We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.
Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links—visit anthropic.com/careers directly for confirmed position openings.
How we're different
We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.
The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.
Come work with us!
Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.
Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.
13h ago
Systems Research Engineer Intern - GPU Programming (Fall 2026)
Together AI · San Francisco
About The Role
As a Systems Research Engineer Intern specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems. Collaborating with the hardware and software teams, you will contribute to the co-design of efficient GPU architectures and programming models, leveraging your expertise in GPU programming and parallel computing. Your research skills will be vital in staying up-to-date with the latest advancements in GPU programming techniques, ensuring that our AI infrastructure remains at the forefront of innovation.
Responsibilities
- Optimize and fine-tune GPU code to achieve better performance and scalability
- Collaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems
- Stay up-to-date with the latest advancements in GPU programming techniques and technologies
Requirements
- Strong background in GPU programming and parallel computing, such as CUDA and/or Triton
- Knowledge of ML/AI applications and models
- Knowledge of performance profiling and optimization tools for GPU programming
- Excellent problem-solving and analytical skills
Internship Program Details
Our fall internship program spans 12 to 16 weeks, during which you’ll have the opportunity to work with industry-leading engineers building a cloud from the ground up and possibly contribute to influential open source projects. Our internship dates are September 14th to December 18th.
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Mamba, FlexGen, Petals, Mixture of Agents, and RedPajama.
Compensation
We offer competitive compensation, housing stipends, and other competitive benefits. The estimated US hourly rate for this role is $58 to $63. Our hourly rates are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy
17h ago
Systems Research Engineer, GPU Programming
Together AI · San Francisco
About the Role
As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems. Collaborating with the hardware and software teams, you will contribute to the co-design of efficient GPU architectures and programming models, leveraging your expertise in GPU programming and parallel computing. Your research skills will be vital in staying up-to-date with the latest advancements in GPU programming techniques, ensuring that our AI infrastructure remains at the forefront of innovation.
Requirements
- Strong background in GPU programming and parallel computing, such as CUDA and/or Triton
- Knowledge of ML/AI applications and models
- Knowledge of performance profiling and optimization tools for GPU programming
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or equivalent practical experience
Responsibilities
- Optimize and fine-tune GPU code to achieve better performance and scalability
- Collaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems
- Stay up-to-date with the latest advancements in GPU programming techniques and technologies
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.
Compensation
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy
17h ago
Staff Machine Learning Engineer, Voice AI
Together AI · San Francisco
About the Role
Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability.
We're looking for a Staff ML Engineer to drive the model serving layer for voice workloads. You'll work hands-on with inference engines like TRT-LLM and SGLang to optimize how we serve models like Whisper, Parakeet, Orpheus, and Kokoro — pushing latency and throughput to the frontier. You'll profile GPU utilization, design batching strategies for streaming audio, and ensure new model architectures can go from research to production quickly.
This is a foundational hire on a small, high-impact team. Voice inference has unique challenges — streaming audio, tokenization, real-time latency budgets — that require dedicated ML engineering focus. You'll shape how Together serves voice models as the industry moves from pipeline architectures (ASR → LLM → TTS) toward end-to-end speech-to-speech.
- Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech.
- Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference.
- Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure.
- Build quality evaluation frameworks that guide model selection for customers and inform the roadmap.
- Join a small, early-stage team with outsized impact on a fast-growing product area.
Responsibilities
- Own the voice inference roadmap end-to-end — define and execute the technical strategy for optimizing STT, TTS, and speech-to-speech models across Together's infrastructure, with a clear-eyed view of where the field is heading and how to position the platform ahead of it.
- Drive best-in-class inference performance — architect and implement systems targeting leading TTFB, throughput, and GPU utilization for voice workloads; set the performance bar others in the industry measure against, not just catch up to.
- Lead productionization of voice models at scale — design the serving architecture for serverless and dedicated endpoints, including batching strategies, streaming inference pipelines, and memory management tailored to real-time audio; own reliability and latency SLAs.
- Build the voice evaluation platform — design a rigorous, extensible evaluation framework covering WER across accents, languages, and noise conditions for STT; naturalness, latency, and pronunciation fidelity for TTS; establish the internal benchmark methodology that informs model selection and roadmap decisions.
- Shape the architecture for next-generation model support — anticipate and enable emerging model paradigms — audio-native LLMs, codec-based architectures (SNAC, Encodec), and end-to-end speech-to-speech systems — before they're mainstream, not after.
- Serve as the technical DRI for model partner integrations — lead deep collaboration with partners such as Cartesia, Deepgram, and Rime; own the full lifecycle from integration to optimization to ongoing performance accountability.
- Diagnose and resolve the hardest performance problems in the stack — conduct systematic profiling and root-cause analysis from GPU kernel behavior to framework-level bottlenecks; drive shipped improvements with documented, measurable impact.
- Influence platform architecture across the organization — partner with platform engineering leadership to ensure the serving layer is built for the latency and reliability demands of real-time voice APIs; your technical decisions should raise the ceiling for the whole team.
- Define and scale voice fine-tuning capabilities — lead the technical direction for enabling customers to fine-tune STT and TTS models on Together's infrastructure, establishing the primitives for differentiated voice experiences.
- Lay technical foundations for a category-defining product surface — architect systems with enough foresight that they support multiple new voice products with minimal rework; think in terms of platforms, not point solutions.
Requirements
- 8+ years of ML engineering experience, with a demonstrated focus on model serving, inference optimization, or ML infrastructure at production scale — including systems you've owned from design through live traffic.
- Deep, practical expertise in LLM serving engines (vLLM, SGLang, TensorRT-LLM, or equivalent) — you've modified engine internals, debugged edge cases under load, and contributed improvements back; you don't stop at the API surface.
- Expert-level Python and PyTorch proficiency, with a strong command of GPU optimization — CUDA kernels, memory hierarchies, profiling toolchains — and a track record of turning that knowledge into shipped latency or throughput wins.
- Proven system design judgment — you've made architectural decisions that held up at scale and influenced how a team or platform evolved; you can articulate the tradeoffs you made and why.
- Strong technical leadership — you operate with high autonomy, define the right problems before solving them, and raise the bar for engineering quality around you without requiring process overhead.
- Sharp product intuition for developer tooling — you understand what voice application developers actually need to ship great products, and you let that shape your technical priorities, not just the other way around.
- Proven ability to move fast in ambiguous environments — you've thrived on early-stage or platform teams where scope is wide, ownership is deep, and the roadmap you build is the one you execute.
- Strong foundation in speech and audio ML (ASR/TTS architectures, audio signal processing) — directly relevant experience is strongly preferred; exceptional ML engineering fundamentals with genuine curiosity about the domain is also considered.
- Familiarity with audio codec and tokenization schemes (SNAC, Encodec, DAC) is a meaningful plus at this level.
- Experience training or fine-tuning speech models at scale is a significant advantage.
- Bachelor's or Master's in Computer Science, Electrical Engineering, or related field — or equivalent depth demonstrated through your work.
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.
Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $220,000 - $280,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy
17h ago
Staff Engineer, Distributed Storage and HPC & AI Infrastructure
Together AI · San Francisco
About the Role
In this role, you will design and deliver multi-petabyte storage systems purpose-built for the world’s largest AI training and inference workloads. You’ll architect high-performance parallel filesystems and object stores, evaluate and integrate cutting-edge technologies such as WekaFS, Ceph, and Lustre, and drive aggressive cost optimization, routinely achieving 30-50% savings through intelligent tiering, lifecycle policies, capacity forecasting, and right-sizing. You will also build Kubernetes-native storage operators and self-service platforms that provide automated provisioning, strict multi-tenancy, performance isolation, and quota enforcement at cluster scale. Day-to-day, you’ll optimize end-to-end data paths for 10-50 GB/s per node, design multi-tier caching architectures, implement intelligent prefetching and model-weight distribution, and tune parallel filesystems for AI workloads.
Responsibilities
- Design multi-petabyte AI/ML storage systems; integrate WekaFS, Ceph, etc.; lead capacity planning and cost optimization (30-50% savings via tiering, lifecycle policies, right-sizing).
- Design/optimize RDMA, InfiniBand, 400GbE networks; tune for max throughput/min latency; implement NVMe-oF/iSCSI; troubleshoot bottlenecks; optimize TCP/IP for storage.
- Build Kubernetes storage operators/controllers; enable automated provisioning, self-service abstractions, multi-tenant isolation, quotas; create reusable Helm/Terraform patterns.
- Deliver 10-50 GB/s per GPU node; optimize caching (weights/datasets/checkpoints), parallel filesystems, and data paths; troubleshoot with profiling tools; scale to thousands of nodes.
- Build multi-tier caches (local NVMe, distributed, object); optimize data locality and model-weight distribution; implement smart prefetching/eviction.
- Implement monitoring, alerting, SLOs; design DR/backups with runbooks; run chaos engineering; ensure 99.9%+ uptime via proactive/automated remediation.
- Partner with ML/SRE teams; mentor on storage best practices; contribute to open-source; write docs, postmortems, and public learnings.
Requirements
- 8+ years in storage engineering with 3+ years managing distributed storage at multi-petabyte scale
- Proven track record deploying and operating high-performance storage for GPU/HPC clusters
- Deep Kubernetes and cloud-native storage experience in production environments
- Strong coding skills in Go and Python with demonstrated ability to build production-grade tools
- BS/MS in Computer Science, Engineering, or equivalent practical experience
- History of technical leadership: designing systems that significantly improved performance (>3x), reliability (99.9%+ uptime), or cost efficiency
- Distributed Storage Systems: Deep expertise in WekaFS, Lustre, GPFS, BeeGFS, or similar parallel filesystems at multi-petabyte scale
- Object Storage: Production experience with S3, MinIO, Ceph, or R2 including performance optimization and cost management
- Kubernetes Storage: CSI drivers, StatefulSets, PersistentVolumes, storage operators, and custom controllers
- Storage optimization for GPU workloads, RDMA/InfiniBand networking, parallel filesystem optimization (100+ GB/s aggregate cluster throughput)
- Programming: Go and Python for automation, operators, and tooling
- Infrastructure as Code: Terraform, Ansible, Helm, GitOps (ArgoCD)
- Linux Storage Stack: Advanced knowledge of filesystems (ext4, xfs), LVM, NVMe optimization, RAID configurations
- Observability: Prometheus, Grafana, Thanos architecture and operations
Nice to Have Skills
- GPU Direct Storage (GDS), NVMe-oF, storage networking (100GbE/400GbE)
- ML/AI storage patterns (model weights, checkpointing, dataset caching)
- Kubernetes operator development (controller-runtime, kubebuilder)
- Storage snapshots, cloning, and thin provisioning
- Backup and disaster recovery (Velero, Restic, cross-region replication)
- Storage encryption (at-rest and in-transit), security and compliance
- Storage benchmarking and profiling tools (fio, iperf3, iostat, blktrace)
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.
Compensation
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $250,000 - $300,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy
17h ago
Senior Machine Learning Engineer, Voice AI
Together AI· San Francisco
About the Role Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability. We're looking for a Senior ML Engineer to drive the model serving layer for voice workloads. You'll work hands-on with inference engines like TRT-LLM and SGLang to optimize how we serve models like Whisper, Parakeet, Orpheus, and Kokoro — pushing latency and throughput to the frontier. You'll profile GPU utilization, design batching strategies for streaming audio, and ensure new model architectures can go from research to production quickly. This is a foundational hire on a small, high-impact team. Voice inference has unique challenges — streaming audio, tokenization, real-time latency budgets — that require dedicated ML engineering focus. You'll shape how Together serves voice models as the industry moves from pipeline architectures (ASR → LLM → TTS) toward end-to-end speech-to-speech. Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech. Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference. Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure. Build quality evaluation frameworks that guide model selection for customers and inform the roadmap. Join a small, early-stage team with outsized impact on a fast-growing product area. Responsibilities Optimize inference performance for voice models (STT, TTS, speech-to-speech) — targeting best-in-class TTFB, throughput, and GPU utilization across our curated model set. Productionize voice models on serverless and dedicated endpoints, including batching strategies, streaming inference, and memory management tailored to audio workloads. 
Build and maintain a voice model evaluation framework — measuring WER across accents, languages, and noise conditions for STT; naturalness, latency, and pronunciation accuracy for TTS. Enable new model architectures in our serving stack as the field evolves, including audio-native LLMs, codec-based models (SNAC), and speech-to-speech systems. Collaborate with model partners to integrate and optimize their models (Cartesia, Deepgram, Rime, and others) running on Together's infrastructure. Profile and debug performance across the full inference stack — from GPU kernels to framework-level bottlenecks — and ship measurable improvements. Work with the platform engineering side of the team to ensure the serving layer meets the latency and reliability requirements of real-time voice APIs. Contribute to voice model fine-tuning capabilities (STT and TTS) as we enable customers to build differentiated voice experiences on Together. Lay the groundwork for multiple new products down the line. Requirements 5+ years of experience in ML engineering, with a focus on model serving, inference optimization, or ML infrastructure. Hands-on experience with LLM serving engines (vLLM, SGLang, TensorRT-LLM, or similar) — comfortable reading and modifying engine internals, not just using APIs. Strong proficiency in Python and PyTorch; experience with GPU profiling and optimization (CUDA, memory management, kernel-level debugging). Track record of shipping ML systems to production with measurable performance improvements. Strong product sense — you think about what developers building voice apps actually need, not just what's technically interesting. Comfort working on a small, early-stage team where you'll wear multiple hats and move fast. Experience with speech and audio ML (ASR, TTS architectures, audio signal processing) is a strong plus but not required — you can learn this quickly if you have strong ML engineering fundamentals. 
Familiarity with audio codecs and tokenization schemes (SNAC, Encodec, DAC) is a plus. Experience training or fine-tuning speech models is a plus. Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field, or equivalent practical experience About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $200,000 - $260,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Research Engineer, Frontier Speculative Decoding
Together AI· San Francisco, New York City
About the Role Together AI is building the Inference Platform that powers the world's most advanced generative AI models. Your role will be a critical bridge between cutting-edge research and real-world applications, focusing on translating our internal model training research into production-ready deployments for our customers. This involves a deep commitment to data-centric development, meticulous hyperparameter tuning, and rigorous checkpoint evaluation before models ever hit production. This role will involve understanding customer-specific needs and fine-tuning models on our internal data recipe and their proprietary data. The goal is to transform general-purpose models into highly performant, specialized tools that solve real business problems. You will not be training foundation models from scratch but rather focusing on creating highly efficient, specialized models by working with dedicated GPU clusters. Responsibilities Design and iterate on novel speculator algorithms, combining architectural innovations with carefully curated data to push the frontier of accuracy–efficiency tradeoffs. Be the critical link between raw data and a production-ready model, seeing your work directly impact our customers' success. Work in a fast-paced, high-impact role at the cutting edge of generative AI. Collaborate with a team of experts dedicated to solving real-world, high-performance challenges. You'll collaborate directly with customers to understand their needs, and work closely with our core inference and Applied ML research teams to integrate your work into the production platform. A culture of deep technical ownership where you are empowered to take on and solve challenging problems. Requirements A genuine love for data curation and processing, with meticulous attention to detail. You believe that great models start with great data.
Demonstrated ability to perform effective hyperparameter searches and understand the trade-offs involved in tuning models for specific tasks. Experience working with and building on top of existing training codebases. You are comfortable navigating complex code and contributing to its improvement. Strong attention-to-detail in evaluating model checkpoints to ensure they meet strict quality, performance, and reliability standards. Experience with Python and PyTorch. Familiarity with SLURM and/or Kubernetes clusters and experience submitting and managing jobs in a high-performance computing environment. Familiarity with modern LLMs and generative models. Basic understanding of distributed training frameworks (e.g., FSDP, DeepSpeed). Bachelor’s, Master’s degree, or Ph.D. in Computer Science, Computer Engineering, or a related field, or equivalent practical experience. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, ATLAS, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $190,000 - $270,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. 
Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Research Engineer, Core ML
Together AI· San Francisco
About the Role This is a research engineering role with direct production impact. You won’t be publishing ideas in isolation; you will translate new RL algorithms, scheduling methods, and inference optimizations into production-grade systems that power Together’s API. Success in this role means shipping measurable improvements in latency, throughput, cost, and model quality at scale. We are looking for researchers who enjoy owning systems end-to-end and turning frontier ideas into robust infrastructure. The Core ML (Turbo) team at Together AI sits at the intersection of efficient inference (algorithms, architectures, engines) and post‑training / RL systems. We build and operate the systems behind Together’s API, including high‑performance inference and RL/post‑training engines that can run at production scale. Our mandate is to push the frontier of efficient inference and RL‑driven training: making models dramatically faster and cheaper to run, while improving their capabilities through RL‑based post‑training (e.g., GRPO‑style objectives). This work lives at the interface of algorithms and systems: asynchronous RL, rollout collection, scheduling, and batching all interact with engine design, creating many knobs to tune across the RL algorithm, training loop, and inference stack. Much of the job is modifying production inference systems (for example, SGLang‑ or vLLM‑style serving stacks and speculative decoding systems such as ATLAS), grounded in a strong understanding of post‑training and inference theory, rather than purely theoretical algorithm design. You’ll work across the stack, from RL algorithms and training engines to kernels and serving systems, to build and improve frontier models via RL pipelines. People on this team are often spiky: some are more RL‑first, some are more systems‑first. Depth in one of these areas plus appetite to collaborate across (and grow toward more full‑stack ownership over time) is ideal.
Responsibilities Advance inference efficiency end‑to‑end Design and prototype algorithms, architectures, and scheduling strategies for low‑latency, high‑throughput inference. Implement and maintain changes in high‑performance inference engines (e.g., SGLang‑ or vLLM‑style systems and Together’s inference stack), including kernel backends, speculative decoding (e.g., ATLAS), quantization, etc. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Unify inference with RL / post‑training Design and operate RL and post‑training pipelines (e.g., RLHF, RLAIF, GRPO, DPO‑style methods, reward modeling) where 90+% of the cost is inference, jointly optimizing algorithms and systems. Make RL and post‑training workloads more efficient with inference‑aware training loops—for example, async RL rollouts, speculative decoding, and other techniques that make large‑scale rollout collection and evaluation cheaper. Use these pipelines to train, evaluate, and iterate on frontier models on top of our inference stack. Co‑design algorithms and infrastructure so that objectives, rollout collection, and evaluation are tightly coupled to efficient inference, and quickly identify bottlenecks across the training engine, inference engine, data pipeline, and user‑facing layers. Run ablations and scale‑up experiments to understand trade‑offs between model quality, latency, throughput, and cost, and feed these insights back into model, RL, and system design. Own critical systems at production scale Profile, debug, and optimize inference and post-training services under real production workloads, taking research ideas all the way to stable, measurable improvements in deployed systems. Drive roadmap items that require real engine modification—changing kernels, memory layouts, scheduling logic, and APIs as needed. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. 
Provide technical leadership (Staff level) Set technical direction for cross‑team efforts at the intersection of inference, RL, and post‑training. Mentor other engineers and researchers on full‑stack ML systems work and performance engineering. Requirements We don’t expect anyone to check every box below. People on this team typically have deep expertise in one or more areas and enough breadth (or interest) to work effectively across the stack. The closer you are to full‑stack (inference + post‑training/RL + systems), the stronger the fit—but being spiky in one area and eager to grow is absolutely okay. You might be a good fit if you: Have a bias toward implementation and shipping —you are excited to modify real engines and services, not just prototype in research code. Have strong expertise in at least one of the following, and are excited to collaborate across (and grow into) the others: Systems‑first profile: Large‑scale inference systems (e.g., SGLang, vLLM, FasterTransformer, TensorRT, custom engines, or similar), GPU performance, distributed serving. RL‑first profile: RL / post‑training for LLMs or large models (e.g., GRPO, RLHF/RLAIF, DPO‑like methods, reward modeling), and using these to train or fine‑tune real models. Model architecture design for Transformers or other large neural nets. Distributed systems / high‑performance computing for ML. Are comfortable working from algorithms to engines: Strong coding ability in Python Experience profiling and optimizing performance across GPU, networking, and memory layers. Able to take a new sampling method, scheduler, or RL update and turn it into a production‑grade implementation in the engine and/or training stack. Have a solid research foundation in your area(s) of depth: Track record of impactful work in ML systems, RL, or large‑scale model training (papers, open‑source projects, or production systems). 
Can read new RL / post‑training papers, understand their implications on the stack, and design minimal, correct changes in the right layer (training engine vs. inference engine vs. data / API). Operate well as a full‑stack problem solver: You naturally ask: “Where in the stack is this really bottlenecked?” You enjoy collaborating with infra, research, and product teams, and you care about both scientific quality and user‑visible wins. Minimum qualifications 3+ years of experience working on ML systems, large‑scale model training, inference, or adjacent areas (or equivalent experience via research / open source). Advanced degree in Computer Science, EE, or a related field, or equivalent practical experience. Demonstrated experience owning complex technical projects end‑to‑end. If you’re excited about the role and strong in some of these areas, we encourage you to apply even if you don’t meet every single requirement. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $200,000 - $280,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. 
Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Machine Learning, Platform Engineer
Together AI· San Francisco
About the Role Our team focuses on enabling custom models and dedicated inference on Together. We are responsible for building a container platform, optimizing autoscaling, minimizing cold starts, achieving the best end-to-end model performance, and providing a best-in-class developer experience with great tooling. We often focus on video or audio generation across the stack: CUDA kernels, PyTorch optimization, inference engines, container orchestration, queueing theory, etc. An ideal candidate will be great at profiling/optimization but know the word Kubernetes, or be intimately familiar with multi-cluster scheduling and have some sense of ML bottlenecks. Responsibilities New hires may work on multi-cluster orchestration, portfolio optimization, predictive autoscaling, control planes, model bring-up, model optimization, APIs for managing deployments, inference worker SDKs, and CLI tools. Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure. Partner with product teams to understand functional requirements and deliver solutions that meet business needs. Write clear, well-tested, and maintainable software and IaC for both new and existing systems. Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance. Requirements 5+ years of demonstrated experience in building large-scale, fault-tolerant, distributed systems.
Experience running serverless inference platforms, doing model bring-up on short notice, being on call, or running a cloud provider is a very big plus Good taste and ability to thoughtfully discuss how what you’ve built has failed over time Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources Excellent understanding of low level operating systems concepts including concurrency, networking and storage, performance and scale Expert-level programmer in one or more of Python, Golang, Rust, C++, or Haskell Proficiency in writing and maintaining Infrastructure as Code (IaC) using tools like Terraform Experience with Kubernetes internals or other container orchestration systems Sound judgement for when to use and when to not use LLMs for code Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience Writing-heavy roles or companies are a plus About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $250,000 + equity + benefits. Our salary ranges are determined by location, level and role. 
Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Machine Learning Engineer - Inference
Together AI· San Francisco
About the Role Together AI is seeking a Machine Learning Engineer to join our Inference Engine team, focusing on optimizing and enhancing the performance of our AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and effectively at scale. If you are passionate about AI inference, PyTorch, and developing high-performance systems, we want to hear from you. This position offers the chance to collaborate closely with AI researchers and engineers to create cutting-edge AI solutions. Join us in shaping the future at Together AI! Responsibilities Design and build the production systems that power the Together AI inference engine, enabling reliability and performance at scale. Develop and optimize runtime inference services for large-scale AI applications. Collaborate with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world. Conduct design and code reviews to ensure high standards of quality. Create services, tools, and developer documentation to support the inference engine. Implement robust and fault-tolerant systems for data ingestion and processing. Requirements 3+ years of experience writing high-performance, well-tested, production-quality code. Proficiency with Python and PyTorch. Demonstrated experience in building high-performance libraries and tooling. Excellent understanding of low-level operating systems concepts including multi-threading, memory management, networking, storage, performance, and scale. Preferred: Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, and Optimum. Preferred: Knowledge of AI inference techniques such as speculative decoding. Preferred: Knowledge of CUDA/Triton programming. Nice to have: Knowledge of Rust, Cython and compilers. About Together AI Together AI is a research-driven artificial intelligence company.
We believe open and transparent AI systems will drive innovation and create the best outcomes for society. Together, we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI. Our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey to build the next-generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunities to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Machine Learning Engineer
Together AI· San Francisco
About the Role Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models from simple models up to the largest LLMs. Requirements 5+ years experience writing high-performance, well-tested, production-quality code Bachelor’s degree in computer science or equivalent industry experience Familiar with the LLM inference ecosystem, including frameworks and engines (e.g. vLLM, SGLang, TRT, ...) Demonstrated experience in building large-scale, fault-tolerant, distributed systems like storage, search, and computation Expert-level programmer in one or more of Python, Go, Rust, or C/C++ Experience implementing runtime inference services at scale or similar Responsibilities Design and build the production systems that power the Together Cloud inference and fine-tuning APIs, enabling reliability and performance at scale Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world Analyze and improve efficiency, scalability, and stability of various system resources Conduct design and code reviews Create services, tools & developer documentation Create testing frameworks for robustness and fault-tolerance Participate in an on-call rotation to respond to critical incidents as needed About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama.
We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $220,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
LLM Inference Frameworks and Optimization Engineer
Together AI· San Francisco, Singapore, Amsterdam
About the Role At Together.ai, we are building state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize inference frameworks, algorithms, and infrastructure, pushing the boundaries of performance, scalability, and cost-efficiency. We are seeking an Inference Frameworks and Optimization Engineer to design, develop, and optimize distributed inference engines that support multimodal and language models at scale. This role will focus on low-latency, high-throughput inference, GPU/accelerator optimizations, and software-hardware co-design, ensuring efficient large-scale deployment of LLMs and vision models. This role offers a unique opportunity to shape the future of LLM inference infrastructure, ensuring scalable, high-performance AI deployment across a diverse range of applications. If you're passionate about pushing the boundaries of AI inference, we’d love to hear from you! Responsibilities Inference Framework Development and Optimization Design and develop a fault-tolerant, high-concurrency distributed inference engine for text, image, and multimodal generation models. Implement and optimize distributed inference strategies, including Mixture of Experts (MoE) parallelism, tensor parallelism, and pipeline parallelism for high-performance serving. Apply CUDA graph optimizations, TensorRT/TRT-LLM graph optimizations, PyTorch-based compilation (torch.compile), and speculative decoding to enhance efficiency and scalability. Software-Hardware Co-Design and AI Infrastructure Collaborate with hardware teams on performance bottleneck analysis and co-optimize inference performance for GPUs, TPUs, or custom accelerators. Work closely with AI researchers and infrastructure engineers to develop efficient model execution plans and optimize E2E model serving pipelines.
Requirements Must-Have: Experience: 3+ years of experience in deep learning inference frameworks, distributed systems, or high-performance computing. Technical Skills: Familiar with at least one LLM inference framework (e.g., TensorRT-LLM, vLLM, SGLang, TGI (Text Generation Inference)). Background knowledge and experience in at least one of the following: GPU programming (CUDA/Triton/TensorRT), compilers, model quantization, and GPU cluster scheduling. Deep understanding of KV cache systems like Mooncake, PagedAttention, or custom in-house variants. Programming: Proficient in Python and C++/CUDA for high-performance deep learning inference. Optimization Techniques: Deep understanding of Transformer architectures and LLM/VLM/Diffusion model optimization. Knowledge of inference optimization techniques, such as workload scheduling, CUDA graphs, compilation, and efficient kernels. Soft Skills: Strong analytical problem-solving skills with a performance-driven mindset. Excellent collaboration and communication skills across teams. Nice-to-Have: Experience in developing software systems for large-scale data center networks with RDMA/RoCE. Familiar with distributed filesystems (e.g., 3FS, HDFS, Ceph). Familiar with open source distributed scheduling/orchestration frameworks, such as Kubernetes (K8S). Contributions to open-source deep learning inference projects. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama.
We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
AI Infrastructure Engineer (SRE) Amsterdam
Together AI· Amsterdam
As an AI Infrastructure Engineer (SRE) at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of a pragmatic operator and a software engineer that applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase. You specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability and scalability, with varied interests in algorithms and distributed systems. Requirements 7+ years of professional SRE or related experience Bachelor's degree in Computer Science or a related field or equivalent work experience Expert knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes Proficiency in programming/scripting languages Direct experience in monitoring and observability practices Advanced knowledge of cloud services Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts Responsibilities Be on an on-call (PagerDuty) rotation to respond to incidents that impact availability Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users Build monitoring systems to ensure the highest quality service for our customers Design and implement operational processes (such as deployments and upgrades) Debug production issues across all services and levels of the stack Identify improvements for the product architecture from the reliability, performance and availability perspectives Plan the growth of Together AI’s infrastructure About Together AI Together AI is a research-driven artificial intelligence company. 
We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
AI Infrastructure Engineer
Together AI· San Francisco
As an AI Infrastructure Engineer at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of a pragmatic operator and a software engineer that applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase. You specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability and scalability, with varied interests in algorithms and distributed systems. Responsibilities Participate in an on-call rotation (PagerDuty) to respond to production incidents Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users Build monitoring systems to ensure the highest quality service for our customers Design and implement operational processes (such as deployments and upgrades) Debug production issues across all services and levels of the stack Identify improvements for the product architecture from the reliability, performance and availability perspectives Plan the growth of Together AI's infrastructure Requirements 5+ years of professional AI Infra or related experience Bachelor's degree in Computer Science or a related field or equivalent work experience Knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes Proficiency in programming/scripting languages Direct experience in monitoring and observability practices Knowledge of cloud services Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. 
We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $190,000 - $270,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
17h ago
Systems Research Engineer Intern - GPU Programming (Fall 2026)
Together AI· San Francisco
About The Role As a Systems Research Engineer Intern specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems. Collaborating with the hardware and software teams, you will contribute to the co-design of efficient GPU architectures and programming models, leveraging your expertise in GPU programming and parallel computing. Your research skills will be vital in staying up-to-date with the latest advancements in GPU programming techniques, ensuring that our AI infrastructure remains at the forefront of innovation. Responsibilities Optimize and fine-tune GPU code to achieve better performance and scalability Collaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems Stay up-to-date with the latest advancements in GPU programming techniques and technologies Requirements Strong background in GPU programming and parallel computing, such as CUDA and/or Triton. Knowledge of ML/AI applications and models Knowledge of performance profiling and optimization tools for GPU programming Excellent problem-solving and analytical skills Internship Program Details Our fall internship program spans over 12 to 16 weeks where you’ll have the opportunity to work with industry-leading engineers building a cloud from the ground up and possibly contribute to influential open source projects. Our internship dates are September 14th to December 18th. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, algorithms, and models. 
We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Mamba, FlexGen, Petals, Mixture of Agents, and RedPajama. Compensation We offer competitive compensation, housing stipends, and other competitive benefits. The estimated US hourly rate for this role is $58 to $63. Our hourly rates are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Systems Research Engineer, GPU Programming
Together AI· San Francisco
About the Role As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems. Collaborating with the hardware and software teams, you will contribute to the co-design of efficient GPU architectures and programming models, leveraging your expertise in GPU programming and parallel computing. Your research skills will be vital in staying up-to-date with the latest advancements in GPU programming techniques, ensuring that our AI infrastructure remains at the forefront of innovation. Requirements Strong background in GPU programming and parallel computing, such as CUDA and/or Triton. Knowledge of ML/AI applications and models Knowledge of performance profiling and optimization tools for GPU programming Excellent problem-solving and analytical skills Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or equivalent practical experiences Responsibilities Optimize and fine-tune GPU code to achieve better performance and scalability Collaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems Stay up-to-date with the latest advancements in GPU programming techniques and technologies About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. 
We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Staff Machine Learning Engineer, Voice AI
Together AI· San Francisco
About the Role Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability. We're looking for a Staff ML Engineer to drive the model serving layer for voice workloads. You'll work hands-on with inference engines like TRT-LLM and SGLang to optimize how we serve models like Whisper, Parakeet, Orpheus, and Kokoro — pushing latency and throughput to the frontier. You'll profile GPU utilization, design batching strategies for streaming audio, and ensure new model architectures can go from research to production quickly. This is a foundational hire on a small, high-impact team. Voice inference has unique challenges — streaming audio, tokenization, real-time latency budgets — that require dedicated ML engineering focus. You'll shape how Together serves voice models as the industry moves from pipeline architectures (ASR → LLM → TTS) toward end-to-end speech-to-speech. Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech. Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference. Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure. Build quality evaluation frameworks that guide model selection for customers and inform the roadmap. Join a small, early-stage team with outsized impact on a fast-growing product area. Responsibilities Own the voice inference roadmap end-to-end — define and execute the technical strategy for optimizing STT, TTS, and speech-to-speech models across Together's infrastructure, with a clear-eyed view of where the field is heading and how to position the platform ahead of it. 
Drive best-in-class inference performance — architect and implement systems targeting leading TTFB, throughput, and GPU utilization for voice workloads; set the performance bar others in the industry measure against, not just catch up to. Lead productionization of voice models at scale — design the serving architecture for serverless and dedicated endpoints, including batching strategies, streaming inference pipelines, and memory management tailored to real-time audio; own reliability and latency SLAs. Build the voice evaluation platform — design a rigorous, extensible evaluation framework covering WER across accents, languages, and noise conditions for STT; naturalness, latency, and pronunciation fidelity for TTS; establish the internal benchmark methodology that informs model selection and roadmap decisions. Shape the architecture for next-generation model support — anticipate and enable emerging model paradigms — audio-native LLMs, codec-based architectures (SNAC, Encodec), and end-to-end speech-to-speech systems — before they're mainstream, not after. Serve as the technical DRI for model partner integrations — lead deep collaboration with partners such as Cartesia, Deepgram, and Rime; own the full lifecycle from integration to optimization to ongoing performance accountability. Diagnose and resolve the hardest performance problems in the stack — conduct systematic profiling and root-cause analysis from GPU kernel behavior to framework-level bottlenecks; drive shipped improvements with documented, measurable impact. Influence platform architecture across the organization — partner with platform engineering leadership to ensure the serving layer is built for the latency and reliability demands of real-time voice APIs; your technical decisions should raise the ceiling for the whole team. 
Define and scale voice fine-tuning capabilities — lead the technical direction for enabling customers to fine-tune STT and TTS models on Together's infrastructure, establishing the primitives for differentiated voice experiences. Lay technical foundations for a category-defining product surface — architect systems with enough foresight that they support multiple new voice products with minimal rework; think in terms of platforms, not point solutions. Requirements 8+ years of ML engineering experience, with a demonstrated focus on model serving, inference optimization, or ML infrastructure at production scale — including systems you've owned from design through live traffic. Deep, practical expertise in LLM serving engines (vLLM, SGLang, TensorRT-LLM, or equivalent) — you've modified engine internals, debugged edge cases under load, and contributed improvements back; you don't stop at the API surface. Expert-level Python and PyTorch proficiency, with a strong command of GPU optimization — CUDA kernels, memory hierarchies, profiling toolchains — and a track record of turning that knowledge into shipped latency or throughput wins. Proven system design judgment — you've made architectural decisions that held up at scale and influenced how a team or platform evolved; you can articulate the tradeoffs you made and why. Strong technical leadership — you operate with high autonomy, define the right problems before solving them, and raise the bar for engineering quality around you without requiring process overhead. Sharp product intuition for developer tooling — you understand what voice application developers actually need to ship great products, and you let that shape your technical priorities, not just the other way around. Proven ability to move fast in ambiguous environments — you've thrived on early-stage or platform teams where scope is wide, ownership is deep, and the roadmap you build is the one you execute. 
Strong foundation in speech and audio ML (ASR/TTS architectures, audio signal processing) — directly relevant experience is strongly preferred; exceptional ML engineering fundamentals with genuine curiosity about the domain is also considered. Familiarity with audio codec and tokenization schemes (SNAC, Encodec, DAC) is a meaningful plus at this level. Experience training or fine-tuning speech models at scale is a significant advantage. Bachelor's or Master's in Computer Science, Electrical Engineering, or related field — or equivalent depth demonstrated through your work. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $220,000 - $280,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. 
Please see our privacy policy at https://www.together.ai/privacy
17h ago
Staff Engineer, Distributed Storage and HPC & AI Infrastructure
Together AI· San Francisco
About the Role In this role, you will design and deliver multi-petabyte storage systems purpose-built for the world’s largest AI training and inference workloads. You’ll architect high-performance parallel filesystems and object stores, evaluate and integrate cutting-edge technologies such as WekaFS, Ceph, and Lustre, and drive aggressive cost optimization, routinely achieving 30-50% savings through intelligent tiering, lifecycle policies, capacity forecasting, and right-sizing. You will also build Kubernetes-native storage operators and self-service platforms that provide automated provisioning, strict multi-tenancy, performance isolation, and quota enforcement at cluster scale. Day-to-day, you’ll optimize end-to-end data paths for 10-50 GB/s per node, design multi-tier caching architectures, implement intelligent prefetching and model-weight distribution, and tune parallel filesystems for AI workloads. Responsibilities Design multi-petabyte AI/ML storage systems; integrate WekaFS, Ceph, etc.; lead capacity planning and cost optimization (30-50% savings via tiering, lifecycle policies, right-sizing). Design/optimize RDMA, InfiniBand, 400GbE networks; tune for max throughput/min latency; implement NVMe-oF/iSCSI; troubleshoot bottlenecks; optimize TCP/IP for storage. Build Kubernetes storage operators/controllers; enable automated provisioning, self-service abstractions, multi-tenant isolation, quotas; create reusable Helm/Terraform patterns. Deliver 10-50 GB/s per GPU node; optimize caching (weights/datasets/checkpoints), parallel filesystems, and data paths; troubleshoot with profiling tools; scale to thousands of nodes. Build multi-tier caches (local NVMe, distributed, object); optimize data locality and model-weight distribution; implement smart prefetching/eviction. Implement monitoring, alerting, SLOs; design DR/backups with runbooks; run chaos engineering; ensure 99.9%+ uptime via proactive/automated remediation. 
Partner with ML/SRE teams; mentor on storage best practices; contribute to open-source; write docs, postmortems, and public learnings. Requirements 8+ years in storage engineering with 3+ years managing distributed storage at multi-petabyte scale Proven track record deploying and operating high-performance storage for GPU/HPC clusters Deep Kubernetes and cloud-native storage experience in production environments Strong coding skills in Go and Python with demonstrated ability to build production-grade tools BS/MS in Computer Science, Engineering, or equivalent practical experience History of technical leadership: designing systems that significantly improved performance (>3x), reliability (99.9%+ uptime), or cost efficiency Distributed Storage Systems: Deep expertise in WekaFS, Lustre, GPFS, BeeGFS, or similar parallel filesystems at multi-petabyte scale Object Storage: Production experience with S3, MinIO, Ceph, or R2 including performance optimization and cost management Kubernetes Storage: CSI drivers, StatefulSets, PersistentVolumes, storage operators, and custom controllers Storage optimization for GPU workloads, RDMA/InfiniBand networking, parallel filesystem optimization (100+ GB/s aggregate cluster throughput) Programming: Go and Python for automation, operators, and tooling Infrastructure as Code: Terraform, Ansible, Helm, GitOps (ArgoCD) Linux Storage Stack: Advanced knowledge of filesystems (ext4, xfs), LVM, NVMe optimization, RAID configurations Observability: Prometheus, Grafana, Thanos architecture and operations Nice to Have Skills GPU Direct Storage (GDS), NVMe-oF, storage networking (100GbE/400GbE) ML/AI storage patterns (model weights, checkpointing, dataset caching) Kubernetes operator development (controller-runtime, kubebuilder) Storage snapshots, cloning, and thin provisioning Backup and disaster recovery (Velero, Restic, cross-region replication) Storage encryption (at-rest and in-transit), security and compliance Storage benchmarking and 
profiling tools (fio, iperf3, iostat, blktrace) About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $250,000 - $300,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Senior Machine Learning Engineer, Voice AI
Together AI· San Francisco
About the Role Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability. We're looking for a Senior ML Engineer to drive the model serving layer for voice workloads. You'll work hands-on with inference engines like TRT-LLM and SGLang to optimize how we serve models like Whisper, Parakeet, Orpheus, and Kokoro — pushing latency and throughput to the frontier. You'll profile GPU utilization, design batching strategies for streaming audio, and ensure new model architectures can go from research to production quickly. This is a foundational hire on a small, high-impact team. Voice inference has unique challenges — streaming audio, tokenization, real-time latency budgets — that require dedicated ML engineering focus. You'll shape how Together serves voice models as the industry moves from pipeline architectures (ASR → LLM → TTS) toward end-to-end speech-to-speech. Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech. Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference. Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure. Build quality evaluation frameworks that guide model selection for customers and inform the roadmap. Join a small, early-stage team with outsized impact on a fast-growing product area. Responsibilities Optimize inference performance for voice models (STT, TTS, speech-to-speech) — targeting best-in-class TTFB, throughput, and GPU utilization across our curated model set. Productionize voice models on serverless and dedicated endpoints, including batching strategies, streaming inference, and memory management tailored to audio workloads. 
Build and maintain a voice model evaluation framework — measuring WER across accents, languages, and noise conditions for STT; naturalness, latency, and pronunciation accuracy for TTS. Enable new model architectures in our serving stack as the field evolves, including audio-native LLMs, codec-based models (SNAC), and speech-to-speech systems. Collaborate with model partners to integrate and optimize their models (Cartesia, Deepgram, Rime, and others) running on Together's infrastructure. Profile and debug performance across the full inference stack — from GPU kernels to framework-level bottlenecks — and ship measurable improvements. Work with the platform engineering side of the team to ensure the serving layer meets the latency and reliability requirements of real-time voice APIs. Contribute to voice model fine-tuning capabilities (STT and TTS) as we enable customers to build differentiated voice experiences on Together. Lay the groundwork for multiple new products down the line. Requirements 5+ years of experience in ML engineering, with a focus on model serving, inference optimization, or ML infrastructure. Hands-on experience with LLM serving engines (vLLM, SGLang, TensorRT-LLM, or similar) — comfortable reading and modifying engine internals, not just using APIs. Strong proficiency in Python and PyTorch; experience with GPU profiling and optimization (CUDA, memory management, kernel-level debugging). Track record of shipping ML systems to production with measurable performance improvements. Strong product sense — you think about what developers building voice apps actually need, not just what's technically interesting. Comfort working on a small, early-stage team where you'll wear multiple hats and move fast. Experience with speech and audio ML (ASR, TTS architectures, audio signal processing) is a strong plus but not required — you can learn this quickly if you have strong ML engineering fundamentals. 
Familiarity with audio codecs and tokenization schemes (SNAC, Encodec, DAC) is a plus. Experience training or fine-tuning speech models is a plus. Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field, or equivalent practical experience About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $200,000 - $260,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Research Engineer, Frontier Speculative Decoding
Together AI· San Francisco, New York City
About the Role Together AI is building the Inference Platform that powers the world's most advanced generative AI models. Your role will be a critical bridge between cutting-edge research and real-world applications, focusing on translating our internal model training research into production-ready deployments for our customers. This involves a deep commitment to data-centric development, meticulous hyperparameter tuning, and rigorous checkpoint evaluation before models ever hit production. This role will involve understanding customer-specific needs and fine-tuning models on our internal data recipe and their proprietary data. The goal is to transform general-purpose models into highly performant, specialized tools that solve real business problems. You will not be training foundation models from scratch but rather focusing on creating highly efficient, specialized models by working with dedicated GPU clusters. Responsibilities Design and iterate on novel speculator algorithms, combining architectural innovations with carefully curated data to push the frontier of accuracy–efficiency tradeoffs. Be the critical link between raw data and a production-ready model, seeing your work directly impact our customers' success. Work in a fast-paced, high-impact role at the cutting edge of generative AI. Collaborate with a team of experts dedicated to solving real-world, high-performance challenges. You'll collaborate directly with customers to understand their needs, and work closely with our core inference and Applied ML research teams to integrate your work into the production platform. A culture of deep technical ownership where you are empowered to take on and solve challenging problems Requirements A genuine love for data curation and processing, with a meticulous attention to detail. You believe that great models start with great data. 
Demonstrated ability to perform effective hyperparameter searches and understand the trade-offs involved in tuning models for specific tasks. Experience working with and building on top of existing training codebases. You are comfortable navigating complex code and contributing to its improvement. Strong attention-to-detail in evaluating model checkpoints to ensure they meet strict quality, performance, and reliability standards. Experience with Python and PyTorch. Familiarity with SLURM and/or Kubernetes clusters and experience submitting and managing jobs in a high-performance computing environment. Familiarity with modern LLMs and generative models. Basic understanding of distributed training frameworks (e.g., FSDP, DeepSpeed). Bachelor’s, Master’s degree, or Ph.D. in Computer Science, Computer Engineering, or a related field, or equivalent practical experience. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, ATLAS, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $190,000 - $270,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. 
Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Research Engineer, Core ML
Together AI· San Francisco
About the Role This is a research engineering role with direct production impact. You won’t be publishing ideas in isolation; you will translate new RL algorithms, scheduling methods, and inference optimizations into production-grade systems that power Together’s API. Success in this role means shipping measurable improvements in latency, throughput, cost, and model quality at scale. We are looking for researchers who enjoy owning systems end-to-end and turning frontier ideas into robust infrastructure. The Core ML (Turbo) team at Together AI sits at the intersection of efficient inference (algorithms, architectures, engines) and post‑training / RL systems. We build and operate the systems behind Together’s API, including high‑performance inference and RL/post‑training engines that can run at production scale. Our mandate is to push the frontier of efficient inference and RL‑driven training: making models dramatically faster and cheaper to run, while improving their capabilities through RL‑based post‑training (e.g., GRPO‑style objectives). This work lives at the interface of algorithms and systems: asynchronous RL, rollout collection, scheduling, and batching all interact with engine design, creating many knobs to tune across the RL algorithm, training loop, and inference stack. Much of the job is modifying production inference systems (for example, SGLang‑ or vLLM‑style serving stacks and speculative decoding systems such as ATLAS), grounded in a strong understanding of post‑training and inference theory, rather than purely theoretical algorithm design. You’ll work across the stack, from RL algorithms and training engines to kernels and serving systems, to build and improve frontier models via RL pipelines. People on this team are often spiky: some are more RL‑first, some are more systems‑first. Depth in one of these areas plus appetite to collaborate across (and grow toward more full‑stack ownership over time) is ideal. 
Responsibilities Advance inference efficiency end‑to‑end Design and prototype algorithms, architectures, and scheduling strategies for low‑latency, high‑throughput inference. Implement and maintain changes in high‑performance inference engines (e.g., SGLang‑ or vLLM‑style systems and Together’s inference stack), including kernel backends, speculative decoding (e.g., ATLAS), quantization, etc. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Unify inference with RL / post‑training Design and operate RL and post‑training pipelines (e.g., RLHF, RLAIF, GRPO, DPO‑style methods, reward modeling) where 90+% of the cost is inference, jointly optimizing algorithms and systems. Make RL and post‑training workloads more efficient with inference‑aware training loops—for example, async RL rollouts, speculative decoding, and other techniques that make large‑scale rollout collection and evaluation cheaper. Use these pipelines to train, evaluate, and iterate on frontier models on top of our inference stack. Co‑design algorithms and infrastructure so that objectives, rollout collection, and evaluation are tightly coupled to efficient inference, and quickly identify bottlenecks across the training engine, inference engine, data pipeline, and user‑facing layers. Run ablations and scale‑up experiments to understand trade‑offs between model quality, latency, throughput, and cost, and feed these insights back into model, RL, and system design. Own critical systems at production scale Profile, debug, and optimize inference and post-training services under real production workloads, taking research ideas all the way to stable, measurable improvements in deployed systems. Drive roadmap items that require real engine modification—changing kernels, memory layouts, scheduling logic, and APIs as needed. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. 
Provide technical leadership (Staff level) Set technical direction for cross‑team efforts at the intersection of inference, RL, and post‑training. Mentor other engineers and researchers on full‑stack ML systems work and performance engineering. Requirements We don’t expect anyone to check every box below. People on this team typically have deep expertise in one or more areas and enough breadth (or interest) to work effectively across the stack. The closer you are to full‑stack (inference + post‑training/RL + systems), the stronger the fit—but being spiky in one area and eager to grow is absolutely okay. You might be a good fit if you: Have a bias toward implementation and shipping —you are excited to modify real engines and services, not just prototype in research code. Have strong expertise in at least one of the following, and are excited to collaborate across (and grow into) the others: Systems‑first profile: Large‑scale inference systems (e.g., SGLang, vLLM, FasterTransformer, TensorRT, custom engines, or similar), GPU performance, distributed serving. RL‑first profile: RL / post‑training for LLMs or large models (e.g., GRPO, RLHF/RLAIF, DPO‑like methods, reward modeling), and using these to train or fine‑tune real models. Model architecture design for Transformers or other large neural nets. Distributed systems / high‑performance computing for ML. Are comfortable working from algorithms to engines: Strong coding ability in Python Experience profiling and optimizing performance across GPU, networking, and memory layers. Able to take a new sampling method, scheduler, or RL update and turn it into a production‑grade implementation in the engine and/or training stack. Have a solid research foundation in your area(s) of depth: Track record of impactful work in ML systems, RL, or large‑scale model training (papers, open‑source projects, or production systems). 
Can read new RL / post‑training papers, understand their implications on the stack, and design minimal, correct changes in the right layer (training engine vs. inference engine vs. data / API). Operate well as a full‑stack problem solver: You naturally ask: “Where in the stack is this really bottlenecked?” You enjoy collaborating with infra, research, and product teams, and you care about both scientific quality and user‑visible wins. Minimum qualifications 3+ years of experience working on ML systems, large‑scale model training, inference, or adjacent areas (or equivalent experience via research / open source). Advanced degree in Computer Science, EE, or a related field, or equivalent practical experience. Demonstrated experience owning complex technical projects end‑to‑end. If you’re excited about the role and strong in some of these areas, we encourage you to apply even if you don’t meet every single requirement. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $200,000 - $280,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. 
Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Machine Learning, Platform Engineer
Together AI· San Francisco
About the Role Our team focuses on enabling custom models and dedicated inference on Together. We are responsible for building a container platform, optimizing autoscaling, minimizing cold starts, achieving the best end-to-end model performance, and providing a best-in-class developer experience with great tooling. We often focus on video or audio generation across the stack: CUDA kernels, PyTorch optimization, inference engines, container orchestration, queueing theory, etc. An ideal candidate will be great at profiling/optimization but know the word Kubernetes, or be intimately familiar with multi-cluster scheduling and have some sense of ML bottlenecks. Responsibilities New hires may work on multi-cluster orchestration, portfolio optimization, predictive autoscaling, control planes, model bring-up, model optimization, APIs for managing deployments, inference worker SDKs, and CLI tools. Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure Partner with product teams to understand functional requirements and deliver solutions that meet business needs Write clear, well-tested, and maintainable software and IaC for both new and existing systems Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance Requirements 5+ years of demonstrated experience in building large scale, fault tolerant, distributed systems. 
Experience running serverless inference platforms, doing model bring-up on short notice, being on call, or running a cloud provider is a very big plus Good taste and ability to thoughtfully discuss how what you’ve built has failed over time Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources Excellent understanding of low level operating systems concepts including concurrency, networking and storage, performance and scale Expert-level programmer in one or more of Python, Golang, Rust, C++, or Haskell Proficiency in writing and maintaining Infrastructure as Code (IaC) using tools like Terraform Experience with Kubernetes internals or other container orchestration systems Sound judgement for when to use and when to not use LLMs for code Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience Writing-heavy roles or companies are a plus About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $250,000 + equity + benefits. Our salary ranges are determined by location, level and role. 
Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Machine Learning Engineer - Inference
Together AI· San Francisco
About the Role Together AI is seeking a Machine Learning Engineer to join our Inference Engine team, focusing on optimizing and enhancing the performance of our AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and effectively at scale. If you are passionate about AI inference, PyTorch, and developing high-performance systems, we want to hear from you. This position offers the chance to collaborate closely with AI researchers and engineers to create cutting-edge AI solutions. Join us in shaping the future at Together AI! Responsibilities Design and build the production systems that power the Together AI inference engine, enabling reliability and performance at scale. Develop and optimize runtime inference services for large-scale AI applications. Collaborate with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world. Conduct design and code reviews to ensure high standards of quality. Create services, tools, and developer documentation to support the inference engine. Implement robust and fault-tolerant systems for data ingestion and processing. Requirements 3+ years of experience writing high-performance, well-tested, production-quality code. Proficiency with Python and PyTorch. Demonstrated experience in building high performance libraries and tooling. Excellent understanding of low-level operating systems concepts including multi-threading, memory management, networking, storage, performance, and scale. Preferred: Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, Optimum. Preferred: Knowledge of AI inference techniques such as speculative decoding. Preferred: Knowledge of CUDA/Triton programming. Nice to have: Knowledge of Rust, Cython and compilers. About Together AI Together AI is a research-driven artificial intelligence company. 
We believe open and transparent AI systems will drive innovation and create the best outcomes for society. Together, we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI. Our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey to build the next-generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunities to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
Machine Learning Engineer
Together AI· San Francisco
About the Role Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models from simple models up to the largest LLMs. Requirements 5+ years experience writing high-performance, well-tested, production quality code Bachelor’s degree in computer science or equivalent industry experience Familiar with LLM inference ecosystem, including frameworks and engines (e.g. vLLM, SGLang, TRT, ...) Demonstrated experience in building large scale, fault tolerant, distributed systems like storage, search, and computation Expert level programmer in one or more of Python, Go, Rust, or C/C++ Experience implementing runtime inference services at scale or similar Responsibilities Design and build the production systems that power the Together Cloud inference and fine-tuning APIs, enabling reliability and performance at scale Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world Analyze and improve efficiency, scalability, and stability of various system resources Conduct design and code reviews Create services, tools & developer documentation Create testing frameworks for robustness and fault-tolerance Participate in an on-call rotation to respond to critical incidents as needed About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. 
We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $220,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
LLM Inference Frameworks and Optimization Engineer
Together AI· San Francisco, Singapore, Amsterdam
About the Role At Together.ai, we are building state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize inference frameworks, algorithms, and infrastructure, pushing the boundaries of performance, scalability, and cost-efficiency. We are seeking an Inference Frameworks and Optimization Engineer to design, develop, and optimize distributed inference engines that support multimodal and language models at scale. This role will focus on low-latency, high-throughput inference, GPU/accelerator optimizations, and software-hardware co-design, ensuring efficient large-scale deployment of LLMs and vision models. This role offers a unique opportunity to shape the future of LLM inference infrastructure, ensuring scalable, high-performance AI deployment across a diverse range of applications. If you're passionate about pushing the boundaries of AI inference, we’d love to hear from you! Responsibilities Inference Framework Development and Optimization Design and develop a fault-tolerant, high-concurrency distributed inference engine for text, image, and multimodal generation models. Implement and optimize distributed inference strategies, including Mixture of Experts (MoE) parallelism, tensor parallelism, and pipeline parallelism for high-performance serving. Apply CUDA graph optimizations, TensorRT/TRT-LLM graph optimizations, PyTorch-based compilation (torch.compile), and speculative decoding to enhance efficiency and scalability. Software-Hardware Co-Design and AI Infrastructure Collaborate with hardware teams on performance bottleneck analysis, and co-optimize inference performance for GPUs, TPUs, or custom accelerators. Work closely with AI researchers and infrastructure engineers to develop efficient model execution plans and optimize E2E model serving pipelines. 
Requirements Must-Have: Experience: 3+ years of experience in deep learning inference frameworks, distributed systems, or high-performance computing. Technical Skills: Familiar with at least one LLM inference framework (e.g., TensorRT-LLM, vLLM, SGLang, TGI (Text Generation Inference)). Background knowledge and experience in at least one of the following: GPU programming (CUDA/Triton/TensorRT), compilers, model quantization, and GPU cluster scheduling. Deep understanding of KV cache systems like Mooncake, PagedAttention, or custom in-house variants. Programming: Proficient in Python and C++/CUDA for high-performance deep learning inference. Optimization Techniques: Deep understanding of Transformer architectures and LLM/VLM/Diffusion model optimization. Knowledge of inference optimization, such as workload scheduling, CUDA graphs, compilation, and efficient kernels. Soft Skills: Strong analytical problem-solving skills with a performance-driven mindset. Excellent collaboration and communication skills across teams. Nice-to-Have: Experience in developing software systems for large-scale data center networks with RDMA/RoCE Familiar with distributed filesystems (e.g., 3FS, HDFS, Ceph) Familiar with open source distributed scheduling/orchestration frameworks, such as Kubernetes (K8S) Contributions to open-source deep learning inference projects. About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. 
We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
AI Infrastructure Engineer (SRE) Amsterdam
Together AI· Amsterdam
As an AI Infrastructure Engineer (SRE) at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of a pragmatic operator and a software engineer that applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase. You specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability and scalability, with varied interests in algorithms and distributed systems. Requirements 7+ years of professional SRE or related experience Bachelor's degree in Computer Science or a related field or equivalent work experience Expert knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes Proficiency in programming/scripting languages Direct experience in monitoring and observability practices Advanced knowledge of cloud services Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts Responsibilities Be on an on-call (PagerDuty) rotation to respond to incidents that impact availability Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users Build monitoring systems to ensure the highest quality service for our customers Design and implement operational processes (such as deployments and upgrades) Debug production issues across all services and levels of the stack Identify improvements for the product architecture from the reliability, performance and availability perspectives Plan the growth of Together AI’s infrastructure About Together AI Together AI is a research-driven artificial intelligence company. 
We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy
17h ago
AI Infrastructure Engineer
Together AI· San Francisco
As an AI Infrastructure Engineer at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of a pragmatic operator and a software engineer that applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase. You specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability and scalability, with varied interests in algorithms and distributed systems. Responsibilities Participate in an on-call rotation (PagerDuty) to respond to production incidents Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users Build monitoring systems to ensure the highest quality service for our customers Design and implement operational processes (such as deployments and upgrades) Debug production issues across all services and levels of the stack Identify improvements for the product architecture from the reliability, performance and availability perspectives Plan the growth of Together AI's infrastructure Requirements 5+ years of professional AI Infra or related experience Bachelor's degree in Computer Science or a related field or equivalent work experience Knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes Proficiency in programming/scripting languages Direct experience in monitoring and observability practices Knowledge of cloud services Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts About Together AI Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. 
We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure. Compensation We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $190,000 - $270,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. Equal Opportunity Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
17h ago
Strategic Finance Manager, Gen AI
Scale AI· San Francisco, CA
We are building out the Finance team to help make data-driven and financially sound decisions for Scale. The Finance team drives strategic, financial, and operational decisions by partnering with the leadership team to make critical decisions across Scale. We’re looking for a high-performing, well-rounded finance athlete to join our team and support the rapidly growing Generative AI (GenAI) business. You’ll collaborate closely with Product, Operations, Growth, and Go-to-Market leaders to bring financial rigor to decision-making, develop actionable insights that drive strategy, and build scalable systems as the business expands. This role is ideal for someone with 4–6 years of experience in a fast-paced, high-growth environment: someone who thrives in ambiguity, can juggle multiple workstreams, and brings a mix of analytical rigor, business acumen, and strong execution. You will: Own and evolve part of the GenAI financial forecasting model, driving accuracy and insight across planning cycles Support reporting and performance management, including weekly and monthly reviews, consolidations, and ad hoc analyses Partner with GenAI leadership and cross-functional teams to evaluate and execute key strategic and operational initiatives that scale the business multifold Conduct financial analyses and build business cases for new products, partnerships, and investments Collaborate with Accounting and Corporate Finance to improve close, reporting, and planning cadences Continuously improve financial processes and systems to enhance scalability, forecast precision, and data visibility Ideally, you'd have: 4–6 years of experience in Strategic Finance, FP&A, or Business Operations, ideally within a high-growth technology company 2 years of investment banking experience at a top-tier firm Strong analytical and financial modeling skills; ability to translate complex data into actionable insights Excellent communication skills, with the ability to distill complexity into clear 
narratives for non-finance stakeholders Advanced proficiency in Excel, Google Sheets, and PowerPoint; strong command of financial modeling best practices Experience with SQL or Business Intelligence tools (e.g., Looker, Tableau) Familiarity with Anaplan, Adaptive Insights, or other planning systems Nice to haves: Bachelor’s degree in Finance, Accounting, Economics, Engineering, or a related field Prior experience supporting Product, Engineering, Growth, or Operations teams within a technology company Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. The base salary range for this full-time position in the location of San Francisco is: $176,400 — $220,500 USD PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants. About Us: At Scale, our mission is to develop reliable AI systems for the world's most important decisions. 
Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Ernst & Young, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications. We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status. We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at [email protected]. Please see the United States Department of Labor's Know Your Rights poster for additional information. We comply with the United States Department of Labor's Pay Transparency provision. PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants’ needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.
19h ago
Applied AI Claude Evangelist, Startups
Anthropic· San Francisco, CA
About Anthropic Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. About the role: You'll be the face of Anthropic in the startup ecosystem — driving awareness and activation among founders and their technical teams. Your focus is lighting up the startup developer community and getting builders excited about, onboarded onto, and actively building on top of the Claude Developer Platform. This role sits at the intersection of startup ecosystem development and hands-on technical activation. Your primary mandate is driving adoption across the startup funnel — turning ecosystem touchpoints into active, building developers on Anthropic's platform. You'll work closely with the Startups GTM team to convert ecosystem relationships — with VCs, accelerators, and founder communities — into active, building developers. 
Responsibilities: Drive Net New Logo Acquisition Through Developer Enablement Lead hands-on developer onboarding experiences at ecosystem events that convert startup founders and their engineers from first API interaction to committed platform adoption Build scalable enablement programs — hands-on workshops, build-a-thons, and technical office hours — that activate developers at VC and accelerator partner events Develop a playbook for turning ecosystem touchpoints (builder summits, founder days, VC partnerships) into measurable developer sign-ups and active usage Partner closely with the Startups GTM team to create seamless handoffs from relationship → activation → retention Create Compelling Technical Content Create high-quality technical content — tutorials, demo apps, and blog posts — that showcase how to build real products on Anthropic's API, with a focus on common startup use cases Develop code demos and working prototypes that showcase Claude's capabilities in ways that resonate with technical founders, particularly at ecosystem events and webinars Partner with the Product team to stay current on platform capabilities and ensure content reflects the latest developer tooling and best practices Lead Developer Programming at Startup Events Own developer-facing programming at Anthropic's builder summits and global startup activations Design and run hands-on technical sessions that move developers from curiosity to active building within a single event Build relationships with key technical communities — developer platforms, open-source contributors, and startup accelerator cohorts — to grow Anthropic's developer mindshare Represent Anthropic as a trusted technical resource in the startup ecosystem Partner Across GTM and Ecosystem Surface market signal and developer sentiment from the startup ecosystem back to internal teams Partner with GTM, Sales, and Marketing to create developer-focused activation campaigns and messaging for the startup ecosystem Work with 
the Startups Partnerships team to ensure VC and accelerator relationships translate into real developer engagement and usage metrics Collaborate closely with the Product team to stay aligned on platform strategy, content, and developer programs Define Success and Scale Define and track success metrics — tied to net new logos, developer activation, and ecosystem engagement — and build reporting that connects evangelist activity to business outcomes Create scalable processes and grow the team as the startup ecosystem expands You may be a good fit if you have: 7+ years of experience across a combination of founding/building startups and developer-facing roles (developer relations, evangelism, or ecosystem development) Experience as a technical founder, early startup employee, or operator who has lived the 0-to-1 journey and knows what it takes to go from idea to product-market fit The ability to write production-quality code, build compelling demos, and credibly engage with technical co-founders and their engineering teams Strong public speaking skills with comfort on stage, the ability to command a room at builder summits and ecosystem events, and deliver technical content that energizes an audience Experience building or scaling developer programs, communities, or enablement motions that drove measurable adoption Builder credibility that earns trust with founders, VCs, and accelerator communities. You've shipped products and can speak from experience Willingness to travel regularly and flexibility to support events that fall outside standard business hours, including evenings and weekends Deep enthusiasm for AI with hands-on experience building with LLMs. You stay close to the technology and genuinely care about building it responsibly The annual compensation range for this role is listed below. 
For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role. Annual Salary: $275,000 — $380,000 USD Logistics Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices. Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this. We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team. Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. 
In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links—visit anthropic.com/careers directly for confirmed position openings. How we're different We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills. The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences. Come work with us! Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues. Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.
21h ago
People Research Scientist, Recruiting
Anthropic· New York City, NY | Seattle, WA; San Francisco, CA | New York City, NY
About Anthropic Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. About the role We are seeking a Recruiting Research Scientist to join our People Data Solutions team. You’ll be the research expert supporting our Recruiting organization, using rigorous scientific methods to advance our understanding of recruiting funnels, interview effectiveness, candidate experience, and recruiting capacity. This role sits at the intersection of organizational science, behavioral research, and people strategy – developing novel frameworks and conducting systematic research that drives evidence-based people decisions across our growing organization. This role offers the opportunity to make a significant impact on both our recruiting practices and the broader field of people science at a leading AI safety company. Responsibilities Research design & scientific inquiry Design and execute systematic research studies to answer fundamental questions about recruiting funnel health, assessment quality, candidate experience, and quality of hire Generate and test hypotheses about sourcing strategies, interview design, and selection decisions using rigorous experimental and quasi-experimental methods Conduct mixed-method research to understand the drivers of and blockers to recruiting operations 
Navigate research ethics considerations when studying candidate data, ensuring responsible research practices Selection & assessment research Design and execute validation studies to assess the quality of interviews and other selection tools Utilize psychometric techniques to analyze and improve interviewer calibration and rating consistency Lead investigative research into innovative approaches for candidate assessment Metrics design and governance Design the metrics framework for recruiting org health — defining the canonical KPIs, dimensions, and definitions that leadership uses to understand funnel performance, capacity, and hiring quality Establish the governance and definitional rigor that keeps metrics consistent across tools and reporting surfaces Analytical solution building Architect analytical solutions that convert research insights into actionable products, empowering stakeholders to execute data-driven scenario and strategic planning Quantify the adoption and downstream impact of deployed tools, driving iterative improvements Visualization & communication Build compelling visualizations and dashboards that make complex research findings accessible to diverse audiences Present research findings to senior leadership with clear, actionable recommendations Minimum Qualifications: Hold an advanced degree (Master’s or PhD) in I/O Psychology, Organizational Behavior, Statistics, Data Science, Economics, Behavioral Science, or a related research field Have experience with selection research, assessment validation, psychometrics, or recruiting funnel analytics Are comfortable working in the People Analytics tech stack and collaborating with data engineers Are proficient in SQL and Python/R, with experience in statistical analysis and machine learning Have experience with data visualization and can tell compelling stories with research findings Possess excellent communication skills and can influence stakeholders at all levels Thrive in ambiguity and can 
balance rigor with pragmatism Have a track record of challenging assumptions with data and changing long-held practices Can navigate sensitive topics diplomatically while maintaining analytical rigor Demonstrate intellectual humility and comfort with iterative discovery Use data to improve how organizations find, assess, and hire talent Preferred Qualifications: 5+ years of experience in research, people analytics, or related quantitative fields with demonstrated research methodology expertise Background in recruiting analytics specifically (not just general analytics) Experience running interview or assessment validation studies Experience building self-service analytics tools or dashboards Previous experience in high-growth technology companies or AI/ML organizations Familiarity with network analysis, machine learning, or advanced statistical methods Experience with BigQuery and modern data stack tools Experience with Greenhouse, Gem, ModernLoop, or similar recruiting tools The annual compensation range for this role is listed below. For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role. Annual Salary: $275,000 - $370,000 USD Logistics Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices. Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. 
But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this. We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team. Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links—visit anthropic.com/careers directly for confirmed position openings. How we're different We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. 
We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills. The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences. Come work with us! Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues. Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.
21h ago
People Research Scientist, People
Anthropic· New York City, NY | Seattle, WA; San Francisco, CA | New York City, NY
About Anthropic Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems. About the Role: We are seeking a People Research Scientist to join our People Data Solutions team. You’ll be the research expert supporting our broader People organization, using rigorous scientific methods to advance our understanding of the employee experience, manager effectiveness, organizational health, and workforce dynamics. This role sits at the intersection of organizational science, behavioral research, and people strategy – developing novel frameworks and conducting systematic research that drives evidence-based people decisions across our growing organization. This role offers the opportunity to make a significant impact on both our people practices and the broader field of people science at a leading AI safety company. 
Responsibilities: Research Design & Scientific Inquiry Design and execute systematic research studies to answer fundamental questions about employee experience, manager effectiveness, and organizational health Generate and test hypotheses about people programs, employee behavior, and workforce outcomes using rigorous experimental and quasi-experimental methods Conduct longitudinal studies tracking employee cohorts to understand long-term workforce trends and the impact of people initiatives Perform meta-analyses of people interventions across the industry to identify best practices and knowledge gaps Navigate research ethics considerations when studying employee data, ensuring responsible research practices Employee listening & survey research Design, analyze, and iterate on employee listening programs including engagement surveys, pulse surveys, and lifecycle surveys Apply psychometric methods to validate survey instruments and ensure measurement reliability Translate survey findings into strategic recommendations that drive meaningful organizational change Manager research & organizational effectiveness Conduct research on manager behaviors, competencies, and their impact on team outcomes Build measurement frameworks to evaluate and improve manager effectiveness programs Study organizational dynamics including team composition, collaboration patterns, and their relationship to performance outcomes Visualization & communication Build compelling visualizations and dashboards that make complex research findings accessible to diverse audiences Present research findings to senior leadership with clear, actionable recommendations Develop self-service analytics capabilities that empower People team partners Minimum Qualifications: Hold an advanced degree (Master’s or PhD) in I/O Psychology, Organizational Behavior, Statistics, Data Science, Economics, Behavioral Science, or a related research field Have experience with experimental design, hypothesis testing, 
longitudinal research methods, and causal inference Are comfortable working in the People Analytics tech stack and collaborating with data engineers Are proficient in SQL and Python/R, with experience in statistical analysis and machine learning Have experience with survey design, psychometric methods, and employee listening programs Have experience with data visualization and can tell compelling stories with research findings Possess excellent communication skills and can influence stakeholders at all levels Thrive in ambiguity and can balance rigor with pragmatism Have a track record of challenging assumptions with data and changing long-held practices Can navigate sensitive topics diplomatically while maintaining analytical rigor Demonstrate intellectual humility and comfort with iterative discovery Use data to improve how organizations develop and support their people Preferred Qualifications: 5+ years of experience in research, people analytics, or related quantitative fields with demonstrated research methodology expertise Background in people analytics specifically (not just general analytics) Experience designing and analyzing employee engagement or pulse surveys Deep knowledge of manager effectiveness research and organizational science Experience building self-service analytics tools or dashboards Understanding of employee lifecycle metrics and people KPIs Previous experience in high-growth technology companies or AI/ML organizations Familiarity with network analysis, NLP, or advanced statistical methods Experience with BigQuery and modern data stack tools Experience with Qualtrics Experience with Workday The annual compensation range for this role is listed below. For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role. 
Annual Salary: $275,000 — $370,000 USD Logistics Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices. Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this. We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team. Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. 
If you're ever unsure about a communication, don't click any links—visit anthropic.com/careers directly for confirmed position openings. How we're different We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills. The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences. Come work with us! Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues. Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.
21h ago