Opportunity Description
About The Role
Join the inference model team dedicated to bringing up state-of-the-art models and to numerically validating and accelerating new model ideas on wafer-scale hardware. You will prototype architectural tweaks, build performance-eval pipelines, and turn hard numbers into changes that land in production.
Key Responsibilities
Prototype and benchmark cutting-edge ideas: new attention variants, MoE, speculative decoding, and other innovations as they emerge.
Develop agent-driven automation that designs experiments, schedules runs, triages regressions, and drafts pull requests.
Work closely with compiler, runtime, and silicon teams, a unique opportunity to experience the full stack of software/hardware innovation.
Keep pace with the latest open- and closed-source models; run them first on wafer scale to expose new optimization opportunities.
Skills And Qualifications
3+ years building high-performance ML or systems software.
Solid grounding in Transformer math...