Opportunity Description
Elevate your career at Cohere as a Software Engineer specializing in HPC and GPU infrastructure. Contribute your deep knowledge of ML frameworks and Kubernetes to enhance AI capabilities.
This role within the internal infrastructure team centers around developing the tools necessary to train and serve Cohere's advanced models. You'll collaborate with AI researchers to ensure that the infrastructure meets cutting-edge demands while emphasizing observability and performance. Your expertise will help deploy and manage superclusters that empower innovative AI applications across the cloud.
Key Responsibilities:
• Deploy and manage distributed GPU/TPU clusters efficiently
• Collaborate with cloud providers to enhance infrastructure performance
• Diagnose and resolve complex system issues swiftly
• Create user-friendly tools for AI researchers
• Promote infrastructure best practices throughout the organization
Requirements:
• Proven experience in ML/HPC environment...
This role within the internal infrastructure team centers around developing the tools necessary to train and serve Cohere's advanced models. You'll collaborate with AI researchers to ensure that the infrastructure meets cutting-edge demands while emphasizing observability and performance. Your expertise will help deploy and manage superclusters that empower innovative AI applications across the cloud.
Key Responsibilities:
• Deploy and manage distributed GPU/TPU clusters efficiently
• Collaborate with cloud providers to enhance infrastructure performance
• Diagnose and resolve complex system issues swiftly
• Create user-friendly tools for AI researchers
• Promote infrastructure best practices throughout the organization
Requirements:
• Proven experience in ML/HPC environment...
Interested in this opportunity? Apply now through Expertini.
Apply for this Position