Senior Site Reliability Engineer

Datavail

Remote, Colombia | Full-time | June 29, 2026

Opportunity Description

Responsibilities

  • Define and maintain SLIs and SLOs, and monitor alignment with them and error‑budget usage.
  • Lead incident response and post‑mortems, and implement corrective measures.
  • Automate operations tasks via tooling such as auto‑remediation and scaling rules.
  • Build, improve, and maintain CI/CD pipelines, canary deployments, and blue/green strategies.
  • Lead technical discussions with customers to align on reliability, scalability, and performance requirements.
  • Drive continuous platform improvements across the service lifecycle, including architecture, monitoring, and operational processes.
  • Implement and extend observability systems: metrics, tracing, and log aggregation.
  • Optimize performance and cost by tuning cloud services, autoscaling, and resource rightsizing.
  • Design, deploy, and operate containerized workloads using Docker and Kubernetes in production environments.
  • Collaborate wi...
Full-time | Networks and Systems

Interested in this opportunity? Apply now through Expertini.
