Opportunity Description
What you’ll do
Leadership & Strategy
- Define and drive the vision for resilience engineering at Affirm, with a focus on production load testing and chaos engineering as first‑class engineering practices.
- Lead and mentor a team of engineers building platforms and tooling for safe production experimentation.
- Partner with infrastructure, product, and security leadership to embed resilience validation into the software development lifecycle.
- Establish best practices for safely testing system limits and failure scenarios in production.
Systems & Operations
- Own the design and evolution of platforms that enable safe, controlled production load testing and fault injection.
- Ensure strong safeguards are in place, including isolation boundaries, approval workflows, and automated rollback mechanisms, to protect real users.
- Build systems that provide end‑to‑end observability, traceability, an...