Netflix Job – Site Reliability Engineer


Job Description:

The Critical Operations and Reliability Engineering team’s goal is to drive customer joy by thoughtfully managing risk and minimizing impact across Netflix. We do this by engaging cross-functionally with other engineering teams, managing issues when they happen, and promoting reliability and resilience practices throughout the organization.

Job Responsibilities:

  • Participate in incident escalation and the on-call rotation
  • Engage with product teams to diagnose and correct operational surprises
  • Develop alignment, cultivate relationships, and drive impact
  • Drive incidents to resolution by collaborating with multiple engineering teams
  • Increase our reliability through an automation-focused approach to solving problems
  • Embrace collaboration, continuous improvement, and iteration as the path forward
  • Develop deeper insights into the quality of experience for our customers
  • Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks
  • Bring curiosity about how complex socio-technical systems operate successfully at scale when failure is inevitable
  • Analyze complex systems from a reliability and resilience perspective
  • Stay comfortable with being uncomfortable in ambiguous situations
  • Grow your expertise, and inform and educate others
  • Form and maintain relationships with internal and external partners
  • Identify sources of instability in distributed systems and drive operational excellence

Job Requirements:

  • Development experience with Python, Go, Java, or JavaScript/Node.js
  • Experience with incident management and response
  • Knowledge of cloud platforms such as AWS and of microservices architecture

Job Details:

Company: Netflix

Vacancy Type: Full Time

Job Location: Los Gatos, CA, US

Application Deadline: N/A
