Multiple positions: US, UK, India
This is a highly visible global role based in HCL Cloud Native Labs, where you will be responsible for guiding the Site Reliability Engineering (SRE) journey for the world’s largest enterprises.
We seek a technical heavyweight and thought-leader, comfortable in any of the following domains:
- Site Reliability Engineering (SRE)
- Platform Reliability Engineering (PRE)
- Network Reliability Engineering (NRE)
You will act as a consultant, advisory engineer and technical authority, guiding the SRE journey and implementing SRE within our Labs and across client organisations.
- Showcase world-class SRE techniques and implementations
- Help clients drive organisational change enabled via SRE and associated practices.
- Act as Trusted Advisor, guiding senior technical leaders as they implement SRE and similar approaches.
- Get deeply involved in hands-on engineering tasks, automating complex Cloud Native solutions.
- Coach modern SRE techniques. Help experienced teams adopt Cloud Native skills.
- Act as a thought leader: advance the art of the possible, author papers, be passionate about the domain and participate in industry dialogue.
What we are looking for:
- An experienced SRE working in ‘Cloud Native’ environments.
- Deep understanding of SRE practices and hands-on skills, with full-stack experience.
- Developer mindset, software engineering approach to Operations with an acute focus on automation.
- Proficiency in applying SRE measures such as SLIs, SLOs, SLAs and Error Budgets.
- Demonstrable development experience using modern languages (e.g. Python / Java / Go), with the ability to code automation using professional software engineering approaches.
- Hands-on experience in private, hybrid and multi-cloud environments.
- IaC experience (writing and debugging) with tools like Terraform or Ansible, and deep comfort with Git and release management tools.
- Experience with containers and container orchestration platforms such as Docker, Kubernetes or OpenShift.
- Dashboarding / alerting experience with monitoring solutions like Splunk / ELK / Prometheus / Grafana or similar.
- Strong familiarity with public cloud providers such as GCP, AWS or Azure.
- Consultative mindset, experience acting in an advisory role.
- Superior written and verbal communication skills.
- Ability and willingness to mentor junior staff and share knowledge.
- Advanced problem-solving experience.
Additional preferred experience:
- 5+ years of SRE/DevOps Consulting
- Deployment and operational support of highly available, large-scale & reliable Cloud Native applications.
- Experience with multiple programming languages, such as Java, C#, Python, Node.js and Go.
- Architectural, Engineering or DevOps Certification with one or more Public Cloud providers (AWS, Azure, GCP).
- Coaching / advisory experience, helping developers and operations professionals master Reliability Engineering or DevOps techniques, particularly in relation to Cloud Native software applications.
- Extracurricular engineering passion (e.g. active open source contributor).
- Design and implementation of GitOps or similar approaches to Infrastructure-as-Code (IaC).
- Experience designing and building Cloud Native applications using Microservices, Container-based and Serverless technologies.