Senior Site Reliability Engineer (SRE) – Technical Leader, Kubernetes Platform (IRAP)
Cisco · Bangalore · 10+ yrs experience · Posted 2026-06-18
Tech stack: GitHub Actions, Go, Jenkins, Kubernetes, Linux, Python, Terraform
About the role
The Meraki SRE Platform Engineering team builds and operates the infrastructure that powers Meraki’s cloud. We focus on delivering reliable, scalable, and simple platforms that enable product teams to move quickly while maintaining a strong and secure operating environment.
At Cisco Meraki, we build technology that simply works: reliable, secure, and easy to use. We’re looking for a Site Reliability Engineer (SRE) - Technical Leader to help us design, operate, and scale a Kubernetes-based platform supporting various environments.
This role sits at the intersection of software engineering and infrastructure. You’ll work closely with engineers across the stack to ensure our platform is resilient, observable, compliant, and developer-friendly, without slowing teams down.
Responsibilities:
- Design, build, and operate production-grade Kubernetes platforms in both regulated and non-regulated environments
- Improve system reliability through automation, thoughtful design, and continuous iteration
- Define and drive SLOs, SLIs, and error budgets to guide reliability decisions
- Build and evolve CI/CD pipelines that are secure, scalable, and easy to use
- Implement robust observability (metrics, logs, traces) to make systems understandable and actionable
- Reduce operational toil by automating repetitive processes and improving workflows
- Partner with security and compliance teams to meet compliance requirements without sacrificing developer velocity
- Support audit processes, including documentation, controls implementation, and audit readiness
- Participate in on-call rotations, handling customer requests and responding to paging alerts
- Participate in incident response, blameless postmortems, and continuous improvement efforts
- Help shape a platform that engineers enjoy using
Qualifications:
- 10+ years of experience in SRE, DevOps, or infrastructure engineering
- Strong experience running Kubernetes in production (EKS, AKS, GKE, or upstream)
- Solid understanding of cloud infrastructure, Linux systems, and networking fundamentals
- Experience with Infrastructure as Code (Terraform preferred)
- Familiarity with CI/CD systems (GitHub Actions, GitLab CI, Jenkins, ArgoCD)
- Proficiency in scripting or programming (Python, Go)
- Experience building or operating observability platforms (Prometheus, Grafana, OpenTelemetry, ELK)
- Working knowledge of compliance frameworks (e.g., PCI, ISO)
- Demonstrated ability to influence technical direction and drive cross-functional initiatives across engineering, security, and operations teams
- Experience mentoring engineers and providing technical leadership in SRE, platform engineering, or infrastructure teams