Senior Staff Engineer - Site Reliability

2024-09-13
Colombia
Nagarro
Company Description
We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale, across all devices and digital mediums, and our people exist everywhere in the world (19,000+ experts across 33 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!

Job Description
Experienced L3 SRE engineer supporting a business-critical SaaS application.
Capacity to provide L3 support across the full stack, including infrastructure, backend, and front end, before escalating to the engineering business unit.
Capacity to automate SRE tools to provide proactive L3 support, aligned with our technical monitoring strategy.
Capacity to work under business pressure on business-critical applications.
Capacity to communicate appropriately with L1, L2, Engineering, Product Managers, leadership, and end users during troubleshooting.

Qualifications
Must-have skills: Kubernetes (Expert), GitHub Actions, Terraform (Expert), and AWS.
Capacity to communicate appropriately.
Experience with incident and problem management.
Experience with multitenant applications.
Solid understanding of networking concepts (TCP/IP, DNS, routing, TLS/SSL) and of constructs such as VPCs, subnets, firewalls, and load balancers.
Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
Python and React/Next.js.
Monitoring and logging to analyze and track resource utilization and application performance and to identify potential issues, using Grafana, Prometheus, Loki, or ELK.
Experience with AWS, particularly EKS, serverless, queues, and various databases.
Solid knowledge of Kubernetes.

Additional Information