Site Reliability Engineer 2

2024-09-20
USA
Conversica
As a Site Reliability Engineer on the Conversica team, you will use software and systems engineering to implement resilient production systems. We use common DevOps tools such as Terraform, GitLab, GitLab Pipelines, and Kubernetes (EKS), with AWS as our hosting platform. We’re looking for a technically curious Site Reliability Engineer who thrives on driving innovation and best practices within a fast-paced, highly dynamic environment. You should be a strong multi-tasker and a highly collaborative teammate. The ideal candidate has a strong passion for technology, continuous improvement, stability, and transforming platforms. This position is remote.

Responsibilities
- Engage in and improve the life cycle of services, from inception and design through deployment, operation, and refinement over time
- Engage in and help to continuously improve our team processes, including our DevSecOps team working agreement
- Support services before launch through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and deployment review
- Participate in an on-call rotation with a sense of urgency and contribute to the continuous improvement of on-call (for example, reducing after-hours alerts)
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health against SLAs
- Practice sustainable incident response and participate in our blameless postmortem process
- Develop and maintain effective instrumentation of monitoring tools and dashboards
- Help improve and maintain high service uptime using AWS while developing and evangelizing company-wide standards for services
- Build and scale highly available, distributed services with a high quality of service for customers
- Assist in troubleshooting failures and performance issues across all services, while suggesting and applying preventive measures
- Maintain infrastructure owned by the DevSecOps team, such as EKS clusters and RDS Aurora DB clusters
- Support Conversica Development teams to allow them to focus on roadmap initiatives

Qualifications
- BS degree in Computer Science / Engineering or a related technical field involving coding, or equivalent practical experience
- 3+ years of managing distributed SaaS systems in public and private cloud environments on AWS
- 3+ years of experience in a Kubernetes / Docker environment
- 3+ years of experience with Unix / Linux system administration
- 3+ years of practical experience building continuous delivery pipelines
- Familiarity with at least one of the following: Python, Go, PHP, Ruby, Java, C, C++
- Familiarity with algorithms, data structures, complexity analysis, and software design
- Interest in designing, analyzing, and troubleshooting large-scale distributed systems
- A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
- Highly analytical and detail-oriented, with the ability to work with complex logic to debug and optimize code and automate routine tasks while working under pressure to meet tight deadlines
- Experience with MySQL, AWS Aurora, or other relational databases
- Experience supporting applications according to privacy by design and security by design principles
- Experience with configuration management tools like Terraform, Ansible, or Puppet is required
- Experience with full-stack development is preferred
- Bring a growth mindset, customer orientation, and a bias for automation
- Team communication and collaboration are both critical traits for this role

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $110,000/year to $145,000/year. Pay is based on a number of factors, including market location and job-related knowledge, skills, and experience.

Conversica offers comprehensive health, dental, and vision benefits, flex time PTO, 401k plus company match, and equity. Further details can be provided upon request.