Site Reliability Engineer

2024-08-27
India
Forefront Technologies International Inc.
Key Responsibilities:
- Maintain and enhance the reliability, availability, and performance of large-scale distributed systems.
- Automate deployment, monitoring, and management of production systems.
- Implement and manage CI/CD pipelines for software delivery.
- Collaborate with software engineers to design, build, and manage scalable and resilient infrastructure.
- Troubleshoot complex system issues, identify root causes and implement long-term solutions.
- Monitor system performance and optimize configurations for better performance and cost efficiency.
- Implement security best practices and ensure compliance with industry standards.

Requirements

Required Skills:
- Proficiency in cloud platforms (AWS, Google Cloud, or Azure) and containerization technologies like Docker and Kubernetes.
- Strong scripting and automation skills using Python, Bash, or similar languages.
- Experience with infrastructure as code (IaC) tools such as Terraform or Ansible.
- Deep understanding of monitoring and logging tools (Prometheus, Grafana, ELK Stack).
- Knowledge of database management (SQL/NoSQL) and networking fundamentals.
- Experience with CI/CD tools like Jenkins, GitLab CI, or CircleCI.
- Strong problem-solving skills and experience in troubleshooting large-scale systems.

Education:
- A degree in Computer Science, Engineering, or a related field from a recognized institution.
- Ideally, 5-10 years of experience in a similar role at a product company.