Engineering Manager, Infrastructure & DevOps

2024-10-31
Canada
Coconut Software
We are looking for an experienced Engineering Manager for our cloud Infrastructure, SRE (Site Reliability Engineering), and DevOps team. We are keen on leaders with a learning mentality. You will need excellent communication and collaboration skills, as you will work with senior members of teams across the entire company. Reporting to the VP, Engineering, you will work closely with the CTO, Principal and Staff Developers, and your team to create the strategic plan for infrastructure and DevOps. You will plan and prioritize work so that we have a stable, efficient, observable, and resilient technology environment that allows the company to meet its technology and business goals. We're looking for a manager who has experience leading and managing a team of developers, is proactive, and can dive into projects while understanding the importance of open and clear communication within and between all teams.
This position is ideal if you enjoy being a player/coach and want to focus on optimizing efficiency and collaborating with the engineering team to ensure the development engine runs smoothly. As the Manager of this team, you will bring technical knowledge to defining and implementing CloudOps, DevOps, and SRE best practices. Your team will be responsible for managing and optimizing Coconut's DevOps and infrastructure (based on AWS, Kubernetes, and Terraform) and for building tools to automate the management of a stable, efficient, observable, and resilient technology environment.
Note: This position will participate in our on-call team rotation roster.
YOU ARE FIRED UP TO
Demonstrate Team Leadership

Lead by example - act in accordance with our CHEERS values
Mentor, coach and inspire a team of DevOps, Infrastructure and SRE professionals
Foster a collaborative and high-performance work environment
Hire and train team members
Be accountable for the Infrastructure and DevOps roadmap and the results the team attains
Work with the Principal DevOps Engineer to set strategic plans and priorities for this function

Oversee Infrastructure and Site Reliability

Work with your team to design, implement, and manage a cloud-based infrastructure ecosystem for scalability and reliability
Ensure best practices for infrastructure as code (IaC) and configuration management are applied
Work closely with the application development teams to ensure a manageable migration into a secure and reliable product environment, and to implement new tools


Automation and CI/CD:

Develop and maintain automated deployment pipelines
Promote continuous integration and delivery (CI/CD) practices
Design and develop automation and processes to enable teams to deploy, manage, configure, scale and monitor applications


Monitoring and Alerting:

Ensure robust monitoring to proactively identify and resolve issues
Configure and manage alerting systems for real-time status and incident response


Reliability Engineering:

Define and measure service level objectives (SLOs) and service level indicators (SLIs) to ensure system reliability
Lead incident response and post-incident reviews to improve system resilience
Develop innovative and technical tooling to improve production stability and enable faster recovery


Security and Compliance:

Collaborate with the security team to implement best practices for securing infrastructure and applications
Ensure compliance with industry standards and regulations


Resource Optimization:

Optimize resource utilization to reduce costs while maintaining performance and reliability
Monitor & report on hosting & tooling costs


Documentation and Training:

Maintain comprehensive documentation of systems, processes, and procedures
Provide training and knowledge sharing within the team



WHAT YOU BRING TO THE TEAM

Proven experience in managing/leading a DevOps/SRE/Infrastructure team in a fast-paced environment
Expertise in cloud platforms and infrastructure management, preferably AWS and Kubernetes
Experience with provisioning, vendor management, and monitoring resources in a cloud-based environment
Experience configuring and managing data sources like MySQL, Postgres, Redis
System configuration experience with automation tools such as Puppet/Chef/Ansible
Proficiency in automation and CI/CD tools such as Spinnaker, CircleCI, Travis CI, or GitLab CI/CD
Experience with containerization and orchestration techniques and tools (e.g., Docker, Kubernetes)
Experience with infrastructure as code tools, such as Terraform
Experience leading & analyzing complex application, database, network, and OS issues for customer-facing systems in a high-uptime environment
Experience with monitoring and alerting tools (e.g. DataDog, Sentry, OpsGenie)
Experience with Perl/Python/Java/Bash scripting
Experience working with large enterprise customer bases
Experience reporting on key metrics, costs, and tooling to the organization and recommending improvements
Excellent problem-solving, collaboration, and communication skills
Effective at nurturing relationships and managing multiple stakeholders across different teams
Strong project management, leadership and cost management abilities

Our Investment in You:

“Cabana Days” - our version of a flexible work week! To enable our employees to do their best work, we offer the flexibility to prioritize what is important and to take the time needed for rest and rejuvenation, when possible based on business and operational needs.
Ability to do your job in a supported, but still flexible environment
Supported professional development, learning & career opportunities - be supported in your growth journey!
Regular 1:1 coaching with your leader and regular connection to a passionate executive team
Work in a team big enough for growth but lean enough to make a real impact

A full range of benefits to keep you happy & healthy:

Competitive Salaries - we pay fairly based on experience and expertise, not your ability to negotiate!
Health & Dental Benefits, Virtual Care, & Disability top up - all starting from day 1!
Virtual mental health and EAP platform
Wealthsimple GRSP & Matching
Annual Wellness Benefit ($1000 per year)
Opportunity to work remotely - anywhere in Canada!
Employee Options - everyone shares in our success!
Internet Subsidy on each paycheck
Tiki Bucks Incentive Program - everyone is entitled to earn bonuses!
A People First Company - 4.6 rating on Glassdoor
Recently named #4 on the Top 10 Best Workplaces in Canada

Who we are, and what we do:
Mission: Match customers with the right expert, at the right time, so no opportunity is lost.
Values: Collaboration. Honesty. Empathy. Elevate. Resilience. Service Excellence.
Coconut Software makes it effortless for customers to connect with their bank or credit union. Our appointment scheduling, queue management, and video banking solutions are used by leading financial institutions across North America, including RBC, Arvest Bank, Vancity, and Rogue Credit Union. Organizations that use Coconut benefit from a seamless customer experience that improves NPS, reduces wait times, and increases conversion rates.
To date we have raised close to 40M and have been doubling revenue year after year. The team at Coconut has ambitious growth plans to continue to scale the business to new heights by owning the North American market and delivering innovative solutions to our customers.
Coconut has a company culture that is best in class. We foster a community that is unconditionally inclusive, and in return ask that our people contribute their differing perspectives, ideas and experiences for one common purpose: to advance the way people live and work in an environment of diversity, equity and inclusion and workplace belonging. Some recent awards we're proud of include:

Coconut Software is committed to treating all people in a way that allows them to maintain their dignity and independence. We believe in integration and equal opportunity. We are committed to meeting the needs of people with disabilities in a timely manner, and will do so by preventing and removing barriers to accessibility and meeting accessibility requirements under the Accessibility for Ontarians with Disabilities Act, 2005.