A fantastic opportunity has come up for a Senior Site Reliability Engineer at an exciting, rapidly growing fintech company.
This is a great chance to work alongside industry specialists in a fun, fast-paced environment.
Responsibilities
- Build automation tools for system health
- Ensure systems are well documented, monitored and highly fault-tolerant
- Define and execute SRE best practices
- Take responsibility for critical services, ensuring they are fast, highly available, scalable, and able to withstand unprecedented increases in load
- Expand and lead a 24x7 global team
- Manage the availability, latency, scalability and efficiency of Trading and Exchange systems
- Keep production environments stable and take proactive action to prevent issues and outages
- Maintain and expand monitoring and observability capabilities to ensure service quality and availability targets are met
Requirements
- Strong knowledge of all aspects of designing, developing and managing large, real-time 24x7 financial systems
- Proven experience as a systems performance engineer or site/systems reliability engineer
- Expert in Linux
- Scripting experience (Bash/Python)
- Hands-on experience with Terraform and Ansible
- Experience working on AWS