Job Description
A dynamic and challenging opportunity has arisen for a Lead Site Reliability Engineer to join our Enterprise Infrastructure team in Galway. This is a permanent role with excellent benefits and career progression opportunities.
Key Responsibilities
* Define and execute a comprehensive reliability and observability strategy that keeps systems highly available to customers.
* Troubleshoot engineering issues across hardware, software, network, applications, and cloud service providers.
* Coach and mentor peer SREs and development teams on building highly available systems.
* Lead cross-team incident bridge calls during major production incidents, taking hands-on responsibility for resolution.
* Conduct thorough post-mortem reviews, focusing on technical root cause analysis, observability, and automation enhancements.
Requirements
* Bachelor's degree (or higher) in a technology-related field, such as Engineering or Computer Science.
* Extensive experience deploying and supporting highly distributed multi-tiered systems at scale.
* Practical experience with Public Cloud platforms, preferably AWS or Azure.
* Proficiency with EKS, AKS, or Rancher Kubernetes Service for container orchestration.
* Experience with distributed architectures, including microservices, containerized services, and serverless architectures.
* Strong hands-on Kubernetes skills.
* Programming experience in compiled/OOP languages, such as C# or Java, and scripting languages, like JavaScript or Python.
* Proven ability to maintain scalability and resiliency in complex environments.
* Familiarity with modern monitoring tools, such as Datadog or Prometheus.
* Technical and operational leadership with the ability to handle production incidents effectively.
About the Role
This is an exciting opportunity to be part of a vibrant team that values collaboration and continuous improvement. You will work in an environment where your contributions directly impact the reliability of critical systems. If you're passionate about driving reliability and resilience in high-scale environments, we want to hear from you.