Posted by Solas IT Recruitment (Senior IT Recruitment Consultant @ Solas IT Recruitment | ERF CertRP)

My client is an innovative and rapidly growing SaaS company that delivers cutting-edge solutions and is seeking a talented and driven Site Reliability Engineer (SRE) to join our growing engineering team. This is an exciting opportunity to be part of a high-impact team focused on ensuring the availability, scalability, and performance of our platform in a fast-paced and dynamic environment.

Key Responsibilities:
System Reliability: Ensure the reliability, availability, and performance of our SaaS platform by developing and maintaining automated monitoring, alerting, and incident response systems.
Automation & Tooling: Automate manual processes and optimize operational workflows to reduce overhead and improve efficiency. Build tools to manage infrastructure at scale.
Capacity Planning & Scaling: Plan and execute scaling strategies, ensuring that infrastructure can handle growth and demand spikes without impacting user experience.
Incident Management: Lead the response to incidents, perform root cause analysis (RCA), and put in place preventive measures to reduce recurring issues.
Collaboration: Work closely with Development, QA, and Operations teams to build processes and solutions that optimize the balance between development velocity and system reliability.
Continuous Improvement: Help drive the adoption of best practices across engineering teams, improve our deployment pipelines, and ensure systems are secure, highly available, and well-documented.
Performance Optimization: Monitor and optimize system performance, identify bottlenecks, and implement effective solutions.

Requirements:
3+ years of experience in a Site Reliability Engineering, DevOps, or similar role in a SaaS environment or with large-scale distributed systems.
Proficiency in cloud platforms such as AWS, Azure, or GCP.
Strong experience with containerization technologies (Docker, Kubernetes).
Proficiency in scripting and automation (e.g., Python, Bash, Go).
Familiarity with CI/CD pipelines and related tools (e.g., Jenkins, GitLab, CircleCI).
Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, ELK stack).
Knowledge of infrastructure-as-code tools (e.g., Terraform, CloudFormation).
Familiarity with incident response, postmortems, and continuous improvement processes.
Strong troubleshooting and problem-solving skills in distributed systems.
Excellent communication skills with the ability to work cross-functionally with product, engineering, and operations teams.
A degree in Computer Science, Engineering, or a related field is preferred, though relevant experience is valued.

Nice to Have:
Experience with service mesh technologies (e.g., Istio).
Background in microservices architecture and its challenges.
Familiarity with security best practices in cloud-based systems.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: Staffing and Recruiting