Job Description:
We are hiring for a dynamic new initiative for CVS Health, a pioneering business transforming healthcare in the United States by making customer experiences more seamless, convenient, and personalized.
CVS Health is focused on driving business agility and growth through technology, data, digital, and experiential innovations. We aim to provide a new state-of-the-art flexible work environment in our Galway facility at Bonham Quay to support these objectives.
Careers with CVS Health offer flexible work arrangements, and individuals who live and work in the Republic of Ireland will have the opportunity to divide their time between our Galway office and their home office.
Responsibilities:
* Design and scale data pipelines for logs, metrics, and traces
* Develop custom software to drive the Observability Platform using technologies such as Java Spring Boot, Node.js, and Go
* Help engineers, who are the primary customers of the Observability Platform, troubleshoot issues with their Observability Platform instrumentation
* Implement OTel client libraries for technologies such as Java Spring Boot, Node.js, and Go
* Participate in the team's 24/7/365 on-call rotation to ensure the health and stability of the Observability Platform
* Manage CI/CD pipelines that deploy and maintain Observability Platform infrastructure on Kubernetes
* Deliver an exceptional customer experience by engaging with platform customers as they reach out with support questions
* Create comprehensive documentation for observability tools and technologies
* Watch the watchers by building and managing instrumentation and alerting for the Observability Platform itself to deliver a highly available platform
* Work closely with the SRE team to understand application team challenges in the observability space and identify opportunities to improve the Observability Platform to meet these challenges
Requirements:
* 7+ years of experience in software engineering and/or site reliability engineering roles
* 7+ years of hands-on development experience with modern microservices using technologies such as Java Spring Boot, Go, and Node.js
* Strong exposure to cloud platforms such as GCP, AWS or Azure
* Strong familiarity with observability patterns and best practices including concepts like SLAs, SLOs, and SLIs
* Experience creating custom dashboards and views to understand system health and availability
* Extensive experience with modern infrastructure tooling like Docker, Kubernetes, Argo CD, Envoy/Istio
* Comfort using the Grafana Labs OSS stack: Loki, Grafana, Tempo, Mimir, etc.
* Excellent technical communication skills
Preferred Qualifications:
* Understanding of the OpenTelemetry ecosystem, OTLP, and OTel Semantic Conventions
* Experience designing and scaling distributed systems
* Background in building and operating high-traffic backend services
* Familiarity with popular data-oriented open source technologies like Kafka and Postgres
Education:
Bachelor's degree or equivalent experience.