Principal Tech Ops Engineer

As a principal member of the Tech Ops SRE team, you'll work closely with our engineering partners to help enable and drive initiatives from design to implementation.
Our highly available multi-region Kubernetes (AWS EKS) environments are best-in-class and central to our enterprise-grade infrastructure strategy.
These growing environments currently support numerous mission-critical workloads.
In this exciting role, you'll have the opportunity to further develop and refine your skills, collaborate across numerous teams, and continue to grow in a fun, collaborative, and rapidly changing environment.
This is a phenomenal opportunity to have a direct impact on the emerging strategies of our infrastructure and deployments while also helping enable the expansion of our business.
The Skills and Expertise You Bring

- 5+ years of hands-on experience with AWS in a production environment
- Experience building and deploying Docker images
- Experience migrating applications from other container orchestration solutions (such as ECS) to Kubernetes, preferably on EKS
- Production experience running Kubernetes workloads, ideally on AWS EKS
- Experience managing and maintaining Kubernetes clusters on AWS EKS
- Experience with Confluent or Kafka
- Experience creating and deploying Helm charts and libraries
- Hands-on experience with Jenkins Core, including authoring and maintaining declarative CI/CD pipelines and libraries
- Experience with monitoring tools such as CloudWatch, Datadog, and Splunk Cloud
- Proficiency with UNIX operating systems and shell scripting
- Experience with Amazon Web Services (AWS), having managed services and applications in a large AWS cross-account environment using IAM and federated SSO
- Experience crafting and maintaining logging, monitoring, and alerting capabilities using tools like Datadog and Splunk
- Ability to communicate at all levels, with a track record of strong written and verbal communication
- See problems as opportunities to automate
- Ability to work independently with minimal direction
- Drive and champion the overall design of highly available, secure, scalable microservices-based applications in AWS
- Track record of providing technical leadership to strong teams of Site Reliability Engineers / Cloud Engineers
- Experience with configuring and deploying resilient infrastructure in multiple regions and multiple availability zones

Skills: Unix, AWS, AWS EKS, Docker, Jenkins Core