AI/ML DevOps Engineer

Our Infrastructure AI and Data Engineering Team is responsible for providing the foundational firm-wide AI Enablement platform. We are transitioning this platform onto Kubernetes (K8s), and we are seeking an experienced DevOps Engineer to lead this effort. The ideal candidate will help drive our cloud-native infrastructure initiatives and lead the implementation of DevOps best practices across our organization. This is a unique opportunity not only to join one of the leading hedge funds in the world, but also to provide leadership on the core AI Enablement platform, which is used by every aspect of the business on a daily basis.

Key Responsibilities:
- Design and implement high-availability solutions for critical AI infrastructure
- Partner with AI/ML teams to optimize platform performance and scalability
- Drive architectural decisions for the next generation of the AI platform
- Lead the development and maintenance of CI/CD pipelines using tools like Jenkins or GitHub Actions
- Architect and implement Infrastructure as Code (IaC) solutions using Terraform or similar tools
- Optimize container orchestration platforms (Kubernetes) and microservices architecture
- Improve and maintain monitoring, alerting, and incident response systems (Datadog, OpsGenie)
- Lead incident response and participate in the on-call rotation
- Mentor junior team members and contribute to technical documentation
- Collaborate with the development team to improve deployment processes and system reliability

Required Qualifications:
- 5+ years of experience in DevOps, Site Reliability Engineering, or similar roles
- Strong experience with cloud platforms (AWS/GCP/Azure)
- Expert knowledge of containerization (Docker) and orchestration (Kubernetes and Helm)
- Proficiency in Infrastructure as Code and configuration management tools
- Experience with high-performance, low-latency systems
- Track record of successfully delivering large-scale infrastructure projects
- Experience with CI/CD tools and methodologies
- Deep understanding of networking, security, and system architecture
- Excellent troubleshooting and analytical skills
- Strong communication skills to collaborate with various stakeholders

Preferred Qualifications:
- Experience in a financial services or hedge fund environment
- Experience with Python (FastAPI)
- Knowledge of machine learning operations (MLOps)
- Experience with data processing frameworks and big data technologies
- Experience with multi-cloud and/or on-prem Kubernetes
- Experience running CUDA-enabled accelerated workloads